Onboarding Windows nodes to Kubernetes cluster

Below are step-by-step instructions for onboarding Windows nodes to a Kubernetes cluster. For the cluster master I used Ubuntu 18.04 (the Kubernetes control plane is still a UNIX-only setup and will probably stay that way). For the Windows worker nodes I used Windows Server 1909 images, but any version of Windows Server 2019 and up can be used instead. I run my cluster in Azure but did not use Azure CNI, so the steps can be replicated with on-prem clusters as well.

Install a single control-plane cluster

  1. Create an Ubuntu VM in Azure and download the Kubernetes binaries required to install the control plane. I will use the kubeadm tool both for setting up the cluster and for onboarding Windows nodes to it (this is the master1 server).
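A minimal sketch of that package installation on Ubuntu 18.04, following the standard upstream kubeadm instructions of the time (the repository URL and package names below are the upstream defaults, not anything specific to this setup):

# Add the upstream Kubernetes apt repository and install the kubeadm tooling
sudo apt-get update && sudo apt-get install -y apt-transport-https curl
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
echo "deb https://apt.kubernetes.io/ kubernetes-xenial main" | sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo apt-get update && sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl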
  2. Install Docker on the master1 server.
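The link that used to be here is gone; one simple option, assuming the Ubuntu-packaged Docker is acceptable, is:

# Install Docker from the Ubuntu repositories and start it on boot
sudo apt-get update && sudo apt-get install -y docker.io
sudo systemctl enable --now docker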
  3. The Flannel pod network plugin will be used, so an additional parameter must be passed to kubeadm (--pod-network-cidr=10.244.0.0/16). On master1, also run sysctl net.bridge.bridge-nf-call-iptables=1.
  4. Initialize the single control-plane cluster by running kubeadm init --pod-network-cidr=10.244.0.0/16 on the master1 node (this command and the sysctl from step 3 are shown together below).
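A short sketch of steps 3 and 4 as run on master1; persisting the sysctl via /etc/sysctl.d is my own addition so the setting survives a reboot, not something from the original steps:

# Let iptables see bridged traffic (needed by kube-proxy/flannel)
sudo sysctl net.bridge.bridge-nf-call-iptables=1
echo 'net.bridge.bridge-nf-call-iptables = 1' | sudo tee /etc/sysctl.d/90-k8s.conf
# Initialize the control plane with the Flannel pod CIDR
sudo kubeadm init --pod-network-cidr=10.244.0.0/16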
  5. Copy the last line of the installation output; it is what you will use to join nodes to the cluster. In my case it is: kubeadm join 10.0.0.4:6443 --token k54f1t.5rr385g1upol2njr --discovery-token-ca-cert-hash sha256:a4994328cc8b51386101983a4f860cbd08de95c56e7714b252b6ea7d13cf6d9d
  6. Execute the following to copy the config file that kubectl needs to access your cluster:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
  7. Install the Flannel pod network plugin: kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/2140ac876ef134e0ed5af15c65e414cf26827915/Documentation/kube-flannel.yml
  8. Verify that your cluster is healthy by executing kubectl get nodes. Your master node should show as Ready.
  9. Follow the instructions here to configure Flannel so that Windows nodes can join (the gist of the change is sketched below).
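For reference, the gist of that change, per the upstream "Adding Windows nodes" guide at the time, is to edit the Flannel ConfigMap installed by the manifest above so that the vxlan backend uses VNI 4096 and port 4789, which is what the Windows Flannel/HNS stack expects. Treat this as a sketch and follow the linked instructions for your exact version:

# On master1: edit the flannel ConfigMap (kube-flannel-cfg in kube-system at that manifest revision)
kubectl edit configmap -n kube-system kube-flannel-cfg
# ...and make the net-conf.json section look like this:
# {
#   "Network": "10.244.0.0/16",
#   "Backend": {
#     "Type": "vxlan",
#     "VNI": 4096,
#     "Port": 4789
#   }
# }
# Recreate the flannel pods so they pick up the new config
kubectl delete pod -n kube-system -l app=flannel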

Add Windows nodes

  1. Nodes need to be able to talk to each other by name, so make sure DNS works. If you are in Azure you can set up a private DNS zone, associate it with your Virtual Network and enable auto-registration (see the Azure CLI sketch below).
PS C:\Users\cloudadmin> resolve-dnsname master1.kubernetes.my

Name                                           Type   TTL   Section    IPAddress
----                                           ----   ---   -------    ---------
master1.kubernetes.my                          A      10    Answer     10.0.0.4
(Screenshot: Azure private DNS registration)
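A rough Azure CLI equivalent of that setup; the resource group, VNet and link names here are hypothetical placeholders, so substitute your own:

# Create the private zone and link it to the VNet with auto-registration enabled
az network private-dns zone create -g k8s-rg -n kubernetes.my
az network private-dns link vnet create -g k8s-rg -z kubernetes.my -n k8s-dns-link -v k8s-vnet --registration-enabled true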
  2. Install Windows Server 2019 or later. I use the Windows Server 1909 with Containers image from the Azure marketplace; it automatically registers its name with the private zone.
    Set the default DNS suffix to your private zone name (kubernetes.my in my case):
    Set-DnsClientGlobalSetting -SuffixSearchList "kubernetes.my"
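To confirm the suffix was applied and that short names now resolve, you can run:

# Verify the search suffix and short-name resolution
Get-DnsClientGlobalSetting
Resolve-DnsName master1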
  3. Download the Windows Kubernetes tools and expand them to a local folder:
Invoke-WebRequest https://github.com/kubernetes-sigs/sig-windows-tools/archive/master.zip -OutFile master.zip
Expand-Archive .\master.zip -DestinationPath .
  4. Modify the file called Kubeclustervxlan.json under sig-windows-tools-master\kubeadm\v1.15.0. The values in the ControlPlane object must be changed to point to your master1 server and to use the token that was copied earlier; change Username to the user you log in with on the master1 node. Also make sure your default Ethernet adapter is in fact called Ethernet (check with Get-NetAdapter); if it is not, change the "InterfaceName":"Ethernet" line to whatever the adapter is actually called. Modify the Source object to point to the same Kubernetes version the master1 node is running. Finally, change the Pause image under the Cri item to the multi-arch image shown below, since the default pause image does not support the 1909 base OS. My complete file is below; adjust it with your own entries (a sketch for regenerating an expired token and checking the adapter name follows the file).
 {
    "Cri" : {
        "Name" : "dockerd",
        "Images" : {
            "Pause" : "mcr.microsoft.com/oss/kubernetes/pause:1.3.0",
            "Nanoserver" : "mcr.microsoft.com/windows/nanoserver:1809",
            "ServerCore" : "mcr.microsoft.com/windows/servercore:ltsc2019"
        }
    },
    "Cni" : {
        "Name" : "flannel",
        "Source" : [{ 
            "Name" : "flanneld",
            "Url" : "https://github.com/coreos/flannel/releases/download/v0.11.0/flanneld.exe"
            }
        ],
        "Plugin" : {
            "Name": "vxlan"
        },
        "InterfaceName" : "Ethernet 2"
    },
    "Kubernetes" : {
        "Source" : {
            "Release" : "1.17.4",
            "Url" : "https://dl.k8s.io/v1.17.4/kubernetes-node-windows-amd64.tar.gz"
        },
        "ControlPlane" : {
            "IpAddress" : "master1",
            "Username" : "gregory",
            "KubeadmToken" : "c5pi79.39te6ro1fnufx5jt",
            "KubeadmCAHash" : "sha256:a4994328cc8b51386101983a4f860cbd08de95c56e7714b252b6ea7d13cf6d9d"
        },
        "KubeProxy" : {
            "Gates" : "WinOverlay=true"
        },
        "Network" : {
            "ServiceCidr" : "10.96.0.0/12",
            "ClusterCidr" : "10.244.0.0/16"
        }
    },
    "Install" : {
        "Destination" : "C:\\ProgramData\\Kubernetes"
    }
}
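Two checks that are easy to forget, sketched here as an aside: kubeadm bootstrap tokens expire after 24 hours by default, so if the one from kubeadm init has gone stale you can mint a new one on master1, and the adapter name to put in "InterfaceName" can be listed on the Windows node.

# On master1 (Linux): create a fresh token and print the full join command
kubeadm token create --print-join-command
# On the Windows node (PowerShell): list adapters to find the right InterfaceName
Get-NetAdapter | Select-Object Name, InterfaceDescription, Status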
  5. Execute the PowerShell script in the kubeadm folder and pass it the location of the modified configuration file: .\KubeCluster.ps1 -ConfigFile .\v1.15.0\Kubeclustervxlan.json -install
  6. Open the generated public key of the SSH certificate (id_rsa.pub under the .ssh folder) and copy its contents. Add the contents to the .ssh/authorized_keys file on the master1 node.
  7. Reboot the computer after a successful install.
  8. Once the computer comes back, execute the script again, this time with the -join parameter, to join the node to the cluster: .\KubeCluster.ps1 -ConfigFile .\v1.15.0\Kubeclustervxlan.json -join
  9. If everything went through with no errors, you should see the node joined to the Kubernetes cluster and in the Ready state:
root@master1:~# k get nodes -o wide
NAME         STATUS   ROLES    AGE     VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                    KERNEL-VERSION     CONTAINER-RUNTIME
master1      Ready    master   142m    v1.17.4   10.0.0.4      <none>        Ubuntu 18.04.4 LTS          5.0.0-1032-azure   docker://19.3.6
winworker1   Ready    <none>   2m20s   v1.17.4   10.0.0.5      <none>        Windows Server Datacenter   10.0.18363.720     docker://19.3.5


10. You can now schedule Windows containers and verify that they work. The example below creates a deployment with 2 pods, each of which outputs random numbers to STDOUT:

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: win-webserver
  name: win-webserver
spec:
  replicas: 2
  selector:
    matchLabels:
      app: win-webserver
  template:
    metadata:
      labels:
        app: win-webserver
      name: win-webserver
    spec:
      containers:
      - command:
        - powershell.exe
        - -command
        - while ($true) { "[{0}] [{2}] {1}" -f (Get-Date),(Get-Random),$env:COMPUTERNAME;
          Start-Sleep 5}
        image: mcr.microsoft.com/windows/servercore:1909
        imagePullPolicy: IfNotPresent
        name: windowswebserver
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      nodeSelector:
        beta.kubernetes.io/os: windows
      restartPolicy: Always
status: {}
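Save the manifest to a file (I'll assume win-webserver.yaml as the name here) and apply it, then check the pods and their logs:

kubectl apply -f win-webserver.yaml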
PS C:\Users\cloudadmin> kubectl get pods
NAME                            READY   STATUS    RESTARTS   AGE
win-webserver-fffd4486f-cmdgx   1/1     Running   0          34m
win-webserver-fffd4486f-rp96t   1/1     Running   0          34m
PS C:\Users\cloudadmin> kubectl logs win-webserver-fffd4486f-cmdgx
[3/25/2020 12:48:07 AM] [WIN-WEBSERVER-F] 1105704259
[3/25/2020 12:48:12 AM] [WIN-WEBSERVER-F] 356015894
[3/25/2020 12:48:17 AM] [WIN-WEBSERVER-F] 1136900039
[3/25/2020 12:48:22 AM] [WIN-WEBSERVER-F] 111352898
[3/25/2020 12:48:27 AM] [WIN-WEBSERVER-F] 593146587
[3/25/2020 12:48:32 AM] [WIN-WEBSERVER-F] 1438304716
[3/25/2020 12:48:37 AM] [WIN-WEBSERVER-F] 1357778278

6 thoughts on "Onboarding Windows nodes to Kubernetes cluster"

  1. Hi, thank you for the detailed step-by-step. We have tried mightily to make this work, without luck. One question – you mention that you’re running server 1909 as the host, yet flanneld 0.12.0 is built for 1809, and pause 1.3.0 is built for 1903, and you also reference nanoserver 1809 and servercore ltsc2019 (1809). Everything I’ve read says that hyperv isolation isn’t supported, but hyperv is the only way I can see to run these differently-based images. Am I missing something? Are you using hyperv isolation, or is there a way to work with these images I’m not understanding?

  2. Thank you for your quick response! I’m still confused though (am I making some underlying assumption that you’re not?). On my servercore 1903 host, if I run “docker run -it –isolation=process mcr.microsoft.com/windows/nanoserver:1809 cmd.exe”, it refuses to run, saying “The container operating system does not match the host operating system”. Also this article: https://docs.microsoft.com/en-us/virtualization/windowscontainers/deploy-containers/version-compatibility?tabs=windows-server-1909%2Cwindows-10-1909 (as I read it, and as I have experienced) says that a major version match is required for running in process isolation mode. Sorry if I seem slow, but do you know what I might be missing? Thanks again for your help.
