Domesticating Kubernetes

Kubernetes as a home server on bare metal in 150 minutes

Vladimir Akopyan
Quickbird

--

This is a guide to running K8S on a home network and using it as a home server — run your blog, media library, smart home, pet projects, etc.
The cluster itself is straightforward to set up, but we developers are so coddled that we forget basic networking and other low-level details — I found the experience educational.

The cluster will serve real workloads — we will deal with exposing it to the internet, IP assignment on a home network, reasonable security, distributed storage and monitoring.
It is aimed at a home network, and does not rely on load balancers, SANs, multiple public IPs or any other fancy infrastructure. I am keeping it as simple (read: reliable) as possible — there are no ‘enterprise’ bells and whistles.

To proceed, make sure you are comfortable with basic Kubernetes concepts: know what a master node, an agent, a LoadBalancer service, a deployment, an ingress and a persistent volume are.

Anatomy of a Kubernetes cluster

Let’s consider a K8S cluster as a layered cake and take a look at each layer.

At the top are the Applications that you are writing and/or running — this is the part that actually delivers value and where developers spend most of their time. This might be your WordPress blog, some API you’ve written, or your bitcoin trading bot.

Next level down are the Services for administering and running the applications — your own MySQL database, ELK stack, monitoring, etc. They don’t have to run in your cluster — Amazon/Azure/GCP offer PaaS versions alongside their managed K8S services. DevOps engineers and administrators spend a lot of their time here.

At the System level we’ve got the components that make up a functional cluster — you can’t skip any of these:

  • software components of K8S (kubelet, API-server, etc.)
  • storage provider for K8S persistent volumes
  • authentication provider for kubernetes users
  • load balancing and external IPs

Smaller managed K8S providers like OVHcloud and DigitalOcean typically operate at this level. You have to configure these yourself if you are bootstrapping your own cluster. System administrators and IT services spend the majority of their time here.

Infrastructure layer is self-explanatory — that’s the metal, CPU, RAM, Disk, and physical network.

Compute and Storage

You might be tempted to get a bunch of Raspberry Pis, but there are better alternatives. Before we dive into them, consider the following:

  • CPU and RAM get pooled together in a cluster, so you can get a solid 20GB of RAM and 6 cores out of a couple of old laptops or other outdated kit lying around.
    I would not buy old rack servers — they are cheap and powerful, but they sound like a stalling turbofan and their power consumption is crazy.
  • Storage works the other way — we will install a distributed storage system on our cluster, and such systems (typically) keep 3 copies of the data for redundancy. SBCs like the Pi are very gimped in this regard: their storage performance is 10x lower than anything with a proper SSD, and reliability is lower too.
    Mini-PCs like the Intel NUC usually come in two versions — the ‘fatter’ one has an extra SATA drive slot, which you can use for an HDD.
  • You want a system with ‘always-on’ functionality, so that the computer starts itself after power loss — at least for the master node. It’s in the BIOS settings of most desktops and SBCs, but most laptops don’t have it.

Here is my K8S cluster — it fits on a single shelf in the closet:

All the kit is plugged into a gigabit Ethernet switch. Left to right, these are:

  • Beelink Gemini X45 with a J4105, 8GB RAM, 128GB SSD and 320GB HDD — this is the master node.
  • Intel NUC with a 5th-gen i3, 8GB RAM, 128GB SSD and 320GB HDD
  • Linx 1010B — a joke of a Windows tablet, Intel Atom, 2GB RAM.
  • An old Samsung laptop with a 3rd-gen i5, 6GB RAM and 256GB SSD
  • An obligatory Pi 4, 2GB (not pictured)

Looking at the benchmarks, Raspberry Pis hardly make any sense:

  • A Pi 4 with 4GB RAM, SD card, case, etc. is about £100. For the same money you can get a no-name Intel Atom mini-PC, which comes with the benefit of x86 arch, a real BIOS and real SATA or M.2 ports.
  • My Beelink set me back about ~£150 and it’s a noticeable upgrade.
  • For £200 you can buy used mini-desktops, like a ThinkCentre M700 with an i5-6400T. That’s a major performance improvement, but the device is larger.
  • At £300 and up you can build a brand-new, compact HTPC system, for example based on the ASRock DeskMini A300 and full-power desktop components. Or you can always go with an Intel NUC if space is at a premium.

Network and IP Ranges

First and foremost, if you want to host any web services you need to make sure you aren’t behind carrier-grade NAT. My provider uses it by default, but I got a static IP for an extra £5 a month.

Next, let’s assume you have a DNS registrar and got yourself the domain timmy.com. Unlike in a typical cloud deployment, we have only one IP address to play with, so set up records to direct traffic from timmy.com and *.timmy.com (any subdomain) to your public IP address, so it arrives at your router.
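The records would look roughly like this (a sketch; 203.0.113.10 is a placeholder for your actual public IP):

timmy.com.     A    203.0.113.10
*.timmy.com.   A    203.0.113.10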

Behind your router, your LAN IPs will be split into three ranges:

  • A range for static IPs assigned to important devices in your home network. It typically starts with your router; I used 192.168.0.1–255. All computers / nodes in the cluster should be given a static IP.
  • A range for DHCP assignments. This is for the various devices that connect to your network ‘just to use the internet’, like your mobile phone. I configured the DHCP server in the router to use 192.168.1.0–255.
  • Kubernetes services will have floating IP addresses of their own, and the actual service might be located on any of the nodes in our cluster, depending on load and the whims of the Kubernetes scheduler. We will be using MetalLB in ARP mode to achieve this, with the range 192.168.2.0–255.

I changed the subnet mask on my router to 255.255.240.0 (a /20), so that all three ranges sit in one subnet. The actual ranges you use do not matter — you could keep the router’s default subnet and use the ‘higher’ IPs of .220–.250 for static assignment and load balancing. If you pick a different subnet, an IP calculator can help.
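For reference, the resulting address plan (assuming the ranges above):

# 255.255.240.0 == /20, spanning 192.168.0.0 – 192.168.15.255
# router + static devices:  192.168.0.1 – 192.168.0.255
# DHCP clients:             192.168.1.0 – 192.168.1.255
# MetalLB service pool:     192.168.2.0 – 192.168.2.254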

Once the traffic arrives at your router, we have to use port forwarding to direct it to the right place. Traffic for the Kubernetes API server, typically on TCP:6443, must be directed to the master node — this will let you connect to your cluster with kubectl from the internet.

In this setup we are only considering a single master node — if you had several of them for HA, you’d have to configure keepalived or HAProxy, or both.

Notice that only services of type LoadBalancer will be given an IP address on your LAN. All other resources will reside on an overlay network set up with Flannel — they can reach each other but are isolated from the outside world.

Traffic on TCP:80 and 443 must be directed to the ingress service using its IP — from there it will be routed to the correct application depending on the domain name, and we can host a virtually unlimited number of websites that way.

Only HTTP(S) traffic can be routed based on domain name, so if we want to expose a MySQL database, we must port-forward that particular service. If we have two such databases, we have to give them different ports.
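Put together, the router’s port-forwarding rules end up looking something like this (a sketch; 192.168.2.100 is the ingress IP we pin later, and the MySQL line is a hypothetical example):

# TCP 6443 → <master node's static IP>       kube-api server
# TCP 80   → 192.168.2.100                   ingress service (http)
# TCP 443  → 192.168.2.100                   ingress service (https)
# TCP 3306 → <MySQL service's MetalLB IP>    only if you expose a database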

To proceed, you need to have set up your domain / DNS records, decided on your IP ranges, and configured your router / DHCP accordingly.

OS setup

I have chosen Ubuntu Server 20.04 LTS, simply because of familiarity and its ubiquity — there is even a version for the Raspberry Pi. Very little in this setup depends on the particular OS. Install it on each node, and consider the following:

  • Stick to simple alphanumerics in the hostname of each computer, or Kubernetes won’t start and you will have to specify a K8S-acceptable name for the node separately.
  • If you plan to use the same drive for the OS and for Persistent Volume data, enable LVM — this will let you split the primary disk into an OS and a storage partition, and resize them as needed.
  • Use ssh-import-id-gh <username> if you need to import an SSH key from GitHub after installing the OS without internet access.
  • Configure a static IP address in /etc/netplan/50-cloud-init.yaml — see the sketch after this list. I prefer to keep DHCP enabled in addition to the static IP.
  • Kubernetes is not meant to work with swap, at least for the time being.
    Disable it with sudo swapoff -a, and comment out the swap entry in /etc/fstab so it stays off after a reboot.
  • If you are using a laptop, disable suspend on lid close. I don’t know why a server OS does that!
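Here is a minimal netplan sketch for a node (assumptions: the interface is named eth0, this node’s static IP is 192.168.1.2, and the router sits at 192.168.0.1 — adjust to your own network):

network:
  version: 2
  ethernets:
    eth0:
      dhcp4: true
      addresses:
        - 192.168.1.2/20
      gateway4: 192.168.0.1
      nameservers:
        addresses: [192.168.0.1]

Apply it with sudo netplan apply and verify with ip addr.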

To proceed, make sure all your nodes are set up and you can SSH into each of them.

Kubernetes Setup

Kubernetes is like Linux — there are different takes on it, and for a homelab the two simplified distributions, MicroK8S and K3S, make the most sense.
Pick the most reliable/fastest/whatever machine as your master node, and begin the K8S installation there.

MicroK8s vs K3s

My experience with MicroK8s has been substantially better — it is mostly vanilla K8S packaged into a Snap; if you want to understand what it’s doing, you can read the standard configuration files for the kubelet, kube-apiserver, etcd, etc.

K3S is much stranger — all the components of K8S have been packed into a single binary and run as a single service/daemon. In my mind there are only three reasons to use K3S:

  1. You are not satisfied with etcd and want to use the unique K3S datastore options: SQLite, PostgreSQL or MySQL.
  2. You really need to minimise the resource overhead of K8S.
  3. You want to install Rancher server in the cluster to take advantage of its great UI and auth features. You can only install Rancher on K3S or RKE clusters. As the Rancher docs put it:
“The Rancher management server can only be run on a Kubernetes cluster in an infrastructure provider where Kubernetes is installed using K3s or RKE. Use of Rancher on hosted Kubernetes providers, such as EKS, is not supported.”

K3S configuration — option 1

K3S comes with several components that we will want to replace:

curl -sfL https://get.k3s.io | sh -s - --write-kubeconfig-mode 644 --tls-san k3s.timmy.com --no-deploy servicelb --no-deploy local-storage
  • --tls-san — this parameter is necessary if you want to access your cluster from the internet through kubectl. It is used to generate the certificate for the kube-api server; the certificate will be valid for the LAN IP address of the machine and for the domain you specify. This information is encoded in certificate-authority-data in the kubectl config file.
  • --no-deploy servicelb — ServiceLB can’t create IP addresses independent of a machine; we will replace it with MetalLB.
  • --no-deploy local-storage — LocalStorage provisions storage through hostPath. We will replace it with proper distributed storage.

Once the command completes, your master node should be up and running. Retrieve your kubeconfig from /etc/rancher/k3s/k3s.yaml and merge it into / replace the kubeconfig on your personal machine. Replace server: https://127.0.0.1:6443 with the domain name you specified above — for example k3s.timmy.com. Validate that kubectl works from your dev machine and you can get pods, etc.
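One way to merge the file (a sketch; assumes you copied k3s.yaml to your machine as ~/k3s.yaml — back up ~/.kube/config first):

KUBECONFIG=~/.kube/config:~/k3s.yaml kubectl config view --flatten > /tmp/config && mv /tmp/config ~/.kube/config
kubectl config get-contexts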

To add other machines as agents in the cluster, retrieve the token from /var/lib/rancher/k3s/server/node-token on the master node.
You can then get them to join the cluster by running:

curl -sfL https://get.k3s.io | K3S_URL=https://192.168.1.2:6443 K3S_TOKEN=mynodetoken sh -

Avoid using a domain name when connecting agents to the master node — it will work, but any issue with DNS will make your cluster fall apart. Validate that you have a collection of functional nodes with kubectl get nodes.

MicroK8S configuration — option 2

MicroK8S comes with a rich CLI tool that allows you to inspect and configure a cluster:

sudo snap install microk8s --classic --channel=1.18/stable
microk8s status --wait-ready
microk8s kubectl get nodes

To enable access to the kube-api server through its public IP and DNS name, edit /var/snap/microk8s/current/certs/csr.conf.template to include them. It will look something like this:

...
[ alt_names ]
DNS.1 = kubernetes
DNS.2 = kubernetes.default
...
DNS.6 = timmy.com
DNS.7 = microk8s.timmy.com
IP.1 = 127.0.0.1
IP.2 = 10.152.183.1
IP.3 = 192.168.1.2
...

The `apiserver-kicker` will automatically detect the difference, generate new certificates and restart the apiserver. Unlike K3S, we can have as many domain names as we please.

Retrieve the kubeconfig using the microk8s config command and merge it into / replace the kubeconfig on your personal/dev machine. Replace the server IP address with its proper DNS name — or keep two entries in your kubeconfig, one for local access and one for remote.

Installing MetalLB

On MicroK8S you install MetalLB by enabling the corresponding addon. SSH into the master node and execute microk8s enable metallb — it will ask you for the IP range you’d like to use.

On K3S you must install MetalLB through kubectl:

kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.9.3/manifests/namespace.yaml
kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.9.3/manifests/metallb.yaml
# On first install only
kubectl create secret generic -n metallb-system memberlist --from-literal=secretkey="$(openssl rand -base64 128)"

Then you must create a ConfigMap in the metallb-system namespace to specify the IP range it can use:

kind: ConfigMap
apiVersion: v1
metadata:
  name: config
  namespace: metallb-system
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - 192.168.2.0-192.168.2.254

Verify that MetalLB works by deploying a blank nginx application with a service of type LoadBalancer:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello
  labels:
    app: hello
spec:
  replicas: 3
  selector:
    matchLabels:
      app: hello
  template:
    metadata:
      labels:
        app: hello
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80
---
kind: Service
apiVersion: v1
metadata:
  name: hello
  labels:
    app: hello
spec:
  ports:
  - name: http
    protocol: TCP
    port: 80
    targetPort: 80
  selector:
    app: hello
  type: LoadBalancer
  externalTrafficPolicy: Cluster

It should be assigned a valid LAN IP and be reachable from your dev/personal computer.
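A quick check (the EXTERNAL-IP will be whatever MetalLB hands out from your pool; the output below is illustrative):

kubectl get service hello
# NAME    TYPE           CLUSTER-IP     EXTERNAL-IP    PORT(S)        AGE
# hello   LoadBalancer   10.43.132.18   192.168.2.0    80:30080/TCP   1m
curl http://192.168.2.0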

Ingress Configuration

Ingress software is not part of the Kubernetes project itself; instead, Ingress Controllers are third-party software that is installed in the cluster and configured by Kubernetes — like anything else, they run in a pod/container and need a service to be reachable from the outside world. Each has its perks, but they fulfil the same need. For all of them you should:

  1. Switch the ingress service type from ClusterIP to LoadBalancer so that it gets assigned an IP address on our LAN.
  2. Configure the router to port-forward TCP connections on port 80 (http) and 443 (https) to this address.
  3. Explicitly specify a loadBalancerIP so that you don’t have to change router settings if you need to re-install the ingress for some reason.
  4. Pick an address near the end of the available range, so that some other service does not occupy it and get in the way — MetalLB assigns them sequentially.
    Snippet from the ingress yaml definition file:
kind: Service
apiVersion: v1
spec:
  type: LoadBalancer
  loadBalancerIP: 192.168.2.100

Nginx on MicroK8S

Nginx is considered the standard ingress. It ships with MicroK8S as an addon — enable it with microk8s enable ingress if it isn’t already running. Edit the existing ingress service in accordance with the above, and you are done.

Traefik on K3S

There are a couple of advantages to using Traefik — it comes with a pretty dashboard, and unlike nginx it can update its configuration without reloading. Additionally, it’s smart enough to realise that any service with port 443 or a port named https requires an https connection (shock!).
The downsides: there is less documentation, and it’s less powerful when acting as an authentication proxy — it does not support OAuth out of the box, and needs an extra component if you want to authenticate with GitHub, etc.

Traefik comes pre-installed on K3S, but we need to modify its configuration. Do not modify the existing Kubernetes resources — K3S has an annoying add-on-like system, where it monitors manifests in /var/lib/rancher/k3s/server/manifests/ for changes and deploys them into your cluster. Any changes you make directly to the Kubernetes resources will be overwritten.
Instead, edit the traefik.yaml file in the manifests folder. It is basically a helm chart values file. Set the following values, in addition to the defaults:

ssl.enabled: "true"
ssl.insecureSkipVerify: "true"
metrics.prometheus.enabled: "true"
metrics.serviceMonitor.enabled: "true"
dashboard.enabled: "true"
dashboard.serviceType: "LoadBalancer"
dashboard.auth.basic.admin: "$apr1$tM.asdgfs$kljkuwd"
loadBalancerIP: 192.168.2.100
logLevel: "debug"
  • Don’t use ssl.enforced — it breaks cert-manager’s Let’s Encrypt validation.
  • ssl.insecureSkipVerify allows Traefik to work with HTTPS backends that have untrusted / self-signed certificates. This can be useful for the Kubernetes dashboard and in-cluster databases. I consider this acceptable in home deployments.
  • The Prometheus options are useful for monitoring, but we will not cover their use in this tutorial.
  • dashboard.enabled enables the Traefik dashboard, which to me is half the value of having it as an ingress controller. Use the htpasswd utility to generate the password hash for dashboard.auth.basic.admin, where admin is the username — see below.
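To generate the hash (htpasswd ships with the apache2-utils package; the username and password here are placeholders):

sudo apt install apache2-utils
htpasswd -nb admin 'correct-horse-battery'
# prints admin:$apr1$..., copy everything after 'admin:' into dashboard.auth.basic.admin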

Save the resulting file as traefik-customised.yaml and delete the original — otherwise K3S will revert all changes and deploy Traefik the way it was.

Finally, edit the K3S configuration in /etc/systemd/system/k3s.service and add --no-deploy traefik.

Verify that your ingress works correctly by creating an ingress for the hello application we deployed earlier, making it available at hello.<yourdomain>:

kind: Ingress
apiVersion: extensions/v1beta1
metadata:
  name: hello
spec:
  rules:
  - host: hello.<replaceme>.me
    http:
      paths:
      - backend:
          serviceName: hello
          servicePort: 80

Configuration Tips:

If you wish to expose some HTTP service on your LAN — your router’s dashboard, a NAS or some other device — you can create an Endpoints resource and a corresponding selector-less Service, then use an Ingress to direct HTTP traffic as usual. Please use TLS and the authentication options in the ingress, and be careful exposing your router or anything else sensitive.
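A minimal sketch, assuming the device you want to expose listens on 192.168.0.1:80 — note the Service has no selector, and the Endpoints object must have the same name as the Service (router-ui here is a made-up example):

kind: Service
apiVersion: v1
metadata:
  name: router-ui
spec:
  ports:
  - port: 80
---
kind: Endpoints
apiVersion: v1
metadata:
  name: router-ui
subsets:
- addresses:
  - ip: 192.168.0.1
  ports:
  - port: 80

An Ingress rule can then point at router-ui port 80 like any in-cluster service.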

Cert-Manager

Cert-manager issues and maintains up-to-date Let’s Encrypt certificates for any ingress in your cluster. It is not strictly necessary, and you might have your own way of dealing with certificates.
It is super straightforward to install:

# Kubernetes 1.15+
kubectl apply --validate=false -f https://github.com/jetstack/cert-manager/releases/download/v0.14.3/cert-manager.crds.yaml
kubectl create namespace cert-manager
# Helm 3
helm repo add jetstack https://charts.jetstack.io
helm repo update
helm install cert-manager jetstack/cert-manager --namespace cert-manager --version v0.14.3

In addition to installing the chart, we need to configure a Let’s Encrypt ClusterIssuer — just apply the following yaml:

apiVersion: cert-manager.io/v1alpha2
kind: ClusterIssuer
metadata:
  name: letsencrypt
spec:
  acme:
    # The ACME server URL
    server: https://acme-v02.api.letsencrypt.org/directory
    # Email address used for ACME registration
    email: replace@me.
    # Name of a secret used to store the ACME account private key
    privateKeySecretRef:
      name: letsencrypt
    # Enable the HTTP-01 challenge provider
    solvers:
    - http01:
        ingress:
          class: traefik / nginx

Don’t forget to replace the ingress class with the appropriate one for your cluster! Validate your setup by updating your ingress with TLS settings and an annotation that tells cert-manager to create a certificate:

kind: Ingress
apiVersion: extensions/v1beta1
metadata:
  name: hello
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt
    ingress.kubernetes.io/ssl-redirect: 'true'
spec:
  tls:
  - hosts:
    - hello.<replaceme>.me
    secretName: hello.<replaceme>.me
  rules:
  - host: hello.<replaceme>.me
    http:
      paths:
      - backend:
          serviceName: hello
          servicePort: 80

You should see a pod appear with acme in its name — it’s responsible for responding to the Let’s Encrypt ACME challenge. A secret will also be created, containing tls.crt and tls.key records; the key record will only be populated once the challenge completes — validate that it works. You can monitor the progress of a certificate being issued with kubectl describe certs, and debug issues by checking the logs of the cert-manager pod.
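For example (names as created by the chart installed above):

kubectl describe certs --all-namespaces
kubectl logs -n cert-manager deploy/cert-manager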

Storage:

Some applications aren’t stateless — databases, image galleries, WordPress, you name it. There are two ways of dealing with storage in Kubernetes: the plebian way and the proper way.

The plebian option is to directly expose a disk or directory from the server to the container — that’s HostPath and Local Persistent Volumes. HostPath is a total hack: the Kubernetes scheduler could move the pod to a different machine at any time, and the data will not travel with it. The scheduler does respect Local PVs and won’t move the pod — a reasonable option if you are deploying a distributed database or a similar system that is designed to handle redundancy, replication and clustering itself.

Distributed storage systems are designed to solve this problem: they pool together the storage space of all servers and will provision a persistent volume for any pod that requests one. Data is replicated to protect against disk failures, and it moves with the pod to a new node.

Notable Open-source Projects:

  • Ceph — block, object and network-attached storage. This battle-tested project significantly predates Kubernetes and can be used standalone, without K8S, to build storage systems — it underlies block storage at DigitalOcean. It can be deployed on top of Kubernetes with Rook.io. Requires an entire disk or partition, which it uses raw — i.e. without a file system. Setup is not trivial.
  • EdgeFS — block, object and network-attached storage. Cloud-native storage tailored for Kubernetes, supposedly production-ready. Beautiful dashboard, but the setup is very involved. Can use both folders and raw devices, but really the folder approach is impractical.
  • OpenEBS — block storage only, requires a whole device / partition. Setup is reasonably straightforward, and this project has probably the largest amount of community support. It is very feature-rich, comes with Prometheus metrics, etc.
  • Rancher Longhorn — block storage only, and by far the easiest distributed storage to deploy — you will be done in 5 minutes. This is what I ended up using. Installation steps:
git clone https://github.com/longhorn/longhorn && cd longhorn
kubectl create namespace longhorn-system
# Helm 3
helm install longhorn ./chart/ --namespace longhorn-system

That’s it! Longhorn comes with a great dashboard — edit its service type to LoadBalancer and open it in a browser, and you will be presented with a summary of your cluster:

Summary of your cluster, available storage, nodes and volumes
You can define storage filepaths in UI (left) and review currently provisioned volumes (right)

Configuration

  • In the Nodes tab, edit every node and add all its disks. They have to be formatted and mounted — you add each one as a filepath.
  • If you have different classes of disks, like SSD and HDD, use storage tags, and then create two storage classes with different diskSelectors:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
    storageclass.beta.kubernetes.io/is-default-class: "false"
  name: longhorn-hdd
parameters:
  numberOfReplicas: "2"
  staleReplicaTimeout: "30"
  diskSelector: "hdd"
provisioner: driver.longhorn.io
reclaimPolicy: Delete
volumeBindingMode: Immediate
  • Longhorn only provides block storage, which can be attached to a single pod at a time. If you need NFS-style shared storage, you will have to stand up a separate service in a container on top of it. The same goes for object storage.
  • The UI has no authentication mechanism and allows anyone to delete all of your data. Once you’ve configured Longhorn, I would advise reverting the service type back to ClusterIP and configuring the ingress as an authenticating proxy, at least with basic authentication.
  • It’s worthwhile setting up backups of your storage — Longhorn can be pointed at an S3-compatible or NFS-compatible store.
  • To validate that it’s working, deploy the WordPress helm chart — it will provision two PVs, one for itself and one for MariaDB. Or create a bare PVC, as sketched below.
  • The Longhorn project is still in beta — consult the roadmap if you’d like more info.
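A minimal PVC sketch against the longhorn-hdd class defined above (test-claim is a made-up name) — the claim should reach Bound status, and a matching volume should appear in the Longhorn UI:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-claim
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: longhorn-hdd
  resources:
    requests:
      storage: 1Gi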

Now your cluster has all the essentials — you are basically your own cloud provider. You can spend more time improving your cluster and deploying prometheus, grafana, and other services, or you could jump straight in and host your blog, or whatever else you have on your mind.
