
Hardware setup
What you need for this kind of setup:
- At least one Raspberry Pi 2B/3B/3B+. You can run some apps even on a single board, but two or more boards are recommended for spreading the load and for increased redundancy.
- Power supplies and SD cards for the Pis, an Ethernet switch (or free ports in your existing one), and some cables.
In our setup, we currently have four Raspberry Pi 3 Model B+ boards, so in the cluster, there is one master/server and three agent nodes. The Raspberry Pi boards of course need some kind of housing and this is where things got a little out of hand. A fellow Boogieman who is very able with CAD and 3D printers designed and printed a neat case for the boards, which would deserve a story on its own. The casing has two fans for cooling in the back and each board sits on a tray that can be hot-swapped in and out for maintenance. The trays also have places at the front for an activity/heartbeat LED and a shutdown/power switch that both connect back to the board’s GPIO header.
Software stack
For the Kubernetes implementation, we chose k3s from Rancher Labs. For such a young project, it is remarkably stable and usable, likely because it simply bundles the official Kubernetes components into a smaller, easy-to-install package. What makes k3s different from other small Kubernetes distributions is that it is intended for production use, whereas projects like microk8s or Minikube are more suitable for development purposes. It is also very lightweight and runs nicely on ARM-based hardware: the essentials of a Kubernetes system have been combined into a single 40 MB binary that integrates all the required components and processes.
K3s will run on pretty much any Linux distribution, and we decided to go with Raspbian Stretch Lite as the base OS because we don't need any additional services or desktop user interfaces on the boards. K3s does require cgroups to be enabled in the Linux kernel, which can be done on Raspbian by adding the following parameters to /boot/cmdline.txt:

```
cgroup_enable=cpuset cgroup_memory=1 cgroup_enable=memory
```
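After editing the file and rebooting, you can check that the memory cgroup controller is actually enabled. This is a quick sanity check of my own (not from the k3s docs), assuming a Linux host:

```shell
# /proc/cgroups lists each controller with an "enabled" flag in the 4th column;
# after the cmdline.txt change and a reboot, the memory line should show 1
awk '$1 == "memory" {print "memory enabled:", $4}' /proc/cgroups
```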
Installing k3s
The authors of k3s have done a nice job of smoothing the installation process. Once you have your server hardware ready, it is super easy to set up in just a couple of minutes: it takes only one command to install the server (master node):

```shell
curl -sfL https://get.k3s.io | sh -
```

and the same goes for agent nodes:

```shell
curl -sfL https://get.k3s.io | K3S_TOKEN=<token_from_server> K3S_URL=https://<server_ip>:6443 sh -
```

where <token_from_server> is the contents of the file /var/lib/rancher/k3s/server/node-token on the server and <server_ip> is the IP address of the server node. At this point, our cluster was already up and running, and we could start deploying workloads:
```
root@k3s-server:~# kubectl get nodes
NAME         STATUS   ROLES    AGE    VERSION
k3s-node1    Ready    <none>   40s    v1.13.4-k3s.1
k3s-server   Ready    <none>   108s   v1.13.4-k3s.1
```
To manage the cluster remotely with kubectl, copy the contents of /etc/rancher/k3s/k3s.yaml from the server into the local kubeconfig file (usually ${HOME}/.kube/config).
Exposing the services with a load balancer
By default, the applications deployed to a Kubernetes cluster are only reachable from within the cluster (the default service type is ClusterIP). To make them reachable from outside the cluster, there are two options: you can either configure the service with the type NodePort, which exposes the service on each node's IP at a static port, or you can use a load balancer (service type LoadBalancer). NodePort services are, however, quite limited: they use their own dedicated port range, and apps can only be differentiated by their port number. K3s does provide a simple built-in service load balancer, but since it uses the nodes' IP addresses, we might quickly run out of IP/port combinations, and binding the services to a certain virtual IP is not possible. For these reasons, we decided to deploy MetalLB, a load-balancer implementation intended for bare-metal clusters.
MetalLB can be installed simply by applying its YAML manifest. The simplest way to run MetalLB in an existing network is to use the so-called layer 2 mode, which means that the cluster nodes announce the virtual IPs of the services on the local network with the ARP protocol. For that purpose, we reserved a small pool of IP addresses from our internal network for the cluster services. The config for MetalLB thus looked like this:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: company-office
      protocol: layer2
      addresses:
      - 10.10.10.50-10.10.10.99
```

With this config, the cluster services are exposed at addresses in the range 10.10.10.50-10.10.10.99. To bind a service to a specific IP, you can use the loadBalancerIP parameter in your service manifest:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-web-app
spec:
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: 8080
  loadBalancerIP: 10.10.10.51
  selector:
    app: my-web-app
  type: LoadBalancer
```

Load balancing is where we saw most of our challenges. For example, Kubernetes has a limitation that prevents having both TCP and UDP ports in a single load balancer service. To work around that, you can define two service instances, one for the TCP ports and another for the UDP ports. The downside is that the two services then run at different IP addresses, unless you enable IP address sharing. And as MetalLB is a young project, there was a small wrinkle with this as well, but we are confident that all these issues will be ironed out soon.
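As a sketch of the TCP+UDP workaround mentioned above: MetalLB can colocate two services on one IP through its metallb.universe.tf/allow-shared-ip annotation, provided both services carry the same annotation value (and, in layer 2 mode, identical selectors). The service names, ports, and the address 10.10.10.52 below are hypothetical:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app-tcp
  annotations:
    metallb.universe.tf/allow-shared-ip: "my-app"   # same key on both services
spec:
  type: LoadBalancer
  loadBalancerIP: 10.10.10.52
  ports:
  - name: http
    port: 80
    protocol: TCP
  selector:
    app: my-app
---
apiVersion: v1
kind: Service
metadata:
  name: my-app-udp
  annotations:
    metallb.universe.tf/allow-shared-ip: "my-app"
spec:
  type: LoadBalancer
  loadBalancerIP: 10.10.10.52                        # same virtual IP as above
  ports:
  - name: dns
    port: 53
    protocol: UDP
  selector:
    app: my-app
```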
Adding storage
K3s doesn't have a built-in storage solution yet, so in order to give the pods access to persistent file storage, we need to create one using one of the volume plugins supported by Kubernetes. Since one of the goals of Kubernetes is to decouple applications from the infrastructure and make them portable, Kubernetes defines an abstraction layer for storage with the concepts of PersistentVolume (PV) and PersistentVolumeClaim (PVC). PVs are storage resources that are typically configured and made available for the apps by the administrator. PVCs, on the other hand, describe the application's need for a certain kind and amount of storage. When a PVC is created (typically as part of the application deployment), it is bound to a PV if one is available that is not yet in use and satisfies the PVC's requirements. Configuring and maintaining all this would mean manual work, which is why there is also a way to provision volumes dynamically.
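For illustration, a claim for "a certain kind and amount of storage" can be as small as this (the name my-app-data is a made-up example):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-app-data
spec:
  accessModes:
    - ReadWriteOnce   # a single node may mount the volume read-write
  resources:
    requests:
      storage: 1Gi    # the amount of storage the app needs
```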
In our infrastructure, we already had an existing NFS server available, so we decided to use that for the cluster's persistent file storage. The easiest way to accomplish this in our case was the NFS-Client Provisioner, which supports dynamic provisioning of PVs. The provisioner simply creates a new directory on the existing NFS share for each new PV (which the cluster maps to a PVC), and the PV directory is then mounted in the container where it is used. This way there is no need to configure the NFS shares as volumes in individual pods; it all works dynamically.
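With the provisioner in place, an app only needs to reference the provisioner's storage class in its claim, and Kubernetes creates the PV (and the backing NFS directory) on demand. A sketch of how this fits together; the class name "nfs-client" depends on how the provisioner was deployed, and the pod below is a hypothetical example:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-data
spec:
  storageClassName: nfs-client   # must match the provisioner's deployed class name
  accessModes:
    - ReadWriteMany              # NFS allows mounting from several nodes at once
  resources:
    requests:
      storage: 5Gi
---
# A pod mounting the dynamically provisioned volume
apiVersion: v1
kind: Pod
metadata:
  name: data-user
spec:
  containers:
  - name: app
    image: arm32v7/alpine:latest
    command: ["sleep", "infinity"]
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: shared-data
```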
Cross-building container images for ARM
Obviously, when running app containers on ARM-based hardware like the Raspberry Pi, the containers need to be built for the ARM architecture. There are a few gotchas that you might face when building your own apps into ARM containers. First of all, the base image needs to be available for your target architecture. In the case of Raspberry Pi 3, you typically want to use an "arm32v7" base image, as they are called in most Docker registries. So, when cross-building your app, make sure your Dockerfile contains e.g.

```dockerfile
FROM arm32v7/alpine:latest
```
The second thing to note is that your host Docker needs to be able to run ARM binaries. If you are running Docker for Mac, things are easy because it has built-in support for this. On Linux, there are a few steps that you must take, outlined below.
Adding QEMU binary into your base image
To run ARM binaries in Docker on Linux, the image needs to contain a QEMU binary. You can either choose a base image that already contains the QEMU binary, like the images from Balena, or copy the qemu-arm-static binary into the image during the build, e.g. by adding the following line to your Dockerfile:

```dockerfile
COPY --from=biarms/qemu-bin /usr/bin/qemu-arm-static /usr/bin/qemu-arm-static
```

Security notice: Please be aware that downloading and running an unknown container is like downloading and running an unknown EXE. For anything but hobby projects, you should always use either scanned/vetted images (e.g. Docker Official Images) or container images from trusted organizations or companies.
Then, QEMU needs to be registered on the host OS where you build your Docker images. This can be achieved simply with:

```shell
docker run --rm --privileged multiarch/qemu-user-static:register --reset
```

A complete Dockerfile.arm would then look e.g. something like this:

```dockerfile
FROM arm32v7/alpine:latest
COPY --from=biarms/qemu-bin /usr/bin/qemu-arm-static /usr/bin/qemu-arm-static
# commands to build your app go here…
# e.g. RUN apk add --update <pkgs that you need…>
```

and the image can then be built with:

```shell
docker run --rm --privileged multiarch/qemu-user-static:register --reset
docker build -t my-custom-image-arm . -f Dockerfile.arm
```
Automating builds and registry uploads
The final step is to automate the whole process so that the container images are built and uploaded to a registry from where they can easily be deployed to our k3s cluster. Internally, we use GitLab for our source code management and CI/CD, so we naturally wanted to get these builds running there. GitLab even includes a built-in container registry, so there was no need to set up a separate one.
GitLab has good documentation on building Docker images, so we won't repeat all of it here. After configuring the GitLab Runner for Docker builds, all that is left to do is to create a .gitlab-ci.yml file for the project. In our case it looked like this:

```yaml
image: docker:stable

stages:
  - build
  - release

variables:
  DOCKER_DRIVER: overlay2
  CONTAINER_TEST_IMAGE: ${CI_REGISTRY_IMAGE}/${CI_PROJECT_NAME}-arm:${CI_COMMIT_REF_SLUG}
  CONTAINER_RELEASE_IMAGE: ${CI_REGISTRY_IMAGE}/${CI_PROJECT_NAME}-arm:latest

before_script:
  - docker info
  - docker login -u gitlab-ci-token -p $CI_JOB_TOKEN $CI_REGISTRY

build_image:
  stage: build
  script:
    - docker pull $CONTAINER_RELEASE_IMAGE || true
    - docker run --rm --privileged multiarch/qemu-user-static:register --reset
    - docker build --cache-from $CONTAINER_RELEASE_IMAGE -t $CONTAINER_TEST_IMAGE . -f Dockerfile.arm
    - docker push $CONTAINER_TEST_IMAGE

release:
  stage: release
  script:
    - docker pull $CONTAINER_TEST_IMAGE
    - docker tag $CONTAINER_TEST_IMAGE $CONTAINER_RELEASE_IMAGE
    - docker push $CONTAINER_RELEASE_IMAGE
```
Since the registry is private, the cluster also needs credentials for pulling the images. In Kubernetes, this is done by creating a docker-registry type secret:

```shell
kubectl create secret docker-registry deploycred --docker-server=<your-registry-server> --docker-username=<token-username> --docker-password=<token-password> --docker-email=<your-email>
```

and referencing the secret in the pod spec of the workload:

```yaml
imagePullSecrets:
  - name: deploycred
containers:
  - name: myapp
    image: gitlab.mycompany.com:4567/my/project/my-app-arm:latest
```
Conclusions

One small downside is that k3s doesn't support high availability (a multi-master configuration) yet. Although a single-master setup is already quite resilient, because the services continue running on the agent nodes even if the master goes offline, we'd like to have some redundancy for the master node as well. Apparently, this feature is in the works, but until it is available, we recommend taking backups of the server node's configuration.
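Until multi-master support lands, a server backup can be as simple as archiving the k3s server data directory (the default path of a standard k3s install; the destination file is arbitrary). A minimal sketch:

```shell
# k3s keeps the server state (node token, TLS certs, datastore) here by default;
# creating it first makes the sketch runnable anywhere (a no-op on a real server)
mkdir -p /var/lib/rancher/k3s/server
# Archive the state; restore by extracting over a fresh install while k3s is stopped
tar czf /tmp/k3s-server-backup.tar.gz -C /var/lib/rancher/k3s server
# List the archive contents to verify the backup
tar tzf /tmp/k3s-server-backup.tar.gz
```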
Boogie Software Oy is a private Finnish company headquartered in Oulu, Finland. Our unique company profile builds upon top level software competence, entrepreneurial spirit and humane work ethics. The key to our success is close co-operation with the most demanding customers, understanding their business and providing accurate software solutions for complex problems.
Written by Jari Tenhunen -- 2019-03-31