Oops: Caffeinate Fully Before Considering Docker Prune

Don’t do that…

The internet is not holding its breath for this: like the majority of the stuff on this site, it’s really a memory aid for me. But for what it’s worth, here is the sorry tale…

In the very unlikely event that you have looked at this site at some point between November of 2021 and last Saturday and have noticed a chunk of posts missing, that’s because I had a little bit of an accident. An accident of the unbacked up data deletion variety.

I originally set this up back in 2008 as a way of sharing photos with family and friends, as I either didn’t like the alternatives or they hadn’t been invented yet. It subsequently morphed into a blog principally about the technology aspects of photography, then posts on holidays, and then, as my interest in photography waned and was replaced by other obsessions, random posts on techie projects.

In the summer of 2019, I replaced what had been some standalone software for both the blog and a mail server with some home-rolled containers, as a way of starting to familiarise myself with cloud technology.

This started to get quite complicated (messy), particularly when it came to dealing with Let’s Encrypt certificate rollover in as hands-free a manner as possible.

What Went Wrong…

I have had a long-running problem with the containerised mail server software that I’ve been using for years. It has an anti-spam mechanism which delays responses to connections. This is a known issue, one which both confuses my email clients (which think the server is offline) and which I was never able to resolve.

A few weeks ago I started to experiment with an alternative, which, barring a couple of foibles, is fantastic. So I stood up a separate VM on my hosting service, bought a really cheap domain name, and got it up and running.

Testing cyclically with Docker can turn up some weird state-based problems. Over the weekend, I was staring at two identical docker-compose file sections for the same container on separate machines, one of which worked and one of which didn’t. There was a change in a config file mounted in from the file system that just wasn’t getting picked up. With the benefit of hindsight – there has been a lot of that going around – the path was wrong and I was looking in the wrong directory on the running container.

I thought, just to be safe, I’ll take the nuke-the-planet-from-orbit option: I’ll prune the volumes. This didn’t strike me as a bad idea at the time (spoiler alert: it was a really bad idea) because what I’ve been doing for the last couple of years is just mounting directories in from the file system. As well as making it easy to share volumes between containers in a very readable way, it means it is blindingly obvious what state the files are in without having to shell onto the running container (or figure out where the snap install might have squirrelled them away to).

Unfortunately, the MySQL container was using a volume managed by Docker itself. Also, I brought all of the containers down, rather than just the mail server, before running the prune command.
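To illustrate the distinction (this isn’t my actual compose file, and the service names, images and volume names are purely illustrative): a bind mount points at a host directory that Docker leaves alone, whereas a named volume is created and managed by Docker, and pruning volumes will quite happily delete it once the containers using it are down.

services:
  blog:
    image: wordpress
    volumes:
      - ./wordpress:/var/www/html   # bind mount: the files live in ./wordpress on the host, a prune can't touch them
  db:
    image: mysql:5.7
    volumes:
      - db_data:/var/lib/mysql      # named volume: created and managed by Docker

volumes:
  db_data:                          # with the containers down, pruning volumes deletes this, and the database with it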

Oops.

My last backup had actually been a lot more recent than late 2021, but I deleted the file from a machine with storage constraints a few months back.

Inevitably It Will Happen To You

In the summer of 1990, one of my undergraduate friends became the victim of an early computer virus. It turned out that one of the lab technicians had been using free software – distributed on floppy disks on the front of magazines at the time – on the machines, and infected the lot of them.

Unfortunately, my friend was in the final stages of preparing his thesis for submission and lost the lot. He was able to cobble together a group of volunteers who could touch-type (myself included; reading his notes, which looked like they had been written by a toddler with an unusual gift for macroeconomics, was another matter) and submit it. His tutor was sympathetic and gave him a short extension to the deadline, but it was quite a hectic few days. The experience was in the back of my mind when I was preparing the documentation for my Masters project a couple of years later, and it has stayed with me.

Well, up until last Saturday at least.

I have documented the various photo backup strategies I’ve gone through on this site, going back as far as 2011. After abandoning a labour-intensive process of cutting DVDs, I got a network-based RAID array, which was awful; then I tried a cloud-based backup, which got very expensive over a period of about 5 years. Having discussed it with my wife, I finally decided just to sling another hard drive in a PC and manually copy files over on a folder-by-folder basis. I figured that if the house burns down there will be bigger priorities.

That machine is up for some major work in the next couple of months: the CPU won’t support Windows 11, so I’m going to have to replace both it and the motherboard. I may take another copy of our photos: it’s the only way to be sure :).

So lost to the ether are 3 years of posts, including write-ups of holidays to Korea, Bhutan and Nepal; excruciating detail on writing the software for an E-Ink screen for Google Calendar; software for a touchscreen to control Philips Hue bulbs; and, no doubt, a bunch of other things that I will remember over the coming weeks and months.

Unless you are a Torvalds-esque internet megawatt superstar whose every utterance (or code commit) is round the world in a flash, losing some data that is important to you is absolutely inevitable. Draw a circle around the stuff that matters, think about how you might lose it, and put a structure in place to minimise the impact. Just like I didn’t do :).

In the meantime, I have added a repeating calendar reminder to back up the blog…

Jenkins Container on Kubernetes

For ease of deployment, but without wanting to dive straight into Helm (just yet), I decided to try to stand up a very simple / crude Kubernetes deployment based on Jenkins. Needless to say, it wasn’t as simple as I’d hoped: I ran into a number of permission-related problems.

On first pass, I got a log message which said that Jenkins didn’t have permission to write to /var/jenkins_home/copy_reference_file.log. This post suggested a quick and easy fix, which was to run the container as root. This translates to a runAsUser 0 in a securityContext definition. And, per the recommendation (which includes caveats), the permission problem went away. While the pod started correctly, I then started having persistent problems with connections being refused, even when I tried on localhost from within the container. I suspected (possibly incorrectly) that this was related to the root perms, so, based on this issue, I removed the block, deleted everything that was already installed under the persistent volume, and chown’ed it to 1000 (which is the ubuntu user). This fixed the perms problem, but I was still getting a connection refused.
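For reference, the quick fix looks roughly like this in the deployment’s pod template (a sketch rather than the spec from the linked post):

spec:
  containers:
    - name: jenkins
      image: jenkins/jenkins:lts
      securityContext:
        runAsUser: 0    # runs the container as root; removing this block and chown'ing the volume directory to 1000 was the eventual approach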

I suspect the problem was that I was trying to map a loadBalancer service definition onto port 80, which the non-privileged user 1000 didn’t have perms for. Changing this to the default of 8080 worked. This is the working spec. I’m slightly suspicious about the use of the environment variable for the volume but, as I say, it works.
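The shape of the service that eventually worked is something like the following; the names are illustrative rather than lifted from the linked spec:

apiVersion: v1
kind: Service
metadata:
  name: jenkins
spec:
  type: LoadBalancer
  selector:
    app: jenkins
  ports:
    - port: 8080        # sticking with 8080 end to end avoids the privileged-port problem
      targetPort: 8080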

Running Kubernetes on a Raspberry Pi Cluster

TL;DR:

  • MicroK8s: eminently stable and usable.
  • Q: “Will I end up with something usable when I deploy a workload?” A: It depends on both your budget and design choices.
  • Persistent Volumes: the trickiest part, if you need them.

I have spent some spare time over the last few weeks building up, and then installing Kubernetes onto, a cluster of 3 Raspberry Pis using MicroK8s.

One point worth getting out of the way up front: if you have stumbled here via Google and are interested in finding out whether or not you will have something practical / usable at the end of the exercise, unfortunately that’s not a clear-cut question. On first pass, I didn’t, and that was for a simple workload definition. On a system which may be resource-bound to start with, the decisions that you make on the software stack and hardware / budget choices (storage arguably above all else) are going to have more of an impact than in other environments.

Another reason you might land here is because there isn’t a huge amount of documentation once you get off the beaten track. What I’m principally going to write about here is a couple of problems that I found tough to solve. I’m by no means an expert in Kubernetes – the whole point of this exercise is to learn more about it – and a better-informed person might well have trivially avoided the territory I got myself into.

And to conclude the scene setting: unless you have exceptionally broad experience (I don’t, having lingered at layer 7 for all of my career), it’s a fair bet that you will bump up against technical puzzles that are either outside your comfort zone, or at least outside whatever it is that drove your interest in a project like this in the first place.

I have linked to working examples on my GitHub repo throughout. I’ve not generalised them (e.g. references to NFS paths, etc.) because to all intents and purposes they are my own backups – but there might be some useful pointers in them and in what I’ve documented below.

Here is the list of hardware that I’m currently running:

  • Cluster master (also file and database servers): Pi 4B with 8GB RAM. Storage is a Samsung T5 250 GB SSD.
  • 2x worker nodes: Pi 4B with 4GB RAM. Storage for both is a SanDisk Extreme Pro 32GB micro SD.
  • Power: Anker 63W 5-Port USB charger.
  • Network: TP-Link 5-Port Gigabit Ethernet managed switch.
  • Case: Acrylic rack with cooling fans.

Power
Early in the build process, I started to run into problems with reboots on the master node when it was under load, which were almost certainly due to power. I saw a recommendation for the Anker charger in a blog post on a cluster build; after I bought the adapter, I also saw a comment on Amazon specifically saying to avoid it for the same use. Having tried a number of different options including swapping to a dedicated charger, what ended up stabilising the issue was changing the USB C cable that I was using. My research was inconclusive on whether or not the cables make a difference. There are potentially a few different moving parts here and, to cut to the chase, changing it worked for me (for now!).

Storage Attempt #1: Samba
Without a shadow of a doubt, if your use case for K8s requires persistent volumes, figuring out an approach and then how to get it working is the most complicated part of the cluster setup. Part of my reason for deciding to use the SSD on the higher-spec master was to run some sort of file server on the same hardware. My test case – WordPress; not ground-breaking, but enough to exercise some general principles – needs 3 separate volumes: one for the webroot, one for changing the default file upload size in /usr/local/php/conf.d/, and finally one for an Apache configuration directive, which I’ll come back to shortly (the short version: CIFS-related).
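As a sketch of the container side of that (the volume names are made up, and the mount path for the Apache directive is an assumption rather than something lifted from my spec):

containers:
  - name: wordpress
    image: wordpress:5.6
    volumeMounts:
      - name: webroot
        mountPath: /var/www/html              # the WordPress webroot
      - name: php-conf
        mountPath: /usr/local/php/conf.d      # ini override for the default upload size
      - name: apache-conf
        mountPath: /etc/apache2/conf-enabled  # assumed location for the Apache directive covered below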

I made an arbitrary decision to use Samba for the file server because of a half-formed idea that it might be handy for working on files from my PC, as well as mounting the same directories into pods as volumes. I also use it quite a lot elsewhere, e.g. for transferring data between VMs and my PC. I tried a couple of different K8s storage drivers which I couldn’t get working before finding the CIFS Flexvolume plugin, which was easy to set up and configure.
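For what it’s worth, a volume defined through the plugin looks roughly like this. This assumes the fstab/cifs driver; the share path and secret name are placeholders:

volumes:
  - name: webroot
    flexVolume:
      driver: "fstab/cifs"
      fsType: "cifs"
      secretRef:
        name: cifs-secret                      # holds the Samba username and password
      options:
        networkPath: "//fileserver/wordpress"  # placeholder share on the master Pi
        mountOptions: "dir_mode=0755,file_mode=0644,noperm"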

Database
At the time of writing, there appears to be no support for ARMv8 in the official MySQL container build. I spent about a week’s worth of spare time trying to make my own, complicated by the fact that I decided it would be a fabulous idea to mount the persistent store for the database in over CIFS. It’s not, so I’m pretending it didn’t happen, and am just using the MariaDB container off the shelf. I run this on the same Pi as the master, but as a standalone container outside MicroK8s, accessed via a service.
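For reference, the usual shape of a service that points at something running outside the cluster is a Service with no selector plus a matching Endpoints object; the IP below is a placeholder for the master Pi:

apiVersion: v1
kind: Service
metadata:
  name: mariadb
spec:
  ports:
    - port: 3306
      targetPort: 3306
---
apiVersion: v1
kind: Endpoints
metadata:
  name: mariadb              # must match the Service name
subsets:
  - addresses:
      - ip: 192.168.1.10     # placeholder: the Pi running the standalone MariaDB container
    ports:
      - port: 3306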

MicroK8s
It was the toss of a coin for me whether to go with MicroK8s or K3s. Having gone with the former, it seems pretty solid. I had to do a couple of reinstalls after doing some daft newbie stuff like changing the names and IPs of the nodes after they were joined to the cluster.

Load Balancer using MetalLB
This introduced a problem that was difficult to diagnose: it turned out that the ARP mapping for the dynamically assigned IP wasn’t propagating. I eventually stumbled on the answer by expanding all of the comments below the original question here, which in turn points at this GitHub issue: putting all the WiFi cards into promiscuous mode worked for me.

CIFS and Apache
I went round the houses on this one and, to be clear, it has nothing to do with K8s. I’d gotten to a point where everything seemed to be working but, for some reason, all of the images that I was trying to load from WordPress were broken when the browser rendered them. Trying to refer to them directly by their full URL caused the browser to download them rather than try to display them.

I had an idea early on to drop one of the downloaded files into an editor just to see if there was anything obviously wrong. While I love Sublime, it tripped me up at this point: it tries to display images rather than showing the binary content. Because the image was corrupted, it just displayed a message saying ‘loading image…’ …indefinitely.

I then tried Postman, which showed something I’d never seen before: ‘Parse Error: Expected HTTP/’. Even Wireshark wasn’t much help, just displaying a warning that there were ‘illegal characters found in header name’. That threw me completely: I thought there might be a bug in the WiFi driver (based on the earlier issue with ARP propagation). At this point I bought the switch and abandoned WiFi in favour of wired ethernet, which had the sole effect of delivering the corrupt binary files faster.

Having inadvisedly tried to run a database over CIFS right at the start, I then tried googling to see if the file server might be a contributory factor, and fairly soon after found this. Sure enough, opening one of the corrupt image files in Notepad showed the headers incorrectly prepended to the binary content:

19:25:54 GMT
ETag: "400d-5bf13149eaf08"
Accept-Ranges: bytes
Content-Length: 16397
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Content-Type: image/png

So, the third of the 3 volumes that I mount into my WordPress container spec is an httpd.conf file containing the directive EnableMMAP off.

I’ve been using Apache off and on since its first release in the mid 90s, and have never come across anything like this before. Then again, you are – hopefully! – much less likely to be doing this sort of weird plumbing in a work environment, which throws up problems like this.

Working
This is my working spec for Samba, which:

  • creates a service of type LoadBalancer;
  • refers to the service definition for the database, which is running as an external container;
  • refers to a secret, which is for MySQL;
  • has the three Samba directory mounts defined (each of which itself needs to refer to a secret).

(Note the reference to 5.6 for the WordPress container version, as the latest version seems to have problems constructing the database config on first run. Also note, per the comments immediately below, that this is not a particularly useful example!)

Storage Attempt #2: NFS
Having gotten this far, I decided I should have a look at NFS, which I rejected early on because it was clearly an even worse idea(!?!) than Samba for mounting into MySQL.

I did a little bit of research into any performance differences. While there are plenty of side-by-side comparisons like this, there wasn’t anything that sounded terribly compelling. That said, the load performance that I was seeing over Samba was poor to the point of being unusable: say 7+ seconds to render the WordPress homepage, which has very little content straight out of the box.

I was initially confused by the difference between using persistent volume claims and mounting volumes directly into the pod spec. This was the best explanation of the configuration that I found, and my understanding is that the claim carves out some space in the volume, which starts blank. If you want to start with, say, a bunch of configuration files (or you are picking up an already-installed webroot, as I was from Samba), you are better off mounting directly into the pod. While the blank starting state might not be strictly true, I couldn’t see any other way round it.
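To make the distinction concrete, mounting the export directly into the pod spec looks like the first fragment below, while the claims-based approach swaps it for a reference to the claim (server, path and names are placeholders):

# Direct: the pod mounts the export itself, so it sees whatever is already in it
volumes:
  - name: webroot
    nfs:
      server: 192.168.1.10        # placeholder NFS server
      path: /srv/nfs/wordpress    # placeholder export

# Claims-based: the pod only knows about the claim
volumes:
  - name: webroot
    persistentVolumeClaim:
      claimName: wordpress-webroot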

NFS, it turns out, isn’t without its foibles. On Ubuntu, just doing a test mount of an exported directory, straight out of the install guide, doesn’t work: you have to specify v3 as a command-line option. The example command on this page that works for the client is:

sudo mount -o v3 a-nfs-server:/path/to/export /path/to/mount

Having done a bit of digging, it became clear that the NFS client that MicroK8s is using is v4. This is no doubt for good reason, as the host-based model that v3 uses to authenticate client requests isn’t very robust. Unfortunately, I wasn’t able to find a way of passing a version option into the volume definitions. Back on the Stack Exchange link, there is an answer to get v4 working which looks like this in /etc/exports:

/Path/to/export 192.168.1.0/24(rw,sync,fsid=0,no_root_squash,crossmnt,no_subtree_check,no_acl)

…with the appropriate emphasis being on the no_acl config option. This works, but is obviously wildly unsuitable for industrial use cases.

Claims

So this config works for me with the pod-spec-based volumes, which is the only difference from the Samba variant above. FWIW, it is much faster than using CIFS, and eminently usable – possibly because it is unencumbered by any security(!).

For completeness, this is what I set up for the claims-based approach, starting with the persistent volumes. These are vaguely notable for requiring labels, which are used as selectors by the claims and are necessary when you have more than a single volume; the claims are then in turn referenced by the pod spec.
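Roughly, the moving parts look like this (names, sizes and paths are illustrative): a PersistentVolume carrying a label, and a claim that selects on it and is then referenced from the pod spec.

apiVersion: v1
kind: PersistentVolume
metadata:
  name: wordpress-webroot-pv
  labels:
    content: webroot               # the label the claim selects on
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany
  nfs:
    server: 192.168.1.10           # placeholder
    path: /srv/nfs/wordpress       # placeholder
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: wordpress-webroot
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""             # stop a default storage class from dynamically provisioning instead
  resources:
    requests:
      storage: 5Gi
  selector:
    matchLabels:
      content: webroot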

And there you have it. I’ll add a separate post about some Linkerd stuff, such as how to bind the dashboard to a load balancer, as it took me a while to figure out.