<h2>Scheduling tasks and sharing state with streams (2024-01-24)</h2>
<p>
Recently we built a system that needs to perform two tasks. Task 1 runs every 15 minutes,
task 2 runs every 2 minutes. Task 1 kicks off some background jobs (an upload to
BigQuery), task 2 checks upon the results of these background jobs and does some cleanup
when they are done (delete the uploaded files). The two tasks need to share information back and forth about what
jobs are running in the background.
</p><p>
Now think to yourself (ignore the title for now), what would be the most elegant
way to implement this? I suspect that most developers will come up with a solution that
involves some locking, synchronisation and global state. For example by sharing the
information through a transactional database, or by using a semaphore to prevent the two
tasks from running at the same time plus sharing information in a global variable.
This is understandable: most programming environments simply do not provide better
techniques for these kinds of problems!
</p><p>
However, if your environment supports streams and has some kind of scheduling, here are
two tricks you can use: one for the scheduling of the tasks, the second for sharing
information without a global variable.
</p><p>
Here is an example of the first trick, written in Scala using the ZIO streams library. Read on for an explanation.
</p>
<div class="codeblock"><code class="scala">import zio._
import zio.stream._
def performTask1: Task[Unit] = ???
def performTask2: Task[Unit] = ???
// An enumeration (Scala 2 style) for our tasks.
sealed trait BusinessTask
object Task1 extends BusinessTask
object Task2 extends BusinessTask

ZStream.mergeAllUnbounded()(
  ZStream.fromSchedule(Schedule.fixed(15.minutes)).as(Task1),
  ZStream.fromSchedule(Schedule.fixed(2.minutes)).as(Task2)
)
  .mapZIO {
    case Task1 => performTask1
    case Task2 => performTask2
  }
  .runDrain</code></div>
<p>
We create 2 streams, each stream contains sequential numbers, emitted upon a schedule.
As you can see, the schedule corresponds directly with the requirements. We do not really
care for the sequential numbers, so with stream operator <tt>as</tt> we convert the stream's
emitted values to a value from the <tt>BusinessTask</tt> enumeration.
</p><p>
Then we merge the two streams. We now have a stream that emits the two
enumeration values at the time the corresponding task should run. This is already a big
win! Even when the two schedules produce an item at the same time, the tasks will run
sequentially. This is because by default streams are evaluated without parallelism.
</p><p>
We are not there yet though. The tasks need to share information. They could access a
shared variable but then we still have tightly coupled components and no guarantees
that the shared variable is used correctly.
</p><p>
Also, wouldn't it be great if <tt>performTask1</tt> and <tt>performTask2</tt> were functions that can
be tested in isolation? With streams this is possible.
</p><p>
Here is the second part of the idea. Again, read on for an explanation.
</p>
<div class="codeblock"><code class="scala">case class State(...)
val initialState = State(...)
def performTask1(state: State): Task[State] = ???
def performTask2(state: State): Task[State] = ???
ZStream.mergeAllUnbounded()(
  ZStream.fromSchedule(Schedule.fixed(15.minutes)).as(Task1),
  ZStream.fromSchedule(Schedule.fixed(2.minutes)).as(Task2)
)
  .scanZIO(initialState) { (state, task) =>
    task match {
      case Task1 => performTask1(state)
      case Task2 => performTask2(state)
    }
  }
  .runDrain</code></div>
<p>
We have changed the signatures of the <tt>performTask*</tt> methods. Also, the <tt>mapZIO</tt> operator
has been replaced with <tt>scanZIO</tt>. The stream operator <tt>scanZIO</tt> works much like
<tt>foldLeft</tt> on collections. Like <tt>foldLeft</tt>, it accepts an initial state, and a function
that combines the accumulated state plus the next stream element (of type <tt>BusinessTask</tt>)
and converts those into the next state.
</p><p>
Stream operator <tt>scanZIO</tt> also emits the new states. This allows further common processing.
For example we can persist the state to disk, or collect custom metrics about the state.
</p>
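<p>
A sketch of what such further processing could look like, continuing the example above
(<tt>persistState</tt> is a made-up helper):
</p>
<div class="codeblock"><code class="scala">// Hypothetical helper that writes the latest state to disk:
def persistState(state: State): Task[Unit] = ???

ZStream.mergeAllUnbounded()(
  ZStream.fromSchedule(Schedule.fixed(15.minutes)).as(Task1),
  ZStream.fromSchedule(Schedule.fixed(2.minutes)).as(Task2)
)
  .scanZIO(initialState) { (state, task) =>
    task match {
      case Task1 => performTask1(state)
      case Task2 => performTask2(state)
    }
  }
  .tap(persistState) // runs for every state emitted by scanZIO
  .runDrain</code></div>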
<h4>Conclusion</h4>
<p>
Using libraries with higher level constructs like streams, we can express straightforward
requirements in a straightforward way. With a few lines of code we have solved the
scheduling requirement, and shown an elegant way of sharing information between tasks
without global variables.
</p>
<h2>Discovering scala-cli while fixing my digital photo archive (2023-11-26)</h2>
<p>
Over the years I built up a nice digital photo library with my family. It is a messy process. Here are some of the things that can go wrong:
</p>
<ul>
<li>Digital cameras that add incompatible exif metadata.</li>
<li>Some files have exif tag <tt>CreateDate</tt>, others <tt>DateTimeOriginal</tt>.</li>
<li>Images shared via Whatsapp or Signal do not have an exif date tag at all.</li>
<li>Wrong rotation.</li>
<li>Fuzzy, yet memorable jpeg images which take 15MB because of their resolution and high quality settings.</li>
<li>Badly supported ancient movie formats like <tt>3gp</tt> and <tt>RIFF AVI</tt>.</li>
<li>Old movie formats that need 3 times more disk space than h.265.</li>
<li>Losing almost all your photos because you thought you could copy an iPhoto library using <tt>tar</tt> and <tt>cp</tt> (hint: you can’t). (This took a low-level hard disk scan and months of manual de-duplication work to recover the photos.)</li>
<li>Another low-level scan of an SD card to find accidentally deleted photos.</li>
<li>Date in image file name corresponds to import date, not creation date.</li>
<li>Weird file names that order the files differently than from creation date.</li>
<li>Images from 2015 are stored in the folder for 2009.</li>
<li>etc.</li>
</ul>
<p>
I wrote countless <tt>bash</tt> scripts to mold the collection into order, unfortunately with varying success. However, now that I am ready to import the library into <a href="https://immich.app/">Immich</a> (please, do <a href="https://github.com/immich-app/immich#donation">sponsor</a> them, they are building a very nice product!), I decided to start cleaning up everything.
</p>
<p>
So there I was, writing yet another bash script, struggling with parsing a date response from <tt>exiftool</tt>. And then I remembered the recent articles about <a href="https://scala-cli.virtuslab.org/">scala-cli</a> and decided to try it out.
</p>
<p>
<b>The experience was amazing!</b> Even without proper IDE support, I was able to crank out scripts that did more, more accurately and faster than I could ever have accomplished in bash.
</p>
<p>
Here are some of the things that I learned:
</p>
<ul>
<li>Take the time to learn <a href="https://github.com/com-lihaoyi/os-lib">os-lib</a>.</li>
<li>If the scala code gets harder to write, open a proper IDE and use code completion. Then copy the code over to your <tt>.sc</tt> file.</li>
<li>One well placed <tt>.par</tt> (using <a href="https://github.com/scala/scala-parallel-collections">scala-parallel-collections</a>) can more than quadruple the performance of your script!</li>
<li>You will still spend a lot of time parsing the output from other programs (like <tt>exiftool</tt>); see the sketch after this list.</li>
<li>Scala-cli scripts run very well from Github actions as well.</li>
</ul>
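<p>
To give an impression, here is roughly the shape of such a script. This is a sketch: the
os-lib version and the exiftool flags are simply what I would try first, adapt as needed.
</p>
<div class="codeblock"><code class="scala">//> using scala "2.13"
//> using dep "com.lihaoyi::os-lib:0.9.3"

// Print the creation date of every jpg below the current directory.
val images = os.walk(os.pwd).filter(_.ext == "jpg")

images.foreach { image =>
  // -s3 prints only the tag value, -d forces a single date format
  val dates = os
    .proc("exiftool", "-s3", "-d", "%Y-%m-%d", "-DateTimeOriginal", "-CreateDate", image)
    .call(check = false)
    .out.lines()
  println(s"${dates.headOption.getOrElse("unknown")}  $image")
}</code></div>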
<h2>Conclusions</h2>
<p>
Next time you open your editor to write a bash file, think again. Perhaps you should really write some scala instead.
</p>
<h2>Dependabot, Gradle and Scala (2023-10-08)</h2>
<p>
Due to a series of unfortunate circumstances, we have to deal with a couple of projects at work that use Gradle as the build tool.
For these projects we wanted automatic PR generation for updated dependencies.
Since we use Github Enterprise, using Dependabot seems logical.
However, this turned out to be not very straightforward.
This article documents one way that works for us.
</p>
<p>
As we were experimenting with Dependabot, we discovered the following rules:
</p>
<ol>
<li>The scala version in the artifact name must not be a variable.</li>
<li>A variable for the artifact version is fine, but it must be declared in the same file in the <tt>ext</tt> block.</li>
<li>Versions should follow the Semver specification.</li>
<li>You must not use Gradle’s <tt>+</tt> version range syntax anywhere, Maven’s version range syntax is fine.</li>
</ol>
<p>
In our projects the scala version comes from a plugin.
In addition, we sometimes need to cross build for different scala versions, very much at odds with rule no. 1.
We solved this with a switch statement.
</p>
<p>
With these rules and constraints we discovered that the following structure works for us and Dependabot:
</p>
<div class="codeblock"><code class="java">ext {
jacksonVersion = '2.15.2'
scalaTestVersion = '3.0.8'
}
dependencies {
switch(scalaMainVersion) {
case "2.12":
implementation "com.fasterxml.jackson.module:jackson-module-scala_2.12:$jacksonVersion"
testImplementation "org.scalatest:scalatest_2.12:$scalaTestVersion"
break
case "2.13":
implementation "com.fasterxml.jackson.module:jackson-module-scala_2.13:$jacksonVersion"
testImplementation "org.scalatest:scalatest_2.13:$scalaTestVersion"
break
default:
break
}
// implementation 'com.example:library:0.8+' // Don't do this
implementation 'com.example:library:[0.8,1.0[' // This is fine
}</code></div>
<p>
It took 3 people a month to slowly discover this solution (thank you!).
I hope that you, dear reader, will spend your time more productively.
</p>
<h2>Zio-kafka hacking day (2023-04-20)</h2>
<p>
Not long ago I contacted Steven (committer of the <a href="https://github.com/zio/zio-kafka">zio-kafka</a> library) to get some better understanding of how the library works. April 12, not more than 2 months later I am a committer, and I was sitting in a room together with Steven, Jules Ivanic (another committer) and wildcard Pierangelo Cecchetto (contributor), hacking on zio-kafka.
</p>
<p>
The meeting was an idea of Jules, who was "in the neighborhood". He was traveling from Australia for his company (<a href="https://www.conduktor.io/">Conduktor</a>). We were able to get a nice room in the Amsterdam office of my employer (<a href="https://www.adevinta.com/">Adevinta</a>). Amsterdam turned out to be a nice middle ground for Steven, me and Pierangelo. (Special thanks to <a href="https://godatadriven.com/">Go Data Driven</a> who also had a place for us.)
</p>
<p>
In the morning we spoke about current and new ideas on how to improve the library. Also, we shared detailed knowledge on ZIO and what Kafka expects from its users. After lunch we started hacking. Having someone nearby to start an ad hoc discussion with turned out to be very productive; we were able to move some tough issues forward.
</p>
<p>
Here are some highlights.
</p>
<p>
<a href="https://github.com/zio/zio-kafka/pull/788">PR #788 â Wait for stream end in rebalance listener</a> is important to prevent duplicates during a rebalance process. This PR was mostly finished for quite some time, but many details made the extensive test suite fail. We were able to solve many of these issues.
</p>
<p>
In the area of performance we implemented an idea to replace buffering (pre-fetching a fixed number of polls) with pre-fetching based on the stream’s queue size. This resulted in <a href="https://github.com/zio/zio-kafka/pull/803">PR #803 – Alternative backpressure mechanism</a>.
</p>
<p>
We also laid the seeds for another performance improvement: <a href="https://github.com/zio/zio-kafka/pull/809">PR #809 – Optimistically resume partitions early</a>.
</p>
<p>
These last two PRs showed great performance improvements bringing us much closer to direct usage of the Java Kafka client. All 3 PRs are now in review.
</p>
<p>
All in all it was a lot of fun to meet fellow enthusiasts and hack on the complex machinery that is inside zio-kafka.
</p>
<h2>Kafka is good for transport, not for system boundaries (2023-01-29)</h2>
<article>
<p>In the last years I have learned that you should not run Kafka as a system boundary. A system boundary in this
article is the place where messages are passed from one
<abbr title="An autonomy domain is a software (sub-)system that is governed by a team that does their own planning and will therefore schedule their work independently from other teams.">autonomy domain</abbr>
to another.
</p>
<p>Now why is that? Let’s look at two classes of problems: connecting to Kafka and the long feedback loop. To
prove my points, I am going to bore you with long stories from my personal experience. You may be in a different
situation, YMMV!
</p>
<h2 id="problem-1-connecting-to-kafka-is-hard">Problem 1: Connecting to Kafka is hard</h2>
<p>Compared to calling an HTTP endpoint, sending messages to Kafka is much much
harder<a href="#high-volume-http" aria-describedby="footnote-label" id="high-volume-http-ref"></a>.
</p>
<p>Don’t agree? Watch out for observation bias! During my holiday we often have long highway drives through
unknown countries. After looking at a highway for several hours non-stop, you might be inclined to believe that
the entire country is covered by a dense highway network. In reality though, the next highway might be 200km
away. A similar thing can happen at work. My part of the company offers Kafka as a service. We also run several
services that invariably use Kafka in some way. We have deep knowledge and experience. It would be easy to think
that Kafka is simple for everyone. However, for the rest of the company this Kafka thing is just another far
away system that they have to integrate with and knowledge will be spotty and incomplete.
</p>
<p>Let’s look at some of the problems that you have to deal with.
</p>
<h3 id="partitioning-is-hard">Partitioning is hard</h3>
<p>It is easier to deal with partitioning problems when you control both the producer and the broker. We once had a
problem where our systems could not keep up with the inflow of Kafka messages for one of the producers. The
weird
thing is that most of the machines were just idling. The problem grew slowly, so it took us some time before we
realized it was caused by some partitions having most of the traffic. Producers of Kafka events do not always
realize the effect of wrongly chosen key values. When many messages have the same key they end up in the same
partition. It took some time before we managed to convince the producing team that the message key had to change.
</p>
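<p>To make the mechanism concrete, here is a sketch with the Java Kafka client (the topic and
key values are made up). The default partitioner hashes the record key to pick a partition, so a
low-cardinality key concentrates most traffic on a few partitions:
</p>
<div class="codeblock"><code class="scala">import org.apache.kafka.clients.producer.ProducerRecord

val payload: String = "..."

// A low-cardinality key (a country code): most records land on a few partitions.
val skewed = new ProducerRecord[String, String]("clicks", "NL", payload)

// A high-cardinality key (a user id): records spread evenly over all partitions.
val spread = new ProducerRecord[String, String]("clicks", "user-12345", payload)</code></div>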
<p>When you run an HTTP endpoint, spreading traffic and
partitioning<a href="#sticky-session" aria-describedby="footnote-label" id="sticky-session-ref"></a>
is handled by the load-balancer and is therefore under control of the receiver and not the sender.
</p>
<h3 id="cross-network-connections-are-hard">Cross network connections are hard</h3>
<p>Producers and the Kafka brokers need to have the same view of the network. This is because the brokers will tell
a producer which broker (by DNS name or IP address) it needs to connect to for each partition. This might go
wrong when the producers and brokers use a different DNS server, or when they are on networks with colliding IP
address ranges<a href="#both" aria-describedby="footnote-label" id="both-ref"></a>. Getting this right is a lot
easier when you’re running everything in a single network you control.
</p>
<p>This is not a problem with HTTP endpoints. Producers only need 1 hostname and optionally an HTTP proxy.
</p>
<p>We didn’t talk about authentication and encryption yet. Kafka is very flexible; it has many knobs and
settings in this area and the producers have to be configured exactly right or else it just won’t work.
And don’t expect good error messages<a href="#oom" aria-describedby="footnote-label" id="oom-ref"></a>.
Good documentation and cooperation are required to make this work across different teams.
</p>
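<p>To give a feel for how exact "exactly right" is, here is a sketch of the settings that all
have to line up before a SASL_SSL producer will even connect (every value shown is made up):
</p>
<div class="codeblock"><code class="scala">// Hypothetical values; get one of these wrong and the producer will not connect.
val producerProps = Map(
  "bootstrap.servers"       -> "kafka-1.example.com:9093",
  "security.protocol"       -> "SASL_SSL",
  "sasl.mechanism"          -> "SCRAM-SHA-512",
  "sasl.jaas.config"        -> """org.apache.kafka.common.security.scram.ScramLoginModule required username="svc" password="...";""",
  "ssl.truststore.location" -> "/etc/kafka/truststore.jks",
  "ssl.truststore.password" -> "..."
)</code></div>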
<p>With HTTP endpoints, encryption is very well-supported through https. Authentication is straightforward with
HTTP’s basic authentication<a href="#no-oauth" aria-describedby="footnote-label" id="no-oauth-ref"></a>.
</p>
<h2 id="problems-that-have-been-solved">Problems that have been solved</h2>
<p>Just for completeness here are some problems from around 2019 that have since been solved.
</p>
<p>Around 2019 Kafka did not support authentication and TLS out of the box. Crossing untrusted networks was quite
cumbersome.
</p>
<p>Also around that time you had to be very careful about versioning. The client and server had to be upgraded in a
very controlled order. Today this looks much better; you can combine almost any client and server version.
</p>
<p>The default partitioner would give slow brokers more, instead of less, work.
<a href="https://cwiki.apache.org/confluence/display/KAFKA/KIP-794%3A+Strictly+Uniform+Sticky+Partitioner">This
has been solved</a> a few months ago.
</p>
<h2 id="problem-2-long-feedback-loop">Problem 2: Long feedback loop</h2>
<p>When messages are being given to you via Kafka, you cannot reject them. They are send-and-forget; the producer
no longer cares. Dealing with invalid messages is now your responsibility.
</p>
<p>In one of our projects we used to set invalid messages apart and offer Slack alerts so that the producers knew
they had to look at the validation errors. Unfortunately, it didn’t work well. The feedback loop was
simply too long and the number of invalid messages stayed high.
</p>
<p>Later we introduced an HTTP endpoint in which we reject invalid messages with a 400 response. This simple change
was nothing less than a miracle. For every producer that switched, the vast majority of
invalid messages<a href="#fixing" aria-describedby="footnote-label" id="fixing-ref"></a>
disappeared. The number of invalid messages has remained very low since then.
</p>
<p>Because we were able to reject invalid messages the feedback loop shortened and became much more effective.
</p>
<h2 id="conclusions">Conclusions</h2>
<p>Kafka within your own autonomy domain can be a great solution for message transport. However, Kafka as a boundary
between autonomy domains will hurt.
</p>
<div class="article-footer">
<h2 id="footnote-label">Footnotes</h2>
<ol>
<li id="high-volume-http">Though at high enough volume, HTTP is not easy either; you’ll need proper
connection pooling and an endpoint that accepts batches or else deploy a huge server park.
<a href="#high-volume-http-ref" aria-label="Back to content">↩</a></li>
<li id="sticky-session">Many load balancers offer sticky sessions which is a weak form of partitioning.
<a href="#sticky-session-ref" aria-label="Back to content">↩</a></li>
<li id="both">We suffered both. <a href="#both-ref" aria-label="Back to content">↩</a></li>
<li id="oom">When your authentication settings are wrong, the Kafka command line tools tell you that by
showing an OutOfMemoryError. My head still hurts from this one.
<a href="#oom-ref" aria-label="Back to content">↩</a></li>
<li id="no-oauth">Though unfortunately, many architects will make this complex by using oauth or other such
systems. <a href="#no-oauth-ref" aria-label="Back to content">↩</a></li>
<li id="fixing">Most invalid messages could be fixed with a few minutes of coding time.
<a href="#fixing-ref" aria-label="Back to content">↩</a></li>
</ol>
</div>
</article>
<h2>ZIO service layer pattern (2022-12-04)</h2>
<p>
While reading about <a href="https://degoes.net/articles/zio-config">ZIO-config in 2.0.4</a>, the following pattern to create services caught my eye. I am copying it here for easy lookup. Enjoy.
</p>
<div class="codeblock"><code class="scala">val myLayer: ZLayer[PaymentRepo, Nothing, MyService] =
ZLayer.scoped {
for {
repo <- ZIO.service[PaymentRepo]
config <- ZIO.config(MyServiceImpl.config)
ref <- Ref.make(MyState.Initial)
impl <- ZIO.succeed(new MyServiceImpl(config, ref, repo))
_ <- impl.initialize
_ <- ZIO.addFinalizer(impl.destroy)
} yield impl
  }</code></div>
<h2>Speed up ZIOs with memoization (2022-11-05)</h2>
<p>
<b>TLDR:</b> You can do ZIO memoization in just a few lines; however, use <a href="https://zio.dev/ecosystem/officials/zio-cache/">zio-cache</a> for more complex use cases.
</p>
<p>
Recently I was working on fetching Avro schemas from a schema registry.
Avro schemas are immutable and therefore perfectly cacheable.
Also, the number of possible schemas is limited, so cache eviction is not needed.
We can simply cache every schema forever in a plain hash-map.
In other words, we are doing <a href="https://en.wikipedia.org/wiki/Memoization">memoization</a>.
</p>
<p>
Since this was the first time I did this in a ZIO based application, I looked around for existing solutions.
What I wanted was something like this:
</p>
<div class="codeblock"><code class="scala">def fetchSchema(schemaId: String): Task[Schema] = {
val fetchFromRegistry: Task[Schema] = ???
fetchFromRegistry.memoizeBy(key = schemaId)
}
</code></div>
<p>
Frankly, I was a bit disappointed that ZIO does not already support this out of the box.
However, as you'll see in this article, the proposed syntax only works for simple use cases.
(Actually, there is <a href="https://javadoc.io/static/dev.zio/zio_3/2.0.3/zio/ZIO.html#memoize-63f">ZIO.memoize</a> but that is even simpler and only caches the result for a single ZIO instance, not for any instance that gives the same value.)
</p>
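<p>
To illustrate the difference, here is a minimal sketch of what <tt>ZIO#memoize</tt> does
(the <tt>expensive</tt> effect is made up):
</p>
<div class="codeblock"><code class="scala">import zio._

val expensive: Task[Int] = ZIO.attempt { println("computing"); 42 }

val once: Task[(Int, Int)] =
  for {
    memoized <- expensive.memoize // a new effect that caches its own result
    a        <- memoized          // prints "computing"
    b        <- memoized          // served from the cache, prints nothing
  } yield (a, b)</code></div>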
<p>
Let's continue anyway and implement it ourselves.
</p>
<p>
The idea is that method <tt>memoizeBy</tt> first looks in a map using the given key.
If the value is not present, we get the result from the original ZIO and store it in the map.
If the value is present, it is used and the original ZIO is not executed.
</p>
<p>
A map, yes, we also need to give the method a map!
The map might be used and updated concurrently.
I chose to wrap an immutable map in a Ref, but you could also use a <tt>ConcurrentMap</tt>.
</p>
<p>Here we go:</p>
<div class="codeblock"><code class="scala">import zio._
import scala.collection.immutable.Map
implicit class ZioMemoizeBy[R, E, A](zio: ZIO[R, E, A]) {
  def memoizeBy[B](cacheRef: Ref[Map[B, A]], key: B): ZIO[R, E, A] = {
    for {
      cache <- cacheRef.get
      value <- cache.get(key) match {
                 case Some(value) => ZIO.succeed(value)
                 case None        => zio.tap(value => cacheRef.update(_.updated(key, value)))
               }
    } yield value
  }
}
</code></div>
<p>
That is it, just a few lines of code to put in some corner of your application.
</p>
<p>
Here is a full example with <tt>memoizeBy</tt> using <a href="https://zio.dev/guides/migrate/zio-2.x-migration-guide/#service-pattern">Service Pattern 2.0</a>:
</p>
<div class="codeblock"><code class="scala">import org.apache.avro.Schema
import zio._
import scala.collection.immutable.Map
trait SchemaFetcher {
  def fetchSchema(schemaId: String): Task[Schema]
}

object SchemaFetcherLive {
  val layer: ZLayer[Any, Throwable, SchemaFetcher] = ZLayer {
    for {
      // Create the Ref and the Map:
      schemaCacheRef <- Ref.make(Map.empty[String, Schema])
    } yield SchemaFetcherLive(schemaCacheRef)
  }
}

case class SchemaFetcherLive(
  schemaCache: Ref[Map[String, Schema]]
) extends SchemaFetcher {
  def fetchSchema(schemaId: String): Task[Schema] = {
    val fetchFromRegistry: Task[Schema] = ???
    // Use memoizeBy to make fetchFromRegistry more efficient!
    fetchFromRegistry.memoizeBy(schemaCache, schemaId)
  }
}
</code></div>
<p><b>Discussion</b></p>
<p>
Note how we're using the default immutable Map.
Because it is immutable, all threads can read from the map at the same time without synchronization.
We only need some synchronization using Ref, to atomically replace the map after a new element was added.
</p>
<p>
When two requests for the same key come in at roughly the same time, both are executed, and both lead to an update of the map.
This is not as advanced as e.g. <a href="https://zio.dev/ecosystem/officials/zio-cache/">zio-cache</a>, which detects multiple simultaneous requests for the same key.
In the presented use case this is not a problem and very unlikely to happen often anyway.
</p>
<p>
Can we improve further?
Yes, we can!
If you look at method <tt>fetchSchema</tt> in the example, you see that a <tt>fetchFromRegistry</tt> ZIO is constructed, but we do not use it when the value is already present.
And even worse, the value already being present is the common case!
This is not very efficient.
If efficiency is a problem, another API is needed.
Zio-cache does not have this problem.
In zio-cache the cache is aware of how to look up new values (it is a <a href="https://github.com/google/guava/wiki/CachesExplained#from-a-cacheloader">loading cache</a>).
So here is a trade off: efficiency with a more complex API, or highly readable code.
</p>
<p><b>Using zio-cache</b></p>
<p>
For completeness, here is (almost) the same example using zio-cache:
</p>
<div class="codeblock"><code class="scala">import org.apache.avro.Schema
import zio._
import zio.cache.{Cache, Lookup}
trait SchemaFetcher {
  def fetchSchema(schemaId: String): Task[Schema]
}

object ZioCacheSchemaFetcherLive {
  val layer: ZLayer[SomeService, Throwable, SchemaFetcher] = ZLayer {
    for {
      someService <- ZIO.service[SomeService]
      // the fetching logic can use someService:
      fetchFromRegistry: String => Task[Schema] = ???
      // create the cache:
      cache <- Cache.make(
                 capacity = 1000,
                 timeToLive = Duration.Infinity,
                 lookup = Lookup(fetchFromRegistry)
               )
    } yield ZioCacheSchemaFetcherLive(cache)
  }
}

case class ZioCacheSchemaFetcherLive(
  cache: Cache[String, Throwable, Schema]
) extends SchemaFetcher {
  def fetchSchema(schemaId: String): Task[Schema] = {
    // use the loader cache:
    cache.get(schemaId)
  }
}</code></div>
<p>
We now need a reference to <tt>fetchFromRegistry</tt> while constructing the layer.
This complicates the code a bit; we can no longer define <tt>fetchFromRegistry</tt> in the case class.
In the example we pull in a <tt>SomeService</tt> so that we can put the definition of <tt>fetchFromRegistry</tt> into the for comprehension and stick to Service Pattern 2.0.
Perhaps we should completely move it to another service so that we can write <tt>lookup = Lookup(someService.fetchFromRegistry)</tt>.
That, I'll leave as an exercise to the reader.
</p>
<p><b>Conclusion</b></p>
<p>
For simple use cases like fetching Avro schemas, this article presents an appropriately lightweight way to do memoization.
If you need more features such as eviction and detection of concurrent invocations, I recommend zio-cache.
</p>
<h4>Update 2024-01-24</h4>
<p>Here is a version of <tt>memoizeBy</tt>, here called <tt>cachedBy</tt>, that only fetches a value once, even when two fibers request it concurrently.
The second fiber is semantically blocked until the first fiber has produced the value.</p>
<div class="codeblock"><code class="scala">import zio._
object ZioCaching {
  implicit class ZioCachedBy[R, E, A](zio: ZIO[R, E, A]) {
    def cachedBy[B](cacheRef: Ref[Map[B, Promise[E, A]]], key: B): ZIO[R, E, A] = {
      for {
        newPromise <- Promise.make[E, A]
        actualPromise <- cacheRef.modify { cache =>
                           cache.get(key) match {
                             case Some(existingPromise) => (existingPromise, cache)
                             case None                  => (newPromise, cache + (key -> newPromise))
                           }
                         }
        _ <- ZIO.when(actualPromise eq newPromise) {
               zio.intoPromise(newPromise)
             }
        value <- actualPromise.await
      } yield value
    }
  }
}</code></div>
<h2>Zigzag bytes (2022-06-08)</h2>
<p>I was playing around with a goofy idea for which I needed <a href="https://en.wikipedia.org/wiki/Variable-length_quantity#Zigzag_encoding">zigzag encoding</a> for <i>bytes</i>. Zigzag encoding is often used in combination with variable length encoding in things like Avro, Thrift and Protobuf.</p>
<p>In zigzag encoded integers, the least significant bit is used for sign. To convert from regular encoding (2-complement) to zigzag (and back) you can use the following Scala code:</p>
<div class="codeblock"><code class="scala">def i32ToZigZag(n: Int): Int = (n << 1) ^ (n >> 31)
def zigZagToI32(n: Int): Int = (n >>> 1) ^ - (n & 1)
def i64ToZigZag(n: Long): Long = (n << 1) ^ (n >> 63)
def zigZagToI64(n: Long): Long = (n >>> 1) ^ - (n & 1)
</code></div>
<p>Translate this to Java and the expressions after the <tt>=</tt> look exactly the same.</p>
<p>Using these bit shifting tricks for bytes is a whole lot more difficult. The problem is that Scala (like Java) does not support bit operations on <tt>Byte</tt>s. They always convert them to an <tt>Int</tt> first.</p>
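<p>A quick illustration of the problem:</p>
<div class="codeblock"><code class="scala">val n: Byte = -1
val shifted = n << 1 // n is widened first; the result is the Int -2, not a Byte
</code></div>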
<p>After a lot of fiddling, I settled on the following:</p>
<div class="codeblock"><code class="scala">private def b(i: Int): Byte = (i & 0xff).toByte
def i8ToZigZag(n: Byte): Byte = (b(n << 1) ^ (n >> 7)).toByte
def zigZagToI8(n: Byte): Byte = b(((n & 0xff) >>> 1) ^ (256 - (n & 1)))
</code></div>
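<p>A quick sanity check (a sketch) that the byte variants round-trip and zigzag as expected:</p>
<div class="codeblock"><code class="scala">(Byte.MinValue.toInt to Byte.MaxValue.toInt).map(_.toByte).foreach { n =>
  assert(zigZagToI8(i8ToZigZag(n)) == n)
}
assert(i8ToZigZag(0) == 0)  // 0 encodes to 0
assert(i8ToZigZag(-1) == 1) // -1 encodes to 1
assert(i8ToZigZag(1) == 2)  // 1 encodes to 2
</code></div>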
<p>Translated to Java it should look like this (not tested!):</p>
<div class="codeblock"><code class="java">private byte b(int i) { return (byte)(i & 0xff); }
public byte i8ToZigZag(byte n) { return (byte)(b(n << 1) ^ (n >> 7)); }
public byte zigZagToI8(byte n) { return b(((n & 0xff) >>> 1) ^ (256 - (n & 1))); }
</code></div>
<p>Is there a better way to do this?</p>
<h2>Upgrading Libreoffice with Homebrew (2022-03-12)</h2>
<p><b>Update 2023-06-06:</b> brew now asks for your password so it can install everything directly. Much better!!</p>
<p>The text below is no longer applicable and is only kept as a reference.</p>
<hr/>
<p>Reminder to self: this is the procedure to upgrade Libreoffice with Homebrew:
<ol>
<li><tt>brew update</tt></li>
<li><tt>brew upgrade</tt></li>
<li><tt>open -a /Applications/LibreOffice.app</tt></li>
<li>Quit the application</li>
<li><tt>brew reinstall libreoffice-language-pack</tt>, enter your password</li>
<li><tt>open "/usr/local/Caskroom/libreoffice-language-pack/$(cd /usr/local/Caskroom/libreoffice-language-pack; ls -1 | sort -rV | head)/LibreOffice Language Pack.app"</tt>, click 'Ok', click 'Ok'</li>
</ol>
</p>
<p>Please anyone, please make this simpler...</p>
<p><b>Update 2021-03-24</b></p>
<p>Here is a script to remove most of the manual toil:</p>
<div class="codeblock"><code class="shell">#!/bin/bash
echo "Initiating Libre Office upgrade"
brew update
brew upgrade libreoffice
open -g -a /Applications/LibreOffice.app
echo "Wait until the application completed startup (it is started in the background)"
read -p "Press enter to quit LibreOffice"
osascript -e 'quit app "LibreOffice"'
APP=$(brew reinstall libreoffice-language-pack | tee /dev/tty | grep "LibreOffice Language Pack.app" | xargs)
open -a "$APP"
echo "The language pack installer has been opened."</code></div>
<h2>Having fun with Ordering in Scala (2022-01-26)</h2>
<p><b>Challenge:</b> sort a list of objects by name, but some names have priority. If these names appear, they should be ordered by the position they have in the priority list.</p>
<p>For example:</p>
<div class="codeblock"><code class="scala">val priorityList = Seq("Willow", "James", "Ezra")
val input = Seq("Olivier", "Charlotte", "Willow", "Declan", "Aurora", "Ezra")
val ordered = ???
assert(ordered == Seq("Willow", "Ezra", "Aurora", "Charlotte", "Declan", "Olivier"))
</code></div>
<p>Challenge accepted.</p>
<p>Luckily Scala has strong support for sorting in the standard library. All sequences have a <tt>sorted</tt> method which accepts an <tt>Ordering</tt>. The ordering is an implicit parameter which means that normally we don't need to provide it; it will default to the natural ordering of the items. However, we are going to provide this parameter explicitly. Let's build an <tt>Ordering</tt>!</p>
<p>The challenge explains we have 2 orderings:
<ol>
<li>first order by the priority list</li>
<li>failing that, order by alphabet</li>
</ol>
</p>
<p>Let's focus on the first ordering. The idea is to assign an integer 'priority-value' to each possible string that is based on the position in the priority list. If the string is not in the list, we use some high integer. The first ordering will simply order by this priority-value. Lower numbers go before higher numbers, just like the natural ordering of integers.</p>
<div class="codeblock"><code class="scala">// attempt 1
val priorityValue = priorityList.indexOf(name)
</code></div>
<p>This works well for any name on the priority list. E.g. <tt>Willow</tt> gets <tt>0</tt> and <tt>Ezra</tt> gets <tt>2</tt>. Unfortunately, all the other names get priority value <tt>-1</tt> which orders them even <em>before</em> <tt>Willow</tt>. We need to convert the <tt>-1</tt> to something higher.<br>
Since I like to program without <tt>if</tt> statements whenever possible, I looked at math for a solution. Modulus can do the trick:</p>
<div class="codeblock"><code class="scala">// attempt 2
val priorityValue = priorityList.indexOf(name) % priorityList.size
</code></div>
<p>Oops, wrong modulus implementation: <tt>-1 % 3 == -1</tt>. Let's use <tt>floorMod</tt>:</p>
<div class="codeblock"><code class="scala">// attempt 3
val priorityValue = Math.floorMod(priorityList.indexOf(name), priorityList.size)
</code></div>
<p>Now <tt>-1</tt> gets converted into <tt>priorityList.size - 1</tt>, which unfortunately collides with the priority-value of the last entry of the priority list. Since we don't really care what the higher number is, as long as it is higher than any real index, we can simply use <tt>Int.MaxValue</tt> as the modulus; <tt>-1</tt> then becomes <tt>Int.MaxValue - 1</tt>:</p>
<div class="codeblock"><code class="scala">val priorityValue = Math.floorMod(priorityList.indexOf(name), Int.MaxValue)
</code></div>
<p>Now we wrap that in an <tt>Ordering</tt>:</p>
<div class="codeblock"><code class="scala">val priorityOrdering: Ordering[String] =
Ordering.by(name => Math.floorMod(priorityList.indexOf(name), Int.MaxValue))
</code></div>
<p>Unfortunately Scala can't infer the type here. We either need to annotate the value directly, or add a type to the <tt>name</tt> parameter.</p>
<p>Now we need to add the second ordering. We can simply use <tt>Ordering.String</tt> from the library. We combine the orderings with <tt>orElse</tt>, available since Scala 2.13. The second ordering is used when <tt>priorityOrdering</tt> can't decide because the priority-value is the same.<br>
Note that for the special case where we compare a priority value with itself, e.g. <tt>Willow</tt> with <tt>Willow</tt>, the second ordering is also applied. This is okay though, the outcome doesn't change because these values are the same for the alphabetic ordering also.</p>
<p>Here is the complete code:</p>
<div class="codeblock"><code class="scala">val priorityList = Seq("Willow", "James", "Ezra")
val priorityOrdering: Ordering[String] =
Ordering.by(name => Math.floorMod(priorityList.indexOf(name), Int.MaxValue))
val combinedOrdering: Ordering[String] =
priorityOrdering.orElse(Ordering.String)
val input = Seq("Olivier", "Charlotte", "Willow", "Declan", "Aurora", "Ezra")
val ordered = input.sorted(combinedOrdering)
assert(ordered == Seq("Willow", "Ezra", "Aurora", "Charlotte", "Declan", "Olivier"))
</code></div>
<p>We're almost there. The challenge was to work on any object. Let's wrap it up a bit and also make it work for any type of priority value:</p>
<div class="codeblock"><code class="scala">def priorityOrdering[A, B : Ordering](priorityList: Seq[B], by: A => B): Ordering[A] = {
def priorityValue(b: B): Int = Math.floorMod(priorityList.indexOf(b), Int.MaxValue)
Ordering.by[A, Int](a => priorityValue(by(a))).orElseBy(by)
}
</code></div>
<p>We can use it like this:</p>
<div class="codeblock"><code class="scala">val ordered = input.sorted(priorityOrdering[String, String](priorityList, identity))
</code></div>
<p>Or like this. Here we sort persons by birthdate, ordering today before the other days:</p>
<div class="codeblock"><code class="scala">case class Person(name: String, birthdate: MonthDay)
val persons: Seq[Person] = ???
val priorityDates = Seq(MonthDay.now())
persons.sorted(priorityOrdering(priorityDates, (_: Person).birthdate))
</code></div>
<p><b>Some remarks</b></p>
<p>Note that in all these cases type inference is quite awful. The compiler has problems finding the correct types, even though all the information is available.</p>
<p>You should also know that you can't use this approach if you are stuck on Scala 2.12 or earlier since <tt>Ordering.orElse</tt> is not available there.</p>
<p><b>Alternative</b></p>
<p>You can side-step all the type derivation problems by using the <tt>sortBy</tt> method. Give it a function that returns a tuple of <tt>Int</tt>s, <tt>String</tt>s, or anything for which an <tt>Ordering</tt> is already defined. The sequence is then sorted on the first value of the tuple, then on the second value, etc.:</p>
<div class="codeblock"><code class="scala">def priorityValue(name: String): Int =
Math.floorMod(priorityList.indexOf(name), Int.MaxValue)
val ordered = input.sortBy(name => (priorityValue(name), name))
</code></div>
<p><b>Conclusion</b></p>
<p>Although I had fun learning all about <tt>Ordering</tt>, next time I'll avoid it and go directly for <tt>sortBy</tt>.</p>
<h2>From Adoptopenjdk to Temurin on a Mac using Homebrew (2022-01-10)</h2>
<p>Adoptopenjdk joined the Eclipse foundation and renamed their JDK to Temurin. Here are instructions on how to migrate on Macs with Homebrew.</p>
<p>The following instructions remove any Adoptopenjdk JDKs you may still have:</p>
<div class="codeblock"><code class="shell">brew remove adoptopenjdk/openjdk/adoptopenjdk8
brew remove adoptopenjdk/openjdk/adoptopenjdk11
brew untap AdoptOpenJDK/openjdk
brew remove adoptopenjdk8
brew remove adoptopenjdk11
brew remove adoptopenjdk
</code></div>
<p>Use <tt>/usr/libexec/java_home -V</tt> to get an overview of any other JDK you may still have. Just delete what you don't need any more.</p>
<p>Then install Temurin 8, 11 and 17. The first command (<tt>brew tap …</tt>) is only needed in case you need Temurin 8 or 11:</p>
<div class="codeblock"><code class="shell">brew tap homebrew/cask-versions
brew install --cask temurin8
brew install --cask temurin11
brew install --cask temurin
</code></div>
<p>Bonus: execute the following to define aliases that let you easily switch between Java versions:</p>
<div class="codeblock"><code class="shell">cat <<-EOF >> ~/.zshrc
# Aliases for switching java version
alias java17="export JAVA_HOME=\$(/usr/libexec/java_home -v 17)"
alias java11="export JAVA_HOME=\$(/usr/libexec/java_home -v 11)"
alias java8="export JAVA_HOME=\$(/usr/libexec/java_home -v 1.8)"
java11
EOF
</code></div>
<p>Are you looking for more power? For example you need to test for many more JDKs? Then maybe <a href="https://sdkman.io/">Sdkman</a> is something for you.</p>
<h2>Customizing the Jitsi Meet UI in a Docker deployment (2021-12-19)</h2>
<p>
I manage a Jitsi instance for a small for-benefit organization. I wanted
to make some changes to the UI to make it visually belong to the organization.
Unfortunately, Jitsi doesn't make it easy to do this. Upon every upgrade
<a href="https://github.com/jitsi/jitsi-meet/issues/7354"
>your changes are gone</a
>. This post describes a workaround for Jitsi deployments that use Docker.
</p>
<p>
Although the details can be hairy, the idea is quite simple. We are going
to put another layer over the provided Docker image called 'web'. The additional layer
contains all the changes we need. When Jitsi publishes an update, we just
apply the changes again as part of the deployment process.
</p>
<p>
Our starting point is the <tt>docker-compose.yaml</tt> provided by
<a href="https://docs.easyjitsi.com/docs/docker">Jitsi</a>. Make all
the changes as instructed. However, before we make any changes to the UI,
you should make sure your Jitsi instance is working.
</p>
<p>It's working? Congratulations! Start with a small change to your <tt>docker-compose.yaml</tt>.<br>Replace:</p>
<div class="codeblock"><code class="yaml"> web:
image: jitsi/web:latest</code>
</div>
<p>with:</p>
<div class="codeblock"><code class="yaml"> web:
build: ./jitsi-web</code>
</div>
<p>This sets you up for building your own Docker image.</p>
<p>Create the <tt>jitsi-web</tt> directory, and put all the artwork
you want to override in it. You should end up with a directory
structure like this (more details follow):</p>
<div class="separator" style="clear: both;"><a href="https://www.grons.nl/day2daystuff/20211219_jitsi-docker-overrides.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="Directory structure" border="0" width="320" data-original-height="642" data-original-width="658" src="https://www.grons.nl/day2daystuff/20211219_jitsi-docker-overrides.png"/></a></div>
<p>The <tt>Dockerfile</tt> in the <tt>jitsi-web</tt> initially has just one line like this:</p>
<div class="codeblock"><code class="plaintext">FROM jitsi/web</code></div>
<p>Build the image and deploy it with:</p>
<div class="codeblock"><code class="shell">docker-compose build --pull
docker-compose up -d
</code></div>
<p>Make sure that Jitsi is still working.</p>
<p>Now it is your turn to get creative. With some <tt>RUN</tt>
instructions you can change any file in the base image.</p>
<p>To get you started, I'll show what is in my <tt>Dockerfile</tt>. Details are discussed directly below:</p>
<div class="codeblock"><code class="plaintext">FROM jitsi/web
# Add wasm mime type
# https://community.jitsi.org/t/loading-wasm-webassembly-file-on-jitsimeetjs/68071/3
RUN sed -i '/}/i \
application/wasm wasm;' /etc/nginx/mime.types
# Replace/add some images
COPY --chown=root:root overrides /usr/share/jitsi-meet/
RUN sed -i "s|\(// \)\?defaultLanguage:.*|defaultLanguage: 'nl',|" /defaults/config.js; \
sed -e 's/welcome-background.png/welcome-background.jpg/' \
-e 's|.deep-linking-mobile .header{width:100%;height:70px;background-color:#f1f2f5;text-align:center}|.deep-linking-mobile .header{width:100%;height:70px;background-color:#003867;text-align:center}|' \
-e 's|.deep-linking-mobile .header .logo{margin-top:15px;margin-left:auto;margin-right:auto;height:40px}|.deep-linking-mobile .header .logo{margin-top:10px;margin-left:auto;margin-right:auto;height:50px}|' \
-i /usr/share/jitsi-meet/css/all.css; \
sed -e 's|"headerTitle": "Jitsi Meet"|"headerTitle": "Mijn Organisatie"|' \
-e 's|"headerSubtitle": "Veilige vergaderingen van hoge kwaliteit"|"headerSubtitle": "Wij vergaderen online!"|' \
-i /usr/share/jitsi-meet/lang/main-nl.json; \
sed -e 's|C().createElement("h1",{className:"header-text-title"},t("welcomepage.headerTitle"))|C().createElement("h1",{className:"header-text-title"},C().createElement("img",{src:"images/logo-deep-linking.png",alt:"Mijn Organisatie",height:100}))|' \
-e 's|"headerTitle":"Jitsi Meet"|"headerTitle":"Mijn Organisatie"|' \
-e 's|"headerSubtitle":"Secure and high quality meetings"|"headerSubtitle":"Wij vergaderen online!"|' \
-i /usr/share/jitsi-meet/libs/app.bundle.min.js; \
sed -e "s|\bAPP_NAME: .*|APP_NAME: 'Mijn Organisatie Jitsi',|" \
-e "s|\bPROVIDER_NAME: .*|PROVIDER_NAME: 'Mijn Organisatie Cloud',|" \
-e "s|\bDEFAULT_REMOTE_DISPLAY_NAME: .*|DEFAULT_REMOTE_DISPLAY_NAME: 'Gespreksgenoot',|" \
-e "s|\bDEFAULT_LOCAL_DISPLAY_NAME: .*|DEFAULT_LOCAL_DISPLAY_NAME: 'Ik',|" \
-e "s|\bGENERATE_ROOMNAMES_ON_WELCOME_PAGE: .*|GENERATE_ROOMNAMES_ON_WELCOME_PAGE: false,|" \
-i /defaults/interface_config.js
</code></div>
<p>The first <tt>RUN</tt> instruction adds a line to the Nginx configuration which enables clients to download Wasm files.
Unfortunately this is not yet fixed in Jitsi docker itself (checked in December 2021, but YMMV).</p>
<p>The <tt>COPY</tt> instruction copies your images over the existing stuff. Feel free to add more as needed.</p>
<p>The second <tt>RUN</tt> instruction is where the magic happens. This changes existing files. Let's go through them one by one.</p>
<p>The first file that gets changed is <tt>/defaults/config.js</tt>, where we
set the default language to Dutch.
</p>
<p>The next file that gets changed is <tt>/usr/share/jitsi-meet/css/all.css</tt>.
Normally Jitsi uses a <tt>png</tt> background image on the welcome page but I needed to use a <tt>jpg</tt>
image. The first line takes care of that.
Note that there is no <tt>welcome-background.jpg</tt> image in the base image, but I added it in the
<tt>overrides/images</tt> directory.<br>
The next 2 changes for this file are some small color and layout changes to the welcome page for mobile browsers.
</p>
<p>The next file that gets changed is <tt>/usr/share/jitsi-meet/lang/main-nl.json</tt>. There are many more files
in this directory, one for each language.
</p>
<p>The next one, file <tt>/usr/share/jitsi-meet/libs/app.bundle.min.js</tt> is tricky. This file contains
a fully compiled React application in minified Javascript. The first change you see here replaces the header text with
a header image. The next two lines replace the default titles with the Dutch version. For some reason
Jitsi initially renders the page in English and then re-renders it in the correct locale. On slow devices
this can take quite some time. I found this quite disturbing, especially for the texts that make your first
impression. By changing some default texts, most of my users (who are Dutch) will see fewer flapping texts.<br>
This is the file that is most sensitive to changes in the base image. Make sure your tweaks still work after an
upgrade.
</p>
<p>Finally, in <tt>/defaults/interface_config.js</tt> some more settings are tweaked.</p>
<b>Some more tips</b>
<p>Don't worry if you break something. Just fix your changes and re-deploy. A re-deploy is very quick.</p>
<p>Finding out what to change can be pretty hard. Sometimes it helps to extract the file
from the image to see what it contains. First find the file you want to change by opening a
shell in the base image:</p>
<div class="codeblock"><code class="shell">docker-compose exec web bash</code></div>
<p>Extract the file for more detailed inspection with something like this:</p>
<div class="codeblock"><code class="shell">docker-compose exec web cat /usr/share/jitsi-meet/libs/app.bundle.min.js > app.bundle.min.js</code>
</div>
<p><b>Jitsi updates</b></p>
<p>When you see new images appear at <a href="https://hub.docker.com/u/jitsi">Jitsi on docker hub</a>
you can deploy them as follows:</p>
<div class="codeblock"><code class="shell"># Pulls the images that we're not changing (e.g. prosody, jicofo and jvb):
docker-compose pull
# Rebuild the 'web' image, checking for a new base image:
docker-compose build --pull
# Deploy changes:
docker-compose up -d
# Remove old images:
docker image prune
</code></div>
<p>Most of the things that were tweaked here were pretty stable over the last years. But I advise you to check anyway.</p>
<p>That's it, go creative!</p>
<h2>Akka graceful shutdown - continued (2021-12-08)</h2>
<p><a href="https://day-to-day-stuff.blogspot.com/2020/04/akka-http-graceful-shutdown.html">Some time ago</a> I wrote on how to gracefully shut down Akka HTTP servers, crucial to prevent aborted requests during re-deployments or in elastic (cloud) environments where instances come and go. Look at the <a href="https://day-to-day-stuff.blogspot.com/2020/04/akka-http-graceful-shutdown.html">previous post</a> for more details on how graceful shutdown works and some common caveats in setting it up.</p>
<p>This post refreshes how this works for newer Akka versions, and it gives some tips on how to speed up a shutdown.</p>
<b>Coordinated shutdown</b>
<p>Newer Akka versions have more extensive support for graceful shutdown in the form of <a href="https://doc.akka.io/docs/akka/current/coordinated-shutdown.html">coordinated shutdown</a>. Here we show an example that uses coordinated shutdown to configure a graceful shutdown.</p>
<div class="codeblock"><code class="scala">
import akka.actor.{ActorSystem, CoordinatedShutdown}
import akka.http.scaladsl.Http
import scala.concurrent.duration._
import scala.util.{Failure, Success}
implicit val system: ActorSystem = ???
val logger = ???
val routes: Route = ???
val interface: String = "0.0.0.0"
val port: Int = 80
val shutdownDeadline: FiniteDuration = 5.seconds
Http()
.newServerAt(interface, port)
.bind(routes)
.map(_.addToCoordinatedShutdown(httpShutdownTimeout)) // â that simple!
.foreach { server =>
logger.info(s"HTTP service listening on: ${server.localAddress}")
server.whenTerminationSignalIssued.onComplete { _ =>
logger.info("Shutdown of HTTP service initiated")
}
server.whenTerminated.onComplete {
case Success(_) => logger.info("Shutdown of HTTP endpoint completed")
case Failure(_) => logger.error("Shutdown of HTTP endpoint failed")
}
}
</code></div>
<p>The important line is where we use <tt>addToCoordinatedShutdown</tt>.
What follows is just logging so we know what's going on.</p>
<b>Shutting down more components</b>
<p>You probably have more parts that would benefit from a proper shutdown, e.g. a database connection pool.
Here is an example on how to hook into the coordinated shutdown:
</p>
<div class="codeblock"><code class="scala">// Add this code _before_ construction of the HTTP server
CoordinatedShutdown(system).addTask(
CoordinatedShutdown.PhaseBeforeClusterShutdown,
"database connection pool shutdown"
) { () =>
val dbPoolShutdown: Future[Done] = shutdownDatabase()
dbPoolShutdown.onComplete {
case Success(_) =>
logger.info("Shutdown of database connection pool completed")
case Failure(_) =>
logger.error("Shutdown of database connection pool failed")
}
logger.info("Shutdown of database connection pool was initiated")
dbPoolShutdown
}
</code></div>
<p>Coordinated shutdown consists of multiple phases. Which phases exist is configurable but the default phases are fine for this post.</p>
<p>Our code runs in the phase called <tt>before-cluster-shutdown</tt>.
This phase runs after phase <tt>service-unbind</tt> in which the HTTP service shuts down.</p>
<p><b>Tips to speed up shutdown</b></p>
<p>The default phases need to complete in 10 seconds. If this is challenging for your system, here are 2 tips that might help.</p>
<p>First of all, you need to make sure that any blocking/synchronous code is wrapped in a <tt>blocking</tt>
construct. This will signal to the Akka execution context that it needs to extend its threadpool. This
is especially relevant if you have many shutdown tasks but it is good practice anyway.
For example:</p>
<div class="codeblock"><code class="scala">import akka.Done
import scala.concurrent.blocking
def shutdownDatabase: Future[Done] = {
Future {
blocking {
database.close() // blocking code
logger.info("Database connection closed")
}
Done
}
}
</code></div>
<p>The second thing you could do is only shut down on a best-effort basis. Closing the connection to a
read-only system is probably not essential.<br>
The trick is to use coordinated shutdown as the initiator, but then immediately report completion. For example:</p>
<div class="codeblock"><code class="scala">import akka.Done
import scala.concurrent.blocking
def bestEffortShutdownDatabase: Future[Done] = {
// Starts db close in a Future, but ignore the result
Future {
blocking {
database.close()
}
}
Future.successful(Done)
}
</code></div>
<p>Now, when closing the database takes too long, Akka won't complain about this in the logs.</p>
<p>For more details see Akka's <a href="https://doc.akka.io/docs/akka/current/coordinated-shutdown.html">coordinated shutdown</a> documentation.</p>
<h2>Avoiding Scala's Option.fold (2020-07-07)</h2>
<p>Scala's <tt>Option.fold</tt> is a bit nasty to use sometimes because type inference is not always great. Here is a stupid toy example:</p>
<div class="codeblock"><code class="scala">val someValue: Option[String] = ???
val processed = someValue.fold[Either[Error, UserName]](
  Left(Error("No user name given"))
)(
  value => Right(UserName(value))
)
</code></div>
<p>The <tt>[Either[Error, UserName]]</tt> on <tt>fold</tt> is necessary otherwise the scala compiler can not derive the type.</p>
<p>Here is a really small trick to avoid <tt>Option.fold</tt> when you need to convert an <tt>Option</tt> to an <tt>Either</tt>:</p>
<div class="codeblock"><code class="scala">val someValue: Option[String] = ???
val processed = someValue
  .toRight(left = Error("No user name given"))
  .map(value => UserName(value))
</code></div>
<p>Much nicer!</p>
<h2>Traefik v2 enable HSTS, Docker and nextcloud (2020-04-15)</h2>
<p>It took me days to figure out how to configure Traefik v2. Here it is for posterity.</p>
<p>This is a docker-compose.yaml fragment to append to a service section:</p>
<div class="codeblock"><code class="yaml"> labels:
- "traefik.enable=true"
- "traefik.http.routers.service.rule=Host(`www.example.com`)"
- "traefik.http.routers.service.entrypoints=websecure"
- "traefik.http.routers.service.tls.certresolver=myresolver"
- "traefik.http.middlewares.servicests.headers.stsincludesubdomains=false"
- "traefik.http.middlewares.servicests.headers.stspreload=true"
- "traefik.http.middlewares.servicests.headers.stsseconds=31536000"
- "traefik.http.middlewares.servicests.headers.isdevelopment=false"
- "traefik.http.routers.service.middlewares=servicests"
</code></div>
<p>It will:
<ul>
<li>tell Traefik to direct traffic for <tt>www.example.com</tt> to this container,</li>
<li>on the <tt>websecure</tt> entrypoint (this is configured statically),</li>
<li>using the <tt>myresolver</tt> (for Acme, resolver also configured statically),</li>
<li>configure middleware to add HSTS headers,</li>
<li>enable the middleware.</li>
</ul>
</p>
<b>Nextcloud</b>
<p>Here is a slightly more complex example for a nextcloud deployment which includes the recommended redirects.</p>
<div class="codeblock"><code class="yaml"> labels:
- "traefik.enable=true"
- "traefik.http.routers.nextcloud.rule=Host(`nextcloud.example.com`)"
- "traefik.http.routers.nextcloud.entrypoints=websecure"
- "traefik.http.routers.nextcloud.tls.certresolver=myresolver"
- "traefik.http.middlewares.nextcloudredir.redirectregex.permanent=true"
- "traefik.http.middlewares.nextcloudredir.redirectregex.regex=https://(.*)/.well-known/(card|cal)dav"
- "traefik.http.middlewares.nextcloudredir.redirectregex.replacement=https://$$1/remote.php/dav/"
- "traefik.http.middlewares.nextcloudsts.headers.stsincludesubdomains=false"
- "traefik.http.middlewares.nextcloudsts.headers.stspreload=true"
- "traefik.http.middlewares.nextcloudsts.headers.stsseconds=31536000"
- "traefik.http.middlewares.nextcloudsts.headers.isdevelopment=false"
- "traefik.http.routers.nextcloud.middlewares=nextcloudredir,nextcloudsts"
</code></div>Erik van Oostenhttp://www.blogger.com/profile/15976519439979651010noreply@blogger.com4tag:blogger.com,1999:blog-27876765.post-91682011202510159502020-04-10T17:03:00.001+02:002021-12-08T14:40:04.522+01:00Akka-http graceful shutdown<b>Why?</b>
<p>By default, when you restart a service, the old instance is simply killed. This means that all current requests are aborted; the caller will be left with a read timeout. We can do better!</p>
<b>What?</b>
<p>A graceful shutdown looks as follows:</p>
<ol>
<li>The scheduler (Kubernetes, Nomad, etc.) sends a signal (usually SIGINT) to the service.</li>
<li>The service gets the signal and closes all server ports; it can no longer receive new requests. This is picked up very quickly by the load-balancer, which will stop sending new requests.</li>
<li>All requests-in-progress complete one by one.</li>
<li>When all requests are completed, or on a timeout, the service terminates.</li>
</ol>
<b>Caveats</b>
<p>Getting the signal to your service is unfortunately not always trivial. I have seen the following problems:</p>
<ul>
<li>The Nomad scheduler by default does not send a SIGINT signal to the service. You will have to configure this.</li>
<li>When the service runs in a Docker container, by default the init process (with PID 1) will ignore the signal. Back when every Unix installation had control over the entire computer this made lots of sense; in a container, not so much. This may be fixed in newer Docker versions. Otherwise you will have to use a special init process such as <a href="https://github.com/krallin/tini">tini</a>.</li>
</ul>
<b>Akka-HTTP</b>
<p>Akka-http has excellent support for graceful shutdown. Unfortunately, the documentation is not very clear about it. Here is an example that can be used as a template:</p>
<p><b>Update 2021-12-08:</b> For newer Akka versions, please use the template in the <a href="https://day-to-day-stuff.blogspot.com/2021/12/akka-graceful-shutdown-continued.html">follow-up article</a>.</p>
<p>Just for reference, here is the old template:</p>
<div class="codeblock"><code class="scala">
import akka.actor.ActorSystem
import akka.http.scaladsl.Http
import akka.http.scaladsl.server._
import akka.stream.ActorMaterializer
import scala.concurrent.duration._
import scala.util.Failure
// The actor system, materializer (needed on older Akka versions)
// and dispatcher that the template below relies on.
implicit val system: ActorSystem = ActorSystem("service")
implicit val materializer: ActorMaterializer = ActorMaterializer()
import system.dispatcher
val logger = ???
val route: Route = ???
val interface: String = "0.0.0.0"
val port: Int = 80
val shutdownDeadline: FiniteDuration = 30.seconds
// Don't use this, see follow-up article instead!
Http()
.bindAndHandle(route, interface, port)
.map { binding =>
logger.info(
"HTTP service listening on: " +
s"http://${binding.localAddress.getHostName}:${binding.localAddress.getPort}/"
)
sys.addShutdownHook {
binding
.terminate(hardDeadline = shutdownDeadline)
.onComplete { _ =>
system.terminate()
logger.info("Termination completed")
}
logger.info("Received termination signal")
}
}
.onComplete {
case Failure(ex) =>
logger.error("server binding error:", ex)
system.terminate()
sys.exit(1)
case _ =>
}
</code></div>
Erik van Oostenhttp://www.blogger.com/profile/15976519439979651010noreply@blogger.com0tag:blogger.com,1999:blog-27876765.post-84202442427967265842020-03-03T10:56:00.000+01:002020-03-05T15:06:03.486+01:00Push Gauges<p>A colleague was complaining to me that <a href="https://micrometer.io/">Micrometer</a> gauges didn't work the way he expected. This led to some interesting work.</p>
<b>What is a gauge?</b>
<p>In science a gauge is a device for making measurements. In computer systems a gauge is very similar: a 'metric' which tracks something in your system over time. For example, you could track the number of items in a job queue. Libraries like Micrometer and <a href="https://metrics.dropwizard.io/">Dropwizard metrics</a> make it easy to define gauges. Since the measurement in itself is not useful, those libraries also make it easy to send the measurements to a metric system such as <a href="https://github.com/graphite-project/graphite-web">Graphite</a> or <a href="https://prometheus.io/">Prometheus</a>. These systems are used for visualization and generating alerts.</p>
<p>Gauges are typically defined with a callback function that does the measurement. For example, using <a href="https://github.com/erikvanoosten/metrics-scala">metrics-scala</a>, the Scala API for Dropwizard metrics, it looks like:</p>
<div class="codeblock"><code class="scala">class JobQueue extends DefaultInstrumented {
private val waitingJobsQueue = ???
// Defines a gauge
metrics.gauge("queue.size") {
// This code block is the callback which does a 'measurement'.
waitingJobsQueue.size
}
}</code></div>
<p>Please note that the metric library determines when the callback function is invoked. For example, once every minute.</p>
<b>What is a push gauge?</b>
<p>My colleague had something else in mind. He didn't have access to the value all the time, but only when something was being processed. More like this:</p>
<div class="codeblock"><code class="scala">class ExternalCacheUpdater extends DefaultInstrumented {
def updateExternalCache(): Unit = {
val items = fetchItemsFromDatabase()
pushItemsToExternalCache(items)
gauge.push(items.size) // Pushes a new measurement to the gauge.
}
}</code></div>
<p>In the example the application becomes responsible for pushing new measurements. The push gauge simply keeps track of the last value and reports that whenever the metrics library needs it. So under the covers the push gauge behaves like a normal gauge.</p>
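<p>To make this concrete, here is a minimal sketch of the idea (hypothetical code, not the actual metrics-scala implementation): the application pushes measurements, and the metrics library polls the gauge like any other.</p>
<div class="codeblock"><code class="scala">import java.util.concurrent.atomic.AtomicReference

// Sketch only: a push gauge simply remembers the last pushed value.
class PushGauge[A](initialValue: A) {
  private val last = new AtomicReference[A](initialValue)

  // Called by the application whenever a new measurement is available.
  def push(value: A): Unit = last.set(value)

  // Called by the metrics library, for example once every minute.
  def getValue: A = last.get()
}</code></div>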
<p>Push gauges like in this example are now made possible in this <a href="https://github.com/erikvanoosten/metrics-scala/pull/233">pull-request for metrics-scala</a>. The only thing that was still missing was the definition of the push gauge:</p>
<div class="codeblock"><code class="scala">class ExternalCacheUpdater extends DefaultInstrumented {
// Defines a push gauge
private val gauge = metrics.pushGauge[Int]("cached.items", 0)
def updateExternalCache(): Unit = // as above
}
</code></div>
<b>Push gauge with timeout</b>
<p>In some situations it may be misleading to report a very old measurement as the 'current' value. If the external cache in our example evicts items after 10 minutes, then the push gauge should not report measurements from more than 10 minutes ago. This is solved with a push gauge with timeout:</p>
<div class="codeblock"><code class="scala">class ExternalCache extends DefaultInstrumented {
// Defines a push gauge with timeout
private val gauge = metrics.pushGaugeWithTimeout[Int]("cached.items", 0, 10.minutes)
def updateExternalCache(): Unit = // as above
}
</code></div>
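<p>As a sketch of how the timeout could work (again hypothetical code, not the actual implementation): remember when the last value was pushed and fall back to the default when that is too long ago.</p>
<div class="codeblock"><code class="scala">import java.util.concurrent.atomic.AtomicReference
import scala.concurrent.duration._

// Sketch only: report the default value when the last push is too old.
class PushGaugeWithTimeout[A](default: A, timeout: FiniteDuration) {
  private val last = new AtomicReference[Option[(A, Long)]](None)

  def push(value: A): Unit = last.set(Some((value, System.nanoTime())))

  def getValue: A = last.get() match {
    case Some((value, pushedAt)) if System.nanoTime() - pushedAt <= timeout.toNanos => value
    case _ => default
  }
}</code></div>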
<b>Feedback wanted!</b>
<p>I have not seen this concept before in any metric library in the JVM ecosystem. Therefore I would like to collect as much feedback as possible before shipping this as a new feature of metrics-scala. If you have any ideas, comments or whatever, please leave a comment on <a href="https://github.com/erikvanoosten/metrics-scala/pull/233">the push-gauges pull-request</a> or drop me an email!</p>
<p><span style="font-weight:bold;">Update 2020-03-05</span>: The code examples have been updated to reflect changes in the pull request.</p>Erik van Oostenhttp://www.blogger.com/profile/15976519439979651010noreply@blogger.com0tag:blogger.com,1999:blog-27876765.post-15002828921243121292017-07-12T11:05:00.004+02:002017-07-12T11:05:59.571+02:00Continuation parsers and encoders<p>
I finally got around to writing about my hack project of last year. It was an exploration of what can be done with continuation parsers and encoders in order to build a very fast, single-copy, asynchronous Thrift implementation.
</p>
<p>
Continuation parsers and encoders try to decode (read)/encode (write) their data <i>directly</i> from/to a network buffer. When a buffer has been fully read/written, they ask for the next network buffer to continue.
</p>
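<p>As a sketch of the idea (assumed types, not the actual thrift-stream API): a parser either completes with a value, or suspends itself as a continuation that asks for the next buffer.</p>
<div class="codeblock"><code class="scala">import java.nio.ByteBuffer

// Sketch only: the outcome of parsing as much as the current buffer allows.
sealed trait ParseResult[+A]
// Parsing finished; `remaining` holds any unconsumed bytes.
final case class Done[A](value: A, remaining: ByteBuffer) extends ParseResult[A]
// The buffer was exhausted mid-value; call `continue` with the next buffer.
final case class NeedMoreData[A](continue: ByteBuffer => ParseResult[A]) extends ParseResult[A]

// Example: parse a single big-endian Int that may be split over buffers.
def parseInt(buffer: ByteBuffer, acc: Array[Byte] = Array.empty): ParseResult[Int] =
  if (buffer.remaining() >= 4 - acc.length) {
    val bytes = new Array[Byte](4 - acc.length)
    buffer.get(bytes)
    Done(ByteBuffer.wrap(acc ++ bytes).getInt, buffer)
  } else {
    val bytes = new Array[Byte](buffer.remaining())
    buffer.get(bytes)
    NeedMoreData(next => parseInt(next, acc ++ bytes))
  }</code></div>
<p>Note that no intermediate assembly buffer is needed: the parser consumes each network buffer in place, which is what makes the single-copy property possible.</p>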
<p>
For more information see the <a href="https://github.com/erikvanoosten/thrift-stream/blob/master/README.md">thrift-stream repository</a> on GitHub.
</p>
Erik van Oostenhttp://www.blogger.com/profile/15976519439979651010noreply@blogger.com0tag:blogger.com,1999:blog-27876765.post-31867658774683348532017-04-18T11:50:00.003+02:002017-04-21T10:16:56.518+02:00Bash history backup<p>I like my bash history, and I proudly have this in my <tt>.bash_profile</tt>:
<div class="codeblock"><code class="sh"># Increase bash history size, append instead of overwrite history on exit
export HISTSIZE=10000000
export HISTCONTROL=erasedups
shopt -s histappend</code></div>
</p>
<p>
However, after reading about <a href="https://undertitled.com/2017/04/12/historian-because-please-stop-deleting-my-bash-history.html">Historian</a> I realised I had no backup. Instead of installing Historian, I decided to take a simpler approach. Here it is:
<div class="codeblock"><code class="sh">#!/bin/bash
set -euo pipefail
IFS=$'\n\t'
#
# Makes a daily backup of your .bash_history, keeps the last 2 backups.
#
BACKUPDIR="${HOME}/.bash_history_backup"
mkdir -p "${BACKUPDIR}"
cp "${HOME}/.bash_history" "${BACKUPDIR}/bash_history_$(date +%Y%m%d)"
for h in $(ls -1r "${BACKUPDIR}"/bash_history_* | tail -n +3); do rm "$h"; done</code></div>
</p>
<p>
Put the above contents in a file somewhere, e.g. in <tt>~/bin/bash_history_backup.sh</tt> and activate it with (note: the link name must not contain a dot, otherwise <tt>run-parts</tt> will silently skip it):
<div class="codeblock"><code class="sh">sudo ln -s ~/bin/bash_history_backup.sh /etc/cron.daily/bash_history_backup</code></div>
or add the following line with the more cumbersome route through <tt>crontab -e</tt>:
<div class="codeblock"><code class="sh">0 0 * * * $HOME/bin/bash_history_backup.sh</code></div>
</p>
<p><b>Update 2017-04-22:</b> The script actually works now :)</p>
<p><b>Update 2017-04-23:</b> The script actually actually works now :)</p>Erik van Oostenhttp://www.blogger.com/profile/15976519439979651010noreply@blogger.com0tag:blogger.com,1999:blog-27876765.post-33247579709704803002016-09-13T10:51:00.000+02:002017-02-05T07:54:10.915+01:00Exploring Rkt on Ubuntu<p>I have been using docker on my home server since 0.4 in 2013. For me the most attractive property of docker is that it provides a way to decrease the amount of stuff one has to install on a server. I have only one server, but it has many different tasks of which some need to be rock solid (my family email) and others are experimental. Containers provide a nice way to clean up experiments. Unfortunately, docker has never been stable. I have had many fights with docker during upgrades and I have never fully understood how docker interacts with the iptables setup from my firewall (Shorewall).</p>
<p>Today, I was trying to run <a href="https://github.com/joeybaker/docker-syncthing">docker-syncthing</a>. Unfortunately, the container went into 'restarting' without any indication of what was wrong. Inspired by <a href="https://jaxenter.com/can-docker-be-ousted-128806.html">"Can docker be ousted"</a> and <a href="https://medium.com/@bob_48171/an-ode-to-boring-creating-open-and-stable-container-world-4a7a39971443#.ocm6bwyz2">boring and stable containers</a> I decided to give up and try something new: <a href="https://coreos.com/rkt/">CoreOS' rkt</a>.</p>
<h3>Installation</h3>
<p>The installation instructions are a bit hidden. <a href="https://coreos.com/rkt/docs/latest/distributions.html">This page</a> refers to the script <tt>install-rkt.sh</tt>. Unfortunately, no link was given. A search on github finally gave <a href="https://github.com/coreos/rkt/blob/master/scripts/install-rkt.sh">the answer</a>. However, in the end, I liked these instructions better: <a href="http://askubuntu.com/a/796148/201096">ask ubuntu: Is it possible to install rkt in Ubuntu?</a>.</p>
<p>Because I wanted to run Syncthing and I only had a Docker recipe, I also needed the tool <a href="https://github.com/appc/docker2aci">docker2aci</a>. The instructions there are clear except that you need to install golang first (docs are fixed now):</p>
<div class="codeblock"><code class="bash">$ sudo apt-get install golang
$ git clone git://github.com/appc/docker2aci
$ cd docker2aci
$ ./build.sh
</code></div>
<p>The <tt>docker2aci</tt> binary is located in the <tt>bin</tt> folder. Put it somewhere so you can find it back.</p>
<h3>Building and converting a docker image</h3>
<p>After building the docker image:</p>
<div class="codeblock"><code class="bash"># docker build -t syncthing .
</code></div>
<p>I had the following:</p>
<div class="codeblock"><code class="bash"># docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
syncthing latest 8ea0931f1196 29 hours ago 197.1 MB
</code></div>
<p>The next step is to <em>fetch</em> the image into the rkt image repository. The <tt>rkt fetch</tt> command can fetch an image directly from a docker repository. However, I found no way to fetch directly from the local docker image repository. This is the workaround:</p>
<div class="codeblock"><code class="bash"># docker save -o syncthing-docker-image.tar syncthing
# docker2aci syncthing-docker-image.tar
# rkt --insecure-options=image fetch syncthing-latest.aci
# rkt image list
ID NAME SIZE IMPORT TIME LAST USED
sha512-8161ad07a42e syncthing:latest 168MiB 7 hours ago 7 hours ago
# rkt image cat-manifest syncthing:latest | less
</code></div>
<p></p>
<h3>Running the image</h3>
<p>Now comes the time to run the image:</p>
<div class="codeblock"><code class="bash"># rkt --insecure-options=image run --net=host \
--dns=$(awk '/nameserver/ {print $2}' < /etc/resolv.conf) \
--volume=volume-srv-config,kind=host,source=/media/nas/syncthing/config,readOnly=false \
--volume=volume-srv-data,kind=host,source=/media/nas/syncthing/data,readOnly=false \
syncthing
</code></div>
<p>Creating that command actually took longer than I expected. Here are the highlights:<ul>
<li>Option <tt>--dns=...</tt> copies the nameserver from your local <tt>/etc/resolv.conf</tt> to the same file inside the container. This makes DNS work inside the container as well. Depending on your DNS setup, you may need to pass more <tt>--dns*</tt> options.</li>
<li>The <tt>--net=host</tt> gives you the easiest access to the network. If you want a bit more security, you will have to dive deeper.</li>
<li>I also tried to add the options <tt>--user=nobody --group=nogroup</tt>. However, that resulted in my container not starting up at all with weird error messages.</li></ul></p>
<p>You now have something like this:</p>
<div class="codeblock"><code class="bash"># rkt list
UUID APP IMAGE NAME STATE CREATED STARTED NETWORKS
17baa16c syncthing syncthing:latest running 1 minute ago 1 minute ago</code></div>
<p></p>
<h3>Inspecting the container</h3>
<p>Inspecting a running container is easy. With <tt>rkt enter</tt> you can directly open a shell in the container:</p>
<div class="codeblock"><code class="bash"># rkt enter 17baa16c
enter: no command specified, assuming "/bin/bash"
root@rkt-5e9ad759-82b4-4b27-b03a-b6b5074b2ac2:/#
</code></div>
<p></p>
<h3>Cleanup</h3>
<p>Every time you start a new container, the old one stays around. With <tt>rkt gc</tt> you clean up containers that stopped some time ago (more than half an hour?). This command should be run from cron:</p>
<div class="codeblock"><code class="bash"># echo -e '#!/bin/sh\nexec rkt gc' > /etc/cron.daily/rkt-gc
# chmod +x /etc/cron.daily/rkt-gc
</code></div>
<p></p>
<h3>Networking</h3>
<p>As a Shorewall user you need to add a rule to open the ports for the syncthing application in <tt>/etc/shorewall/rules</tt> and that's it. This might get a lot more hairy when you are not using host networking.</p>
<h3>Automating startup</h3>
<p>Modern Ubuntu releases use systemd to start applications. <a href="https://coreos.com/rkt/docs/latest/using-rkt-with-systemd.html">Rkt's systemd manual</a> seems well written, but can be used as an introduction at best. Its suggestion to use <tt>systemd-run</tt> failed for me:</p>
<div class="codeblock"><code class="bash"># systemd-run --slice=machine /usr/bin/rkt --insecure-options=image run --net=host --dns=192.168.1.1 --volume=volume-srv-config,kind=host,source=/media/nas/syncthing/config,readOnly=false --volume=volume-srv-data,kind=host,source=/media/nas/syncthing/data,readOnly=false syncthing
Failed to start transient service unit: Cannot set property ExecStart, or unknown property.
</code></div>
<p>Also, it fails to mention where to put the systemd unit file. After lots of reading (in particular the <a href="https://www.digitalocean.com/community/tutorials/how-to-configure-a-linux-service-to-start-automatically-after-a-crash-or-reboot-part-2-reference#systemd-introduction">systemd section of this manual</a>) I created <tt>/etc/systemd/system/rkt-syncthing.service</tt> with the following content:</p>
<div class="codeblock"><code class="bash">[Unit]
Description=Rkt syncthing
Requires=remote-fs.target
After=remote-fs.target
[Service]
Slice=machine.slice
ExecStart=/usr/bin/rkt --insecure-options=image run --net=host --dns=192.168.1.1 --volume=volume-srv-config,kind=host,source=/media/nas/syncthing/config,readOnly=false --volume=volume-srv-data,kind=host,source=/media/nas/syncthing/data,readOnly=false syncthing
KillMode=mixed
Restart=always
[Install]
WantedBy=multi-user.target
</code></div>
<p>And finally:</p>
<div class="codeblock"><code class="bash"># systemctl daemon-reload
# systemctl enable rkt-syncthing.service
# systemctl start rkt-syncthing.service
</code></div>
<p></p>
<h3>Ending thoughts</h3>
<p>Although I got far in a few hours, there are still a few open problems.</p>
<ol>
<li>None of the above software was available as an Ubuntu package. This means I will spend more time keeping everything up to date.</li>
<li>Rkt's documentation is okay but not amazing. For example, hyperlinks are missing in crucial places (see above), and the manual page for <tt>rkt-export</tt> does not tell you how to indicate which container you want to export.</li>
<li>Rkt containers run from an image and store changes as a layer on top. As soon as the container exits, you cannot start it again with those changes. This means that everything that needs to be persisted must be external to the container. For other changes you can run <tt>rkt export</tt> to create a new image, or rebuild your image from scratch.</li>
<li>Having to use Docker to build an image for rkt is a bit weird. The next step is to create the image directly with <a href="https://github.com/appc/acbuild">acbuild</a>.</li>
</ol>
<h3>Updates</h3>
<p><b>2017-02-05</b> Fixed commands for enabling rkt garbage collection via cron.</p>Erik van Oostenhttp://www.blogger.com/profile/15976519439979651010noreply@blogger.com0tag:blogger.com,1999:blog-27876765.post-44562631052606259502016-03-03T21:19:00.000+01:002016-03-03T21:19:32.696+01:00Don't call your state 'state'<p>In the OO world you are frowned upon if you call something <i>object</i>. It's time to extend this principle: don't call your state <i>state</i>. This post is about why this is a bad idea and what you can do about it.</p>
<p>Recently we sat down to discuss the new data model for our messaging system at <a href="http://www.ebayclassifiedsgroup.com/">eBay's Classifieds Group</a>. One of the things we inherited from the past is the entity called <tt>conversation</tt> with a field called <tt>state</tt>. Possible values were <tt>Ok</tt>, <tt>On hold</tt> and <tt>Blocked</tt>.</p>
<p><b>So what was the problem?</b><br/>A field called 'state' almost always has a very intuitive meaning. Unfortunately, the word is so vague that the meaning can easily warp, depending on the problem at hand. I noticed this in a couple of projects: the state field started to collect more and more possible values. With more values came increasingly difficult state transitions. This led to code that was far messier than necessary.</p>
<p>For example, in our conversation entity we could introduce the state <tt>Closed</tt> to indicate that a participant wants to stop the conversation. Then we continue by adding the state <tt>Archived</tt> to indicate that the conversation should be hidden until a new message arrives.</p>
<p><b>What can we do?</b><br/>The key observation is that each state value represents multiple behaviors. Think about it: what behavior is needed in each state? How do these behaviors change for each state? These questions will lead you to <i>multiple fields</i> that together represent the entire state of your entities.</p>
<p>Within a couple of minutes we found three behaviors we wanted to have for our conversations: a conversation is either visible or not (field <tt>visibility</tt> with values <tt>Displayed</tt> and <tt>Hidden</tt>), it will accept new messages or not (field <tt>acceptNew</tt> with values <tt>Accept</tt> and <tt>Reject</tt>), and we want to notify the recipient of a new message or not (field <tt>notifyOnNew</tt> with values <tt>Notify</tt> and <tt>Mute</tt>). Not only did our code become easier to extend and reason about, as a bonus we found a feature that would have been really hard with the old model: muting a conversation. A sketch of this model follows below.</p>
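<p>As a quick sketch in Scala (invented types, following the field names above), the conversation's state then becomes a product of small, explicit behaviors:</p>
<div class="codeblock"><code class="scala">sealed trait Visibility
case object Displayed extends Visibility
case object Hidden extends Visibility

sealed trait AcceptNew
case object Accept extends AcceptNew
case object Reject extends AcceptNew

sealed trait NotifyOnNew
case object Notify extends NotifyOnNew
case object Mute extends NotifyOnNew

case class Conversation(
  visibility: Visibility,
  acceptNew: AcceptNew,
  notifyOnNew: NotifyOnNew
)

// The bonus feature is now a one-field change:
def mute(conversation: Conversation): Conversation =
  conversation.copy(notifyOnNew = Mute)</code></div>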
<p><b>Conclusion</b><br/>Don't call your state 'state'; instead, think about the behaviors each state represents and model those.</p>Erik van Oostenhttp://www.blogger.com/profile/15976519439979651010noreply@blogger.com2tag:blogger.com,1999:blog-27876765.post-13444393872349925332016-02-14T14:08:00.001+01:002016-02-14T14:09:12.806+01:00Generate a certificate signing request (CSR) as a 1 liner<p>As I run my own (secure) web and mail server I frequently have to get certificates. You could run with a self-signed certificate, but that is not ideal; desktop browsers are very noisy about them, and mobile browsers are outright hostile. Installing certificates on a mobile phone is no fun (though, hats off to <a href="https://play.google.com/store/apps/details?id=at.bitfire.cadroid">CAdroid</a>).</p>
<p>Now that <a href="https://letsencrypt.org/">Letsencrypt</a> is live, I wanted to try it out. However, Letsencrypt does not make it easy for you to keep using the same private key. This is necessary as Android's HTTP client (the one I needed, anyway) pins your certificates. Luckily Letsencrypt allows you to use a CSR that is generated outside of their tools.</p>
<p>Letsencrypt certificates are only valid for a short time (90 days), so automation is key. Unfortunately, openssl does not make it easy to fully automate creating CSRs, especially when you need 'Alternative Names', a requirement from Letsencrypt.</p>
<p>Luckily I found a solution from <a href="http://blog.endpoint.com/2014/10/openssl-csr-with-alternative-names-one.html?showComment=1442955981110#c4884976064836690024">Andrew Leahy</a>. Here it is:</p>
<div class="codeblock"><code class="sh">openssl req -new -nodes -sha256 -key private-key.pem \
-subj "/C=US/ST=CA/O=Acme, Inc./OU=Acme Example/CN=example.com" \
-reqexts SAN \
-config <(cat /etc/ssl/openssl.cnf <(printf "[SAN]\nsubjectAltName=DNS:example.com,DNS:www.example.com")) \
-out domain.csr</code></div>
<p>This will generate a CSR in file <tt>domain.csr</tt> with the country (C), state (ST), organization (O), organizational unit (OU) and, most importantly, the common name (CN) set, plus two alternative names <tt>example.com</tt> and <tt>www.example.com</tt>. Thanks Andrew!</p>Erik van Oostenhttp://www.blogger.com/profile/15976519439979651010noreply@blogger.com0tag:blogger.com,1999:blog-27876765.post-47823168446603993522015-05-20T21:35:00.000+02:002015-05-20T21:35:00.988+02:005 problems in an hour<p><a href="https://twitter.com/svpino">@svpino</a> posted a nice <a href="https://blog.svpino.com/2015/05/07/five-programming-problems-every-software-engineer-should-be-able-to-solve-in-less-than-1-hour">challenge</a>: solve 5 problems within an hour or renounce your title of software engineer.</p>
<p>It took me 40 minutes, of which 10 were spent fighting a limitation of the Scala REPL.</p>
<p>Here are my solutions. They can all be pasted as is in the Scala REPL.</p>
<div class="codeblock"><code class="scala">
/////////////////////////////////////////
// Problem 1 (Yuck! This is not a Scala problem!)
def p1_for(xs: Seq[Int]): Int = { var s = 0; for (x <- xs) s += x; s }
def p1_while(xs: Seq[Int]): Int = { var s = 0; var i = 0; while (i < xs.length) {s += xs(i); i+=1}; s }
def p1_rec(xs: Seq[Int]): Int = if (xs.isEmpty) 0 else xs.head + p1_rec(xs.tail)
def p1_proper(xs: Seq[Int]): Int = xs.foldLeft(0)(_+_) // :)
scala> p1_for(Seq(1,2,3))
res0: Int = 6
scala> p1_while(Seq(1,2,3))
res1: Int = 6
scala> p1_rec(Seq(1,2,3))
res2: Int = 6
/////////////////////////////////////////
// Problem 2
def p2[A,B](a: Seq[A], b: Seq[B]): Seq[Any] = a.zip(b).flatMap { case (a,b) => Seq(a,b) }
scala> p2(Seq("a","b","c"), Seq(1,2,3))
res3: Seq[Any] = List(a, 1, b, 2, c, 3)
/////////////////////////////////////////
// Problem 3
// BigInt instead of Int: the larger Fibonacci numbers below do not fit in an Int.
def fibonacci: Iterator[BigInt] = Iterator.iterate((BigInt(0), BigInt(1))) { case (a, b) => (b, a + b) }.map(_._1)
scala> fibonacci.take(100).mkString(", ")
res79: String = 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181, 6765, 10946, 17711, 28657, 46368, 75025, 121393, 196418, 317811, 514229, 832040, 1346269, 2178309, 3524578, 5702887, 9227465, 14930352, 24157817, 39088169, 63245986, 102334155, 165580141, 267914296, 433494437, 701408733, 1134903170, 1836311903, 2971215073, 4807526976, 7778742049, 12586269025, 20365011074, 32951280099, 53316291173, 86267571272, 139583862445, 225851433717, 365435296162, 591286729879, 956722026041, 1548008755920, 2504730781961, 4052739537881, 6557470319842, 10610209857723, 17167680177565, 27777890035288, 44945570212853, 72723460248141, 117669030460994, 190392490709135, 308061521170129, 498454011879264, 806515533049393, 1304969544928657, 2111485077978050, 3416454622906707, 55...
/////////////////////////////////////////
// Problem 4
def p4(n: Seq[Int]): Int = n.map(_.toString).sorted.reverse.mkString("").toInt
scala> p4(Seq(50,2,1,9))
res73: Int = 95021
/////////////////////////////////////////
// Problem 5
sealed trait Expr {
def evaluate: Int
def show: String
}
case class Plus(e1: Expr, e2: Expr) extends Expr {
def evaluate: Int = e1.evaluate + e2.evaluate
def show: String = e1.show + " + " + e2.show
}
case class Min(e1: Expr, e2: Expr) extends Expr {
def evaluate: Int = e1.evaluate - e2.evaluate
def show: String = e1.show + " - " + e2.show
}
case class Literal(v: Int) extends Expr {
def evaluate: Int = v
def show: String = v.toString
}
def appendToLastLiteral(e: Expr, append: Int): Expr = e match {
case Plus(e1, e2) => Plus(e1, appendToLastLiteral(e2, append))
case Min(e1, e2) => Min(e1, appendToLastLiteral(e2, append))
case Literal(v) => Literal((v.toString + append.toString).toInt)
}
def loop(e: Expr, todo: Seq[Int]): Unit = {
if (todo.isEmpty) {
if (e.evaluate == 100) println(e.show)
} else {
loop(Plus(e, Literal(todo.head)), todo.tail)
loop(Min(e, Literal(todo.head)), todo.tail)
loop(appendToLastLiteral(e, todo.head), todo.tail)
}
}
scala> loop(Literal(1), Seq(2, 3, 4, 5, 6, 7, 8, 9))
1 + 2 + 3 - 4 + 5 + 6 + 78 + 9
1 + 2 + 34 - 5 + 67 - 8 + 9
1 + 23 - 4 + 5 + 6 + 78 - 9
1 + 23 - 4 + 56 + 7 + 8 + 9
12 + 3 + 4 + 5 - 6 - 7 + 89
12 + 3 - 4 + 5 + 67 + 8 + 9
12 - 3 - 4 + 5 - 6 + 7 + 89
123 + 4 - 5 + 67 - 89
123 + 45 - 67 + 8 - 9
123 - 4 - 5 - 6 - 7 + 8 - 9
123 - 45 - 67 + 89
</code></div>
<p>Thanks @svpino, this was fun!</p>Erik van Oostenhttp://www.blogger.com/profile/15976519439979651010noreply@blogger.com0tag:blogger.com,1999:blog-27876765.post-62891582537401050542013-11-01T10:29:00.001+01:002013-11-01T10:30:29.021+01:00Installing gnutar on Maverick<p>Unfortunately Apple decided to remove <tt>/usr/bin/gnutar</tt> from Maverick (Mac OS X 10.9). This is a pain because most of the tarring I do on my Mac is to transfer files to a GNU-based Linux (e.g. Debian/Ubuntu). Apple's bsd-tar is not compatible with gnu-tar.</p>
<p>This is my solution:
<div class="codeblock"><code>brew install gnu-tar
cd /usr/bin
sudo ln -s /usr/local/opt/gnu-tar/libexec/gnubin/tar gnutar</code></div>
</p>Erik van Oostenhttp://www.blogger.com/profile/15976519439979651010noreply@blogger.com3tag:blogger.com,1999:blog-27876765.post-88628331730541968932013-09-18T21:18:00.001+02:002013-09-18T21:18:36.152+02:00Configuring Postfix/Dovecot for Microsoft Windows Live Mail<p>Personal mail gets no love from Microsoft. In the last 10 years I have not seen their product change much. Notably, I see name changes (always a bad sign) and some visual changes. The actual implementation is still the same: it does not respect standards.
I run a Postfix/Dovecot installation for my family's mail. I have had many different email clients connect to it without major problems. With Microsoft Windows Live Mail, Outlook Express, or whatever it is called today, it just doesn't work. Anyway, here is what you can do:</p>
<p>(I am assuming you are using something like Ubuntu with Postfix for SMTP with TLS (actually STARTTLS) on port 25, and Dovecot with IMAPS on port 993.)</p>
<p>Open the file <tt>/etc/dovecot/dovecot.conf</tt>, and update the line with <tt>auth_mechanisms</tt> to the following. The trick
is that <tt>login</tt> has to come first:</p>
<div class="codeblock"><code>auth_mechanisms = login plain</code></div>
<p>Repeat this trick for Postfix in <tt>/etc/postfix/sasl/smtp.conf</tt>:</p>
<div class="codeblock"><code>mech_list: login plain</code></div>
<p>Restart Postfix and Dovecot, and you're good to go!</p>
Erik van Oostenhttp://www.blogger.com/profile/15976519439979651010noreply@blogger.com0