Day to day stuff: 2022

Sunday, December 4, 2022

ZIO service layer pattern

While reading about ZIO-config in 2.0.4, the following pattern to create services caught my eye. I am copying it here for easy lookup. Enjoy.

val myLayer: ZLayer[PaymentRepo, Config.Error, MyService] =
  ZLayer.scoped {
    for {
      repo   <- ZIO.service[PaymentRepo]
      config <- ZIO.config(MyServiceImpl.config)
      ref    <- Ref.make(MyState.Initial)
      impl   <- ZIO.succeed(new MyServiceImpl(config, ref, repo))
      _      <- impl.initialize
      _      <- ZIO.addFinalizer(impl.destroy)
    } yield impl
  }

Saturday, November 5, 2022

Speed up ZIOs with memoization

TLDR: You can do ZIO memoization in just a few lines. For more complex use cases, use zio-cache.

Recently I was working on fetching Avro schemas from a schema registry. Avro schemas are immutable and therefore perfectly cacheable. Also, the number of possible schemas is limited, so cache evictions are not needed. We can simply cache every schema forever in a plain hash map. In other words, we are doing memoization.

Since this was the first time I did this in a ZIO based application, I looked around for existing solutions. What I wanted is something like this:

def fetchSchema(schemaId: String): Task[Schema] = {
  val fetchFromRegistry: Task[Schema] = ???
  fetchFromRegistry.memoizeBy(key = schemaId)
}

Frankly, I was a bit disappointed that ZIO does not already support this out of the box. However, as you'll see in this article, the proposed syntax only works for simple use cases. (Actually, there is ZIO.memoize but that is even simpler and only caches the result for a single ZIO instance, not for any instance that gives the same value.)

Let's continue anyway and implement it ourselves.

The idea is that method memoizeBy first looks in a map using the given key. If the value is not present, we get the result from the original ZIO and store it in the map. If the value is present, it is used and the original ZIO is not executed.

A map, yes, we also need to give the method a map! The map might be read and updated concurrently. I chose to wrap an immutable map in a Ref, but you could also use a ConcurrentMap.

Here we go:

import zio._
import scala.collection.immutable.Map

implicit class ZioMemoizeBy[R, E, A](zio: ZIO[R, E, A]) {
  def memoizeBy[B](cacheRef: Ref[Map[B, A]], key: B): ZIO[R, E, A] = {
    for {
      cache <- cacheRef.get
      value <- cache.get(key) match {
        case Some(value) => ZIO.succeed(value)
        case None => zio.tap(value => cacheRef.update(_.updated(key, value)))
      }
    } yield value
  }
}

That is it, just a few lines of code to put in some corner of your application.
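To see the effect, here is a minimal sketch that runs the same effect twice with the same key. The counter, the object name MemoizeByDemo and the key "schema-1" are made up for illustration; the extension method is the one from above, repeated so the example is self-contained.

```scala
import zio._

object MemoizeByDemo extends ZIOAppDefault {

  // Same extension method as above, written with flatMap instead of a for comprehension.
  implicit class ZioMemoizeBy[R, E, A](zio: ZIO[R, E, A]) {
    def memoizeBy[B](cacheRef: Ref[Map[B, A]], key: B): ZIO[R, E, A] =
      cacheRef.get.flatMap { cache =>
        cache.get(key) match {
          case Some(value) => ZIO.succeed(value)
          case None        => zio.tap(value => cacheRef.update(_.updated(key, value)))
        }
      }
  }

  // Yields (first result, second result, number of times the effect actually ran).
  val program: UIO[(Int, Int, Int)] =
    for {
      counter  <- Ref.make(0)
      cacheRef <- Ref.make(Map.empty[String, Int])
      // Stand-in for an expensive lookup: it counts how often it is executed.
      expensive = counter.updateAndGet(_ + 1)
      first    <- expensive.memoizeBy(cacheRef, "schema-1")
      second   <- expensive.memoizeBy(cacheRef, "schema-1")
      runs     <- counter.get
    } yield (first, second, runs)

  val run = program.flatMap { case (first, second, runs) =>
    Console.printLine(s"first=$first second=$second runs=$runs")
  }
}
```

The second call finds the key in the map, so the counter is incremented only once and the program prints first=1 second=1 runs=1.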

Here is a full example with memoizeBy using Service Pattern 2.0:

import org.apache.avro.Schema
import zio._
import scala.collection.immutable.Map

trait SchemaFetcher {
  def fetchSchema(schemaId: String): Task[Schema]
}

object SchemaFetcherLive {
  val layer: ZLayer[Any, Throwable, SchemaFetcher] = ZLayer {
    for {
      // Create the Ref and the Map:
      schemaCacheRef <- Ref.make(Map.empty[String, Schema])
    } yield SchemaFetcherLive(schemaCacheRef)
  }
}

case class SchemaFetcherLive(
  schemaCache: Ref[Map[String, Schema]]
) extends SchemaFetcher {
  def fetchSchema(schemaId: String): Task[Schema] = {
    val fetchFromRegistry: Task[Schema] = ???
    // Use memoizeBy to make fetchFromRegistry more efficient!
    fetchFromRegistry.memoizeBy(schemaCache, schemaId)
  }
}

Discussion

Note how we're using the default immutable Map. Because it is immutable, all threads can read from the map at the same time without synchronization. We only need some synchronization using Ref, to atomically replace the map after a new element was added.

When multiple requests for the same key come in at roughly the same time, both are executed, and both lead to an update of the map. This is not as advanced as e.g. zio-cache, which detects multiple simultaneous requests for the same key. In the presented use case this is not a problem and very unlikely to happen often anyway.

Can we improve further? Yes, we can! If you look at method fetchSchema in the example, you see that a fetchFromRegistry ZIO is constructed, but we do not use it when the value is already present. And even worse, the value already being present is the common case! This is not very efficient. If efficiency is a problem, another API is needed. zio-cache does not have this problem. In zio-cache the cache is aware of how to look up new values (it is a loading cache). So here is a trade-off: efficiency with a more complex API, or highly readable code.
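One possible shape for such an API, as a sketch only: a helper that takes the effect as a by-name parameter, so that the ZIO value is built only on a cache miss. The names LazyMemoize and memoized are my own and not part of ZIO.

```scala
import zio._

object LazyMemoize {
  // `fetch` is passed by name: the expression is only evaluated, and the ZIO
  // value only constructed, when the key is missing from the cache.
  def memoized[R, E, K, A](cacheRef: Ref[Map[K, A]], key: K)(fetch: => ZIO[R, E, A]): ZIO[R, E, A] =
    cacheRef.get.flatMap { cache =>
      cache.get(key) match {
        case Some(value) => ZIO.succeed(value)
        case None        => fetch.tap(value => cacheRef.update(_.updated(key, value)))
      }
    }
}
```

The call site then reads LazyMemoize.memoized(schemaCache, schemaId)(...) with the expression that builds the fetching effect in the second parameter list. This keeps the small footprint of memoizeBy, at the cost of a slightly less fluent syntax.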

Using zio-cache

For completeness, here is (almost) the same example using zio-cache:

import org.apache.avro.Schema
import zio._
import zio.cache.{Cache, Lookup}

trait SchemaFetcher {
  def fetchSchema(schemaId: String): Task[Schema]
}

object ZioCacheSchemaFetcherLive {
  val layer: ZLayer[SomeService, Throwable, SchemaFetcher] = ZLayer {
    for {
      someService <- ZIO.service[SomeService]
      // the fetching logic can use someService:
      fetchFromRegistry: String => Task[Schema] = ???
      // create the cache:
      cache <- Cache.make(
        capacity = 1000,
        timeToLive = Duration.Infinity,
        lookup = Lookup(fetchFromRegistry)
      )
    } yield ZioCacheSchemaFetcherLive(cache)
  }
}

case class ZioCacheSchemaFetcherLive(
  cache: Cache[String, Throwable, Schema]
) extends SchemaFetcher {
  def fetchSchema(schemaId: String): Task[Schema] = {
    // use the loader cache:
    cache.get(schemaId)
  }
}

We now need a reference to fetchFromRegistry while constructing the layer. This complicates the code a bit; we can no longer define fetchFromRegistry in the case class. In the example we pull a SomeService so that we can put the definition of fetchFromRegistry into the for comprehension and stick to Service Pattern 2.0. Perhaps we should completely move it to another service so that we can write lookup = Lookup(someService.fetchFromRegistry). That, I'll leave as an exercise to the reader.

Conclusion

For simple use cases like fetching Avro schemas, this article presents an appropriately lightweight way to do memoization. If you need more features such as eviction and detection of concurrent invocations, I recommend zio-cache.

Update 2024-01-24

Here is a variant of memoizeBy, called cachedBy, that only fetches a value once, even when two fibers request it concurrently. The second fiber is semantically blocked until the first fiber has produced the value.

import zio._

object ZioCaching {
  implicit class ZioCachedBy[R, E, A](zio: ZIO[R, E, A]) {
    def cachedBy[B](cacheRef: Ref[Map[B, Promise[E, A]]], key: B): ZIO[R, E, A] = {
      for {
        newPromise <- Promise.make[E, A]
        actualPromise <- cacheRef.modify { cache =>
          cache.get(key) match {
            case Some(existingPromise) => (existingPromise, cache)
            case None                  => (newPromise, cache + (key -> newPromise))
          }
        }
        _ <- ZIO.when(actualPromise eq newPromise) {
          zio.intoPromise(newPromise)
        }
        value <- actualPromise.await
      } yield value
    }
  }

}
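A small sketch to check that claim: two fibers request the same key in parallel while the underlying effect is slow. The object name CachedByDemo, the counter and the 100 millisecond delay are assumptions made for the demonstration; cachedBy is copied from above.

```scala
import zio._

object CachedByDemo extends ZIOAppDefault {

  // Same extension method as above, repeated so this example is self-contained.
  implicit class ZioCachedBy[R, E, A](zio: ZIO[R, E, A]) {
    def cachedBy[B](cacheRef: Ref[Map[B, Promise[E, A]]], key: B): ZIO[R, E, A] =
      for {
        newPromise <- Promise.make[E, A]
        actualPromise <- cacheRef.modify { cache =>
          cache.get(key) match {
            case Some(existingPromise) => (existingPromise, cache)
            case None                  => (newPromise, cache + (key -> newPromise))
          }
        }
        _ <- ZIO.when(actualPromise eq newPromise) {
          zio.intoPromise(newPromise)
        }
        value <- actualPromise.await
      } yield value
  }

  // Yields (result of fiber 1, result of fiber 2, number of times the effect ran).
  val program: UIO[(Int, Int, Int)] =
    for {
      counter  <- Ref.make(0)
      cacheRef <- Ref.make(Map.empty[String, Promise[Nothing, Int]])
      // Slow stand-in for a registry call; it counts how often it is executed.
      slow      = ZIO.sleep(100.millis) *> counter.updateAndGet(_ + 1)
      results  <- slow.cachedBy(cacheRef, "k").zipPar(slow.cachedBy(cacheRef, "k"))
      runs     <- counter.get
    } yield (results._1, results._2, runs)

  val run = program.flatMap(result => Console.printLine(result.toString))
}
```

Only the fiber that wins the modify runs the slow effect; the other one awaits the same promise, so the program prints (1,1,1).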

Wednesday, June 8, 2022

Zigzag bytes

I was playing around with a goofy idea for which I needed zigzag encoding for bytes. Zigzag encoding is often used in combination with variable length encoding in things like Avro, Thrift and Protobuf.

In zigzag encoded integers, the least significant bit is used for the sign. To convert from regular encoding (two's complement) to zigzag (and back) you can use the following Scala code:

def i32ToZigZag(n: Int): Int = (n << 1) ^ (n >> 31)
def zigZagToI32(n: Int): Int = (n >>> 1) ^ - (n & 1)
def i64ToZigZag(n: Long): Long = (n << 1) ^ (n >> 63)
def zigZagToI64(n: Long): Long = (n >>> 1) ^ - (n & 1)

Translate this to Java and the expressions after the = look exactly the same.
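To get a feel for the encoding, here is a small sketch that prints a few encoded values and checks the round trip. The object name ZigZagDemo and the chosen sample values are mine.

```scala
object ZigZagDemo {
  def i32ToZigZag(n: Int): Int = (n << 1) ^ (n >> 31)
  def zigZagToI32(n: Int): Int = (n >>> 1) ^ -(n & 1)

  def main(args: Array[String]): Unit = {
    // Small magnitudes map to small non-negative values, alternating in sign:
    // 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, 2 -> 4
    for (n <- Seq(0, -1, 1, -2, 2))
      println(s"$n -> ${i32ToZigZag(n)}")

    // Decoding is the exact inverse, including at the extremes.
    for (n <- Seq(0, -1, 1, Int.MinValue, Int.MaxValue))
      assert(zigZagToI32(i32ToZigZag(n)) == n)
  }
}
```

Because small negative and small positive numbers both end up as small non-negative numbers, a variable length encoding needs only a few bytes for either.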

Using these bit shifting tricks for bytes is a whole lot more difficult. The problem is that Scala (like Java) does not support bit operations on Bytes. Both languages always convert them to an Int first.

After a lot of fiddling, I settled on the following:

private def b(i: Int): Byte = (i & 0xff).toByte
def i8ToZigZag(n: Byte): Byte = (b(n << 1) ^ (n >> 7)).toByte
def zigZagToI8(n: Byte): Byte = b(((n & 0xff) >>> 1) ^ (256 - (n & 1)))

Translated to Java it should look like this (not tested!):

private byte b(int i) { return (byte)(i & 0xff); }
public byte i8ToZigZag(byte n) { return (byte)(b(n << 1) ^ (n >> 7)); }
public byte zigZagToI8(byte n) { return b(((n & 0xff) >>> 1) ^ (256 - (n & 1))); }

Is there a better way to do this?
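One candidate, offered as a sketch rather than a definitive answer: since toByte keeps only the low 8 bits, all the work can be done on Ints with a single truncation at the end, which makes the helper b unnecessary. The Alt names are made up; the loop compares them against the versions above for all 256 byte values.

```scala
object ZigZagBytes {
  // Versions from the text above.
  private def b(i: Int): Byte = (i & 0xff).toByte
  def i8ToZigZag(n: Byte): Byte = (b(n << 1) ^ (n >> 7)).toByte
  def zigZagToI8(n: Byte): Byte = b(((n & 0xff) >>> 1) ^ (256 - (n & 1)))

  // Possible simplification: compute on Ints and truncate once at the end.
  def i8ToZigZagAlt(n: Byte): Byte = ((n << 1) ^ (n >> 7)).toByte
  def zigZagToI8Alt(n: Byte): Byte = (((n & 0xff) >>> 1) ^ -(n & 1)).toByte

  def main(args: Array[String]): Unit = {
    for (i <- -128 to 127) {
      val n = i.toByte
      // Both variants agree with each other ...
      assert(i8ToZigZag(n) == i8ToZigZagAlt(n))
      assert(zigZagToI8(n) == zigZagToI8Alt(n))
      // ... and decoding undoes encoding.
      assert(zigZagToI8Alt(i8ToZigZagAlt(n)) == n)
    }
    println("all 256 byte values round-trip")
  }
}
```

The only part that remains essential is the mask (n & 0xff) before the unsigned shift, because the byte is sign-extended when it is converted to an Int.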

Saturday, March 12, 2022

Upgrading Libreoffice with Homebrew

Update 2023-06-06: brew now asks for your password so it can install everything directly. Much better!!

The text below is no longer applicable and is only kept for reference.

Reminder to self: this is the procedure to upgrade LibreOffice with Homebrew:

brew update
brew upgrade
open -a /Applications/LibreOffice.app
Quit the application
brew reinstall libreoffice-language-pack, enter your password
open "/usr/local/Caskroom/libreoffice-language-pack/$(cd /usr/local/Caskroom/libreoffice-language-pack; ls -1 | sort -rV | head -n 1)/LibreOffice Language Pack.app", click 'Ok', click 'Ok'

Please anyone, please make this simpler...

Update 2021-03-24

Here is a script to remove most of the manual toil:

#!/bin/bash
echo "Initiating Libre Office upgrade"
brew update
brew upgrade libreoffice
open -g -a /Applications/LibreOffice.app
echo "Wait until the application completed startup (it is started in the background)"
read -p "Press enter to quit LibreOffice"
osascript -e 'quit app "LibreOffice"'
APP=$(brew reinstall libreoffice-language-pack | tee /dev/tty | grep "LibreOffice Language Pack.app" | xargs)
open -a "$APP"
echo "The language pack installer has been opened."

Wednesday, January 26, 2022

Having fun with Ordering in Scala

Challenge: sort a list of objects by name, but some names have priority. If these names appear, they should be ordered by the position they have in the priority list.

For example:

val priorityList = Seq("Willow", "James", "Ezra")
val input = Seq("Olivier", "Charlotte", "Willow", "Declan", "Aurora", "Ezra")
val ordered = ???
assert(ordered == Seq("Willow", "Ezra", "Aurora", "Charlotte", "Declan", "Olivier"))

Challenge accepted.

Luckily Scala has strong support for sorting in the standard library. All sequences have a sorted method which accepts an Ordering. The ordering is an implicit parameter, which means that normally we don't need to provide it; the compiler resolves it to the natural ordering of the items. However, we are going to provide this parameter explicitly. Let's build an Ordering!

The challenge explains we have 2 orderings:

first order by the priority list
failing that, order by alphabet

Let's focus on the first ordering. The idea is to assign an integer 'priority-value' to each possible string that is based on the position in the priority list. If the string is not in the list, we use some high integer. The first ordering will simply order by this priority-value. Lower numbers go before higher numbers, just like the natural ordering of integers.

// attempt 1
val priorityValue = priorityList.indexOf(name)

This works well for any name on the priority list. E.g. Willow gets 0 and Ezra gets 2. Unfortunately, all the other names get priority value -1 which orders them even before Willow. We need to convert the -1 to something higher.
Since I like to program without if statements whenever possible, I looked at math for a solution. Modulus can do the trick:

// attempt 2
val priorityValue = priorityList.indexOf(name) % priorityList.size

Oops, wrong modulus implementation: -1 % 3 == -1. Let's use floorMod:

// attempt 3
val priorityValue = Math.floorMod(priorityList.indexOf(name), priorityList.size) 

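Now -1 gets converted into priorityList.size - 1. That is still not right: it is the same priority-value as the last name on the list (Ezra). We need a modulus that is larger than the list. Since we don't really care what the higher number is, we can just use Int.MaxValue, which converts -1 into Int.MaxValue - 1: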

val priorityValue = Math.floorMod(priorityList.indexOf(name), Int.MaxValue)
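The attempts are easy to compare by evaluating them for a name that is not on the list. This is a small sketch; the object name FloorModDemo is made up, the priority list is the one from the challenge.

```scala
object FloorModDemo {
  val priorityList = Seq("Willow", "James", "Ezra")

  // "Aurora" is not on the list, so indexOf returns -1.
  val missing: Int = priorityList.indexOf("Aurora")

  def main(args: Array[String]): Unit = {
    println(missing % priorityList.size)               // -1: % keeps the sign of the dividend
    println(Math.floorMod(missing, priorityList.size)) // 2: same value as the index of Ezra
    println(Math.floorMod(missing, Int.MaxValue))      // 2147483646: higher than any index
    // Names that are on the list keep their index:
    println(Math.floorMod(priorityList.indexOf("Ezra"), Int.MaxValue)) // 2
  }
}
```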

Now we wrap that in an Ordering:

val priorityOrdering: Ordering[String] =
  Ordering.by(name => Math.floorMod(priorityList.indexOf(name), Int.MaxValue))

Unfortunately Scala can't infer the type. We either need to give the value an explicit type, or add a type annotation to the name parameter.

Now we need to add the second ordering. We can simply use Ordering.String from the library. We combine the orderings with orElse, available since Scala 2.13. The second ordering is used when priorityOrdering can't decide because the priority-value is the same.
Note that for the special case where we compare a priority value with itself, e.g. Willow with Willow, the second ordering is also applied. This is okay though, the outcome doesn't change because these values are the same for the alphabetic ordering also.

Here is the complete code:

val priorityList = Seq("Willow", "James", "Ezra")
val priorityOrdering: Ordering[String] =
  Ordering.by(name => Math.floorMod(priorityList.indexOf(name), Int.MaxValue))
val combinedOrdering: Ordering[String] =
  priorityOrdering.orElse(Ordering.String)

val input = Seq("Olivier", "Charlotte", "Willow", "Declan", "Aurora", "Ezra")
val ordered = input.sorted(combinedOrdering)
assert(ordered == Seq("Willow", "Ezra", "Aurora", "Charlotte", "Declan", "Olivier"))

We're almost there. The challenge was to work on any object. Let's wrap it up a bit and also make it work for any type of priority value:

def priorityOrdering[A, B : Ordering](priorityList: Seq[B], by: A => B): Ordering[A] = {
  def priorityValue(b: B): Int = Math.floorMod(priorityList.indexOf(b), Int.MaxValue)
  Ordering.by[A, Int](a => priorityValue(by(a))).orElseBy(by)
}

We can use it like this:

val ordered = input.sorted(priorityOrdering[String, String](priorityList, identity))

Or like this. Here we sort persons by birthdate, ordering today before the other days:

import java.time.MonthDay

case class Person(name: String, birthdate: MonthDay)
val persons: Seq[Person] = ???

val priorityDates = Seq(MonthDay.now())
persons.sorted(priorityOrdering(priorityDates, (_: Person).birthdate))
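For a version that can be run as is, here is a sketch with made-up persons and a fixed date in place of MonthDay.now(), so that the result is reproducible. The object name BirthdayOrderingDemo, the sample data and the value today are assumptions; priorityOrdering is the method from above and needs Scala 2.13 or later.

```scala
import java.time.MonthDay

object BirthdayOrderingDemo {
  def priorityOrdering[A, B: Ordering](priorityList: Seq[B], by: A => B): Ordering[A] = {
    def priorityValue(b: B): Int = Math.floorMod(priorityList.indexOf(b), Int.MaxValue)
    Ordering.by[A, Int](a => priorityValue(by(a))).orElseBy(by)
  }

  case class Person(name: String, birthdate: MonthDay)

  // Made-up sample data.
  val persons: Seq[Person] = Seq(
    Person("Aurora", MonthDay.of(3, 12)),
    Person("Declan", MonthDay.of(1, 26)),
    Person("Willow", MonthDay.of(6, 8))
  )

  // A fixed "today" instead of MonthDay.now().
  val today: MonthDay = MonthDay.of(6, 8)

  // Giving the ordering an explicit type keeps type inference out of the way.
  val byBirthday: Ordering[Person] =
    priorityOrdering(Seq(today), (_: Person).birthdate)

  // Willow has her birthday "today" and goes first, the others follow by date.
  val ordered: Seq[Person] = persons.sorted(byBirthday)

  def main(args: Array[String]): Unit =
    ordered.foreach(println)
}
```

With this data the names come out as Willow, Declan, Aurora.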

Some remarks

Note that in all these cases type inference is quite awful. The compiler has problems finding the correct types, even though all the information is available.

You should also know that you can't use this approach if you are stuck on Scala 2.12 or earlier since Ordering.orElse is not available there.
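If you are stuck on an older version, one possible workaround is to combine the two orderings by hand. This is a sketch; the helper name orElseCompat and the object name are made up, the rest is the example from above.

```scala
object OrElseCompat {
  // Hand-written stand-in for Ordering.orElse: the second ordering is only
  // consulted when the first one considers both values equal.
  def orElseCompat[A](first: Ordering[A], second: Ordering[A]): Ordering[A] =
    new Ordering[A] {
      def compare(x: A, y: A): Int = {
        val result = first.compare(x, y)
        if (result != 0) result else second.compare(x, y)
      }
    }

  val priorityList = Seq("Willow", "James", "Ezra")
  val priorityOrdering: Ordering[String] =
    Ordering.by((name: String) => Math.floorMod(priorityList.indexOf(name), Int.MaxValue))
  val combinedOrdering: Ordering[String] =
    orElseCompat(priorityOrdering, Ordering.String)

  val input = Seq("Olivier", "Charlotte", "Willow", "Declan", "Aurora", "Ezra")
  val ordered = input.sorted(combinedOrdering)

  def main(args: Array[String]): Unit =
    println(ordered.mkString(", "))
}
```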

Alternative

You can side-step all the type inference problems by using the sortBy method. Give it a function that returns a tuple of Ints, Strings, or anything for which an Ordering is already defined. The sequence is then sorted on the first value of the tuple, then on the second value, etc.:

def priorityValue(name: String): Int =
  Math.floorMod(priorityList.indexOf(name), Int.MaxValue)
val ordered = input.sortBy(name => (priorityValue(name), name))

Conclusion

Although I had fun learning all about Ordering, next time I'll avoid it and go directly for sortBy.

Monday, January 10, 2022

From Adoptopenjdk to Temurin on a Mac using Homebrew

AdoptOpenJDK joined the Eclipse Foundation and renamed their JDK to Temurin. Here are instructions on how to migrate on Macs with Homebrew.

The following commands remove any AdoptOpenJDK JDKs you may still have:

brew remove adoptopenjdk/openjdk/adoptopenjdk8
brew remove adoptopenjdk/openjdk/adoptopenjdk11
brew untap AdoptOpenJDK/openjdk
brew remove adoptopenjdk8
brew remove adoptopenjdk11
brew remove adoptopenjdk

Use /usr/libexec/java_home -V to get an overview of any other JDK you may still have. Just delete what you don't need any more.

Then install Temurin 8, 11 and 17. The first command (brew tap …) is only needed in case you need Temurin 8 or 11:

brew tap homebrew/cask-versions
brew install --cask temurin8
brew install --cask temurin11
brew install --cask temurin

Bonus: execute the following to define aliases that let you easily switch between Java versions:

cat <<-EOF >> ~/.zshrc

# Aliases for switching java version
alias java17="export JAVA_HOME=\$(/usr/libexec/java_home -v 17)"
alias java11="export JAVA_HOME=\$(/usr/libexec/java_home -v 11)"
alias java8="export JAVA_HOME=\$(/usr/libexec/java_home -v 1.8)"
java11
EOF

Are you looking for more power, for example because you need to test against many more JDKs? Then SDKMAN! may be something for you.