<h2>Scheduling tasks and sharing state with streams (2024-01-24)</h2>
<p>
Recently we built a system that needs to perform two tasks. Task 1 runs every 15 minutes,
task 2 runs every 2 minutes. Task 1 kicks off some background jobs (an upload to
BigQuery), task 2 checks upon the results of these background jobs and does some cleanup
when they are done (delete the uploaded files). The two tasks need to share information back and forth about what
jobs are running in the background.
</p><p>
Now think to yourself (ignore the title for now), what would be the most elegant
way to implement this? I suspect that most developers will come up with a solution that
involves some locking, synchronisation and global state. For example by sharing the
information through a transactional database, or by using a semaphore to prevent the two
tasks from running at the same time plus sharing information in a global variable.
This is understandable: most programming environments simply do not provide better
techniques for these kinds of problems!
</p><p>
However, if your environment supports streams and has some kind of scheduling, here are
two tricks you can use: one for the scheduling of the tasks, the second for sharing
information without a global variable.
</p><p>
Here is an example of the first trick, written in Scala using the ZIO streams library. Read on for an explanation.
</p>
<div class="codeblock"><code class="scala">import zio._
import zio.stream._
def performTask1: Task[Unit] = ???
def performTask2: Task[Unit] = ???
// An enumeration (Scala 2 style) for our tasks.
sealed trait BusinessTask
object Task1 extends BusinessTask
object Task2 extends BusinessTask

ZStream.mergeAllUnbounded()(
  ZStream.fromSchedule(Schedule.fixed(15.minutes)).as(Task1),
  ZStream.fromSchedule(Schedule.fixed(2.minutes)).as(Task2)
)
  .mapZIO {
    case Task1 => performTask1
    case Task2 => performTask2
  }
  .runDrain</code></div>
<p>
We create 2 streams, each stream contains sequential numbers, emitted upon a schedule.
As you can see, the schedule corresponds directly with the requirements. We do not really
care for the sequential numbers, so with stream operator <tt>as</tt> we convert the stream's
emitted values to a value from the <tt>BusinessTask</tt> enumeration.
</p><p>
Then we merge the two streams. We now have a stream that emits the two
enumeration values at the time the corresponding task should run. This is already a big
win! Even when the two schedules produce an item at the same time, the tasks will run
sequentially. This is because by default streams are evaluated without parallelism.
</p><p>
We are not there yet though. The tasks need to share information. They could access a
shared variable but then we still have tightly coupled components and no guarantees
that the shared variable is used correctly.
</p><p>
Also, wouldn't it be great if <tt>performTask1</tt> and <tt>performTask2</tt> were functions that can
be tested in isolation? With streams this is possible.
</p><p>
Here is the second part of the idea. Again, read on for an explanation.
</p>
<div class="codeblock"><code class="scala">case class State(...)
val initialState = State(...)
def performTask1(state: State): Task[State] = ???
def performTask2(state: State): Task[State] = ???
ZStream.mergeAllUnbounded()(
  ZStream.fromSchedule(Schedule.fixed(15.minutes)).as(Task1),
  ZStream.fromSchedule(Schedule.fixed(2.minutes)).as(Task2)
)
  .scanZIO(initialState) { (state, task) =>
    task match {
      case Task1 => performTask1(state)
      case Task2 => performTask2(state)
    }
  }
  .runDrain</code></div>
<p>
We have changed the signatures of the <tt>performTask*</tt> methods. Also, the <tt>mapZIO</tt> operator
has been replaced with <tt>scanZIO</tt>. The stream operator <tt>scanZIO</tt> works much like
<tt>foldLeft</tt> on collections. Like <tt>foldLeft</tt>, it accepts an initial state, and a function
that combines the accumulated state plus the next stream element (of type <tt>BusinessTask</tt>)
and converts those into the next state.
</p><p>
Stream operator <tt>scanZIO</tt> also emits the new states. This allows further common processing.
For example we can persist the state to disk, or collect custom metrics about the state.
</p>
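<p>
A sketch of what such further processing could look like, continuing the example above
(<tt>persistState</tt> is a made-up helper):
</p>
<div class="codeblock"><code class="scala">// Hypothetical helper that writes the latest state to disk:
def persistState(state: State): Task[Unit] = ???

ZStream.mergeAllUnbounded()(
  ZStream.fromSchedule(Schedule.fixed(15.minutes)).as(Task1),
  ZStream.fromSchedule(Schedule.fixed(2.minutes)).as(Task2)
)
  .scanZIO(initialState) { (state, task) =>
    task match {
      case Task1 => performTask1(state)
      case Task2 => performTask2(state)
    }
  }
  .tap(persistState) // runs for every state emitted by scanZIO
  .runDrain</code></div>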
<h4>Conclusion</h4>
<p>
Using libraries with higher level constructs like streams, we can express straightforward
requirements in a straightforward way. With a few lines of code we have solved the
scheduling requirement, and shown an elegant way of sharing information between tasks
without global variables.
</p>
<h2>Discovering scala-cli while fixing my digital photo archive (2023-11-26)</h2>
<p>
Over the years I built up a nice digital photo library with my family. It is a messy process. Here are some of the things that can go wrong:
</p>
<ul>
<li>Digital cameras that add incompatible exif metadata.</li>
<li>Some files have exif tag <tt>CreateDate</tt>, others <tt>DateTimeOriginal</tt>.</li>
<li>Images shared via Whatsapp or Signal do not have an exif date tag at all.</li>
<li>Wrong rotation.</li>
<li>Fuzzy, yet memorable jpeg images which take 15MB because of their resolution and high quality settings.</li>
<li>Badly supported ancient movie formats like <tt>3gp</tt> and <tt>RIFF AVI</tt>.</li>
<li>Old movie formats that need 3 times more disk space than h.265.</li>
<li>Losing almost all your photos because you thought you could copy an iPhoto library using <tt>tar</tt> and <tt>cp</tt> (hint: you can’t). (This took a low-level hard disk scan and months of manual de-duplication work to recover the photos.)</li>
<li>Another low-level scan of an SD card to find accidentally deleted photos.</li>
<li>Date in image file name corresponds to import date, not creation date.</li>
<li>Weird file names that order the files differently than from creation date.</li>
<li>Images from 2015 are stored in the folder for 2009.</li>
<li>etc.</li>
</ul>
<p>
I wrote countless <tt>bash</tt> scripts to mold the collection into order, unfortunately with varying success. However, now that I am ready to import the library into <a href="https://immich.app/">Immich</a> (please, do <a href="https://github.com/immich-app/immich#donation">sponsor</a> them, they are building a very nice product!), I decided to start cleaning up everything.
</p>
<p>
So there I was, writing yet another bash script, struggling with parsing a date response from <tt>exiftool</tt>. And then I remembered the recent articles about <a href="https://scala-cli.virtuslab.org/">scala-cli</a> and decided to try it out.
</p>
<p>
<b>The experience was amazing!</b> Even without proper IDE support, I was able to crank out scripts that did more, more accurately and faster than I could ever have accomplished in bash.
</p>
<p>
Here are some of the things that I learned:
</p>
<ul>
<li>Take the time to learn <a href="https://github.com/com-lihaoyi/os-lib">os-lib</a>.</li>
<li>If the scala code gets harder to write, open a proper IDE and use code completion. Then copy the code over to your <tt>.sc</tt> file.</li>
<li>One well placed <tt>.par</tt> (using <a href="https://github.com/scala/scala-parallel-collections">scala-parallel-collections</a>) can more than quadruple the performance of your script!</li>
<li>You will still spend a lot of time parsing the output from other programs (like <tt>exiftool</tt>); see the sketch after this list.</li>
<li>Scala-cli scripts run very well from Github actions as well.</li>
</ul>
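<p>
To give an impression, here is roughly the shape of such a script. This is a sketch: the
os-lib version and the exiftool flags are simply what I would try first, adapt as needed.
</p>
<div class="codeblock"><code class="scala">//> using scala "2.13"
//> using dep "com.lihaoyi::os-lib:0.9.3"

// Print the creation date of every jpg below the current directory.
val images = os.walk(os.pwd).filter(_.ext == "jpg")

images.foreach { image =>
  // -s3 prints only the tag value, -d forces a single date format
  val dates = os
    .proc("exiftool", "-s3", "-d", "%Y-%m-%d", "-DateTimeOriginal", "-CreateDate", image)
    .call(check = false)
    .out.lines()
  println(s"${dates.headOption.getOrElse("unknown")}  $image")
}</code></div>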
<h2>Conclusions</h2>
<p>
Next time you open your editor to write a bash file, think again. Perhaps you should really write some scala instead.
</p>
<h2>Dependabot, Gradle and Scala (2023-10-08)</h2>
<p>
Due to a series of unfortunate circumstances, we have to deal with a couple of projects at work that use Gradle as the build tool.
For these projects we wanted automatic PR generation for updated dependencies.
Since we use Github Enterprise, using Dependabot seems logical.
However, this turned out to be not very straightforward.
This article documents one way that works for us.
</p>
<p>
As we were experimenting with Dependabot, we discovered the following rules:
</p>
<ol>
<li>The scala version in the artifact name must not be a variable.</li>
<li>A variable for the artifact version is fine, but it must be declared in the same file in the <tt>ext</tt> block.</li>
<li>Versions should follow the Semver specification.</li>
<li>You must not use Gradle’s <tt>+</tt> version range syntax anywhere, Maven’s version range syntax is fine.</li>
</ol>
<p>
In our projects the scala version comes from a plugin.
In addition, we sometimes need to cross build for different scala versions, very much at odds with rule no. 1.
We solved this with a switch statement.
</p>
<p>
With these rules and constraints we discovered that the following structure works for us and Dependabot:
</p>
<div class="codeblock"><code class="java">ext {
jacksonVersion = '2.15.2'
scalaTestVersion = '3.0.8'
}
dependencies {
switch(scalaMainVersion) {
case "2.12":
implementation "com.fasterxml.jackson.module:jackson-module-scala_2.12:$jacksonVersion"
testImplementation "org.scalatest:scalatest_2.12:$scalaTestVersion"
break
case "2.13":
implementation "com.fasterxml.jackson.module:jackson-module-scala_2.13:$jacksonVersion"
testImplementation "org.scalatest:scalatest_2.13:$scalaTestVersion"
break
default:
break
}
// implementation 'com.example:library:0.8+' // Don't do this
implementation 'com.example:library:[0.8,1.0[' // This is fine
}</code></div>
<p>
It took 3 people a month to slowly discover this solution (thank you!).
I hope that you, dear reader, will spend your time more productively.
</p>
<h2>Zio-kafka hacking day (2023-04-20)</h2>
<p>
Not long ago I contacted Steven (committer of the <a href="https://github.com/zio/zio-kafka">zio-kafka</a> library) to get some better understanding of how the library works. April 12, not more than 2 months later I am a committer, and I was sitting in a room together with Steven, Jules Ivanic (another committer) and wildcard Pierangelo Cecchetto (contributor), hacking on zio-kafka.
</p>
<p>
The meeting was an idea of Jules, who was "in the neighborhood". He was traveling from Australia for his company (<a href="https://www.conduktor.io/">Conduktor</a>). We were able to get a nice room in the Amsterdam office of my employer (<a href="https://www.adevinta.com/">Adevinta</a>). Amsterdam turned out to be a nice middle ground for Steven, me and Pierangelo. (Special thanks to <a href="https://godatadriven.com/">Go Data Driven</a> who also had a place for us.)
</p>
<p>
In the morning we spoke about current and new ideas on how to improve the library. Also, we shared detailed knowledge on ZIO and what Kafka expects from its users. After lunch we started hacking. Having someone nearby to start an ad hoc discussion with turned out to be very productive; we were able to move some tough issues forward.
</p>
<p>
Here are some highlights.
</p>
<p>
<a href="https://github.com/zio/zio-kafka/pull/788">PR #788 â Wait for stream end in rebalance listener</a> is important to prevent duplicates during a rebalance process. This PR was mostly finished for quite some time, but many details made the extensive test suite fail. We were able to solve many of these issues.
</p>
<p>
In the area of performance we implemented an idea to replace buffering (pre-fetching a fixed number of polls) with pre-fetching based on the stream’s queue size. This resulted in <a href="https://github.com/zio/zio-kafka/pull/803">PR #803 – Alternative backpressure mechanism</a>.
</p>
<p>
We also laid the seeds for another performance improvement: <a href="https://github.com/zio/zio-kafka/pull/809">PR #809 – Optimistically resume partitions early</a>.
</p>
<p>
These last two PRs showed great performance improvements bringing us much closer to direct usage of the Java Kafka client. All 3 PRs are now in review.
</p>
<p>
All in all it was a lot of fun to meet fellow enthusiasts and hack on the complex machinery that is inside zio-kafka.
</p>
<h2>Kafka is good for transport, not for system boundaries (2023-01-29)</h2>
<article>
<p>In the last years I have learned that you should not run Kafka as a system boundary. A system boundary in this
article is the place where messages are passed from one
<abbr title="An autonomy domain is a software (sub-)system that is governed by a team that does their own planning and will therefore schedule their work independently from other teams.">autonomy domain</abbr>
to another.
</p>
<p>Now why is that? Let’s look at two classes of problems: connecting to Kafka and the long feedback loop. To
prove my points, I am going to bore you with long stories from my personal experience. You may be in a different
situation, YMMV!
</p>
<h2 id="problem-1-connecting-to-kafka-is-hard">Problem 1: Connecting to Kafka is hard</h2>
<p>Compared to calling an HTTP endpoint, sending messages to Kafka is much much
harder<a href="#high-volume-http" aria-describedby="footnote-label" id="high-volume-http-ref"></a>.
</p>
<p>Don’t agree? Watch out for observation bias! During my holiday we often have long highway drives through
unknown countries. After looking at a highway for several hours non-stop, you might be inclined to believe that
the entire country is covered by a dense highway network. In reality though, the next highway might be 200km
away. A similar thing can happen at work. My part of the company offers Kafka as a service. We also run several
services that invariably use Kafka in some way. We have deep knowledge and experience. It would be easy to think
that Kafka is simple for everyone. However, for the rest of the company this Kafka thing is just another far
away system that they have to integrate with and knowledge will be spotty and incomplete.
</p>
<p>Let’s look at some of the problems that you have to deal with.
</p>
<h3 id="partitioning-is-hard">Partitioning is hard</h3>
<p>It is easier to deal with partitioning problems when you control both the producer and the broker. We once had a
problem where our systems could not keep up with the inflow of Kafka messages for one of the producers. The
weird
thing is that most of the machines were just idling. The problem grew slowly, so it took us some time before we
realized it was caused by some partitions having most of the traffic. Producers of Kafka events do not always
realize the effect of wrongly chosen key values. When many messages have the same key they end up in the same
partition. It took some time before we managed to convince the producing team that the message key had to change.
</p>
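<p>To make the mechanism concrete, here is a sketch with the Java Kafka client (the topic and
key values are made up). The default partitioner hashes the record key to pick a partition, so a
low-cardinality key concentrates most traffic on a few partitions:
</p>
<div class="codeblock"><code class="scala">import org.apache.kafka.clients.producer.ProducerRecord

val payload: String = "..."

// A low-cardinality key (a country code): most records land on a few partitions.
val skewed = new ProducerRecord[String, String]("clicks", "NL", payload)

// A high-cardinality key (a user id): records spread evenly over all partitions.
val spread = new ProducerRecord[String, String]("clicks", "user-12345", payload)</code></div>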
<p>When you run an HTTP endpoint, spreading traffic and
partitioning<a href="#sticky-session" aria-describedby="footnote-label" id="sticky-session-ref"></a>
is handled by the load-balancer and is therefore under control of the receiver and not the sender.
</p>
<h3 id="cross-network-connections-are-hard">Cross network connections are hard</h3>
<p>Producers and the Kafka brokers need to have the same view of the network. This is because the brokers will tell
a producer which broker (by DNS name or IP address) it needs to connect to for each partition. This might go
wrong when the producers and brokers use a different DNS server, or when they are on networks with colliding IP
address ranges<a href="#both" aria-describedby="footnote-label" id="both-ref"></a>. Getting this right is a lot
easier when you’re running everything in a single network you control.
</p>
<p>This is not a problem with HTTP endpoints. Producers only need 1 hostname and optionally an HTTP proxy.
</p>
<p>We didn’t talk about authentication and encryption yet. Kafka is very flexible; it has many knobs and
settings in this area and the producers have to be configured exactly right or else it just won’t work.
And don’t expect good error messages<a href="#oom" aria-describedby="footnote-label" id="oom-ref"></a>.
Good documentation and cooperation are required to make this work across different teams.
</p>
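<p>To give a feel for how exact "exactly right" is, here is a sketch of the settings that all
have to line up before a SASL_SSL producer will even connect (every value shown is made up):
</p>
<div class="codeblock"><code class="scala">// Hypothetical values; get one of these wrong and the producer will not connect.
val producerProps = Map(
  "bootstrap.servers"       -> "kafka-1.example.com:9093",
  "security.protocol"       -> "SASL_SSL",
  "sasl.mechanism"          -> "SCRAM-SHA-512",
  "sasl.jaas.config"        -> """org.apache.kafka.common.security.scram.ScramLoginModule required username="svc" password="...";""",
  "ssl.truststore.location" -> "/etc/kafka/truststore.jks",
  "ssl.truststore.password" -> "..."
)</code></div>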
<p>With HTTP endpoints, encryption is very well-supported through https. Authentication is straightforward with
HTTP’s basic authentication<a href="#no-oauth" aria-describedby="footnote-label" id="no-oauth-ref"></a>.
</p>
<h2 id="problems-that-have-been-solved">Problems that have been solved</h2>
<p>Just for completeness here are some problems from around 2019 that have since been solved.
</p>
<p>Around 2019 Kafka did not support authentication and TLS out of the box. Crossing untrusted networks was quite
cumbersome.
</p>
<p>Also around that time you had to be very careful about versioning. The client and server had to be upgraded in a
very controlled order. Today this looks much better; you can combine almost any client and server version.
</p>
<p>The default partitioner would give slow brokers more, instead of less, work.
<a href="https://cwiki.apache.org/confluence/display/KAFKA/KIP-794%3A+Strictly+Uniform+Sticky+Partitioner">This
has been solved</a> a few months ago.
</p>
<h2 id="problem-2-long-feedback-loop">Problem 2: Long feedback loop</h2>
<p>When messages are being given to you via Kafka, you cannot reject them. They are send-and-forget; the producer
no longer cares. Dealing with invalid messages is now your responsibility.
</p>
<p>In one of our projects we used to set invalid messages apart and offer Slack alerts so that the producers knew
they had to look at the validation errors. Unfortunately, it didn’t work well. The feedback loop was
simply too long and the number of invalid messages stayed high.
</p>
<p>Later we introduced an HTTP endpoint in which we reject invalid messages with a 400 response. This simple change
was nothing less than a miracle. For every producer that switched, the vast majority of
invalid messages<a href="#fixing" aria-describedby="footnote-label" id="fixing-ref"></a>
disappeared. The number of invalid messages has remained very low since then.
</p>
<p>Because we were able to reject invalid messages the feedback loop shortened and became much more effective.
</p>
<h2 id="conclusions">Conclusions</h2>
<p>Kafka within your own autonomy domain can be a great solution for message transport. However, Kafka as a boundary
between autonomy domains will hurt.
</p>
<div class="article-footer">
<h2 id="footnote-label">Footnotes</h2>
<ol>
<li id="high-volume-http">Though at high enough volume, HTTP is not easy either; you’ll need proper
connection pooling and an endpoint that accepts batches or else deploy a huge server park.
<a href="#high-volume-http-ref" aria-label="Back to content">↩</a></li>
<li id="sticky-session">Many load balancers offer sticky sessions which is a weak form of partitioning.
<a href="#sticky-session-ref" aria-label="Back to content">↩</a></li>
<li id="both">We suffered both. <a href="#both-ref" aria-label="Back to content">↩</a></li>
<li id="oom">When your authentication settings are wrong, the Kafka command line tools tell you that by
showing an OutOfMemoryError. My head still hurts from this one.
<a href="#oom-ref" aria-label="Back to content">↩</a></li>
<li id="no-oauth">Though unfortunately, many architects will make this complex by using oauth or other such
systems. <a href="#no-oauth-ref" aria-label="Back to content">↩</a></li>
<li id="fixing">Most invalid messages could be fixed with a few minutes of coding time.
<a href="#fixing-ref" aria-label="Back to content">↩</a></li>
</ol>
</div>
</article>
<h2>ZIO service layer pattern (2022-12-04)</h2>
<p>
While reading about <a href="https://degoes.net/articles/zio-config">ZIO-config in 2.0.4</a>, the following pattern to create services caught my eye. I am copying it here for easy lookup. Enjoy.
</p>
<div class="codeblock"><code class="scala">val myLayer: ZLayer[PaymentRepo, Nothing, MyService] =
ZLayer.scoped {
for {
repo <- ZIO.service[PaymentRepo]
config <- ZIO.config(MyServiceImpl.config)
ref <- Ref.make(MyState.Initial)
impl <- ZIO.succeed(new MyServiceImpl(config, ref, repo))
_ <- impl.initialize
_ <- ZIO.addFinalizer(impl.destroy)
} yield impl
  }</code></div>
<h2>Speed up ZIOs with memoization (2022-11-05)</h2>
<p>
<b>TLDR:</b> You can do ZIO memoization in just a few lines; however, use <a href="https://zio.dev/ecosystem/officials/zio-cache/">zio-cache</a> for more complex use cases.
</p>
<p>
Recently I was working on fetching Avro schemas from a schema registry.
Avro schemas are immutable and therefore perfectly cacheable.
Also, the number of possible schemas is limited, so cache eviction is not needed.
We can simply cache every schema forever in a plain hash-map.
In other words, we are doing <a href="https://en.wikipedia.org/wiki/Memoization">memoization</a>.
</p>
<p>
Since this was the first time I did this in a ZIO based application, I looked around for existing solutions.
What I wanted was something like this:
</p>
<div class="codeblock"><code class="scala">def fetchSchema(schemaId: String): Task[Schema] = {
val fetchFromRegistry: Task[Schema] = ???
fetchFromRegistry.memoizeBy(key = schemaId)
}
</code></div>
<p>
Frankly, I was a bit disappointed that ZIO does not already support this out of the box.
However, as you'll see in this article, the proposed syntax only works for simple use cases.
(Actually, there is <a href="https://javadoc.io/static/dev.zio/zio_3/2.0.3/zio/ZIO.html#memoize-63f">ZIO.memoize</a> but that is even simpler and only caches the result for a single ZIO instance, not for any instance that gives the same value.)
</p>
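<p>
To illustrate the difference, here is a minimal sketch of what <tt>ZIO#memoize</tt> does
(the <tt>expensive</tt> effect is made up):
</p>
<div class="codeblock"><code class="scala">import zio._

val expensive: Task[Int] = ZIO.attempt { println("computing"); 42 }

val once: Task[(Int, Int)] =
  for {
    memoized <- expensive.memoize // a new effect that caches its own result
    a        <- memoized          // prints "computing"
    b        <- memoized          // served from the cache, prints nothing
  } yield (a, b)</code></div>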
<p>
Let's continue anyway and implement it ourselves.
</p>
<p>
The idea is that method <tt>memoizeBy</tt> first looks in a map using the given key.
If the value is not present, we get the result from the original ZIO and store it in the map.
If the value is present, it is used and the original ZIO is not executed.
</p>
<p>
A map, yes, we also need to give the method a map!
The map might be used and updated concurrently.
I chose to wrap an immutable map in a Ref, but you could also use a <tt>ConcurrentMap</tt>.
</p>
<p>Here we go:</p>
<div class="codeblock"><code class="scala">import zio._
import scala.collection.immutable.Map
implicit class ZioMemoizeBy[R, E, A](zio: ZIO[R, E, A]) {
  def memoizeBy[B](cacheRef: Ref[Map[B, A]], key: B): ZIO[R, E, A] = {
    for {
      cache <- cacheRef.get
      value <- cache.get(key) match {
                 case Some(value) => ZIO.succeed(value)
                 case None        => zio.tap(value => cacheRef.update(_.updated(key, value)))
               }
    } yield value
  }
}
</code></div>
<p>
That is it, just a few lines of code to put in some corner of your application.
</p>
<p>
Here is a full example with <tt>memoizeBy</tt> using <a href="https://zio.dev/guides/migrate/zio-2.x-migration-guide/#service-pattern">Service Pattern 2.0</a>:
</p>
<div class="codeblock"><code class="scala">import org.apache.avro.Schema
import zio._
import scala.collection.immutable.Map
trait SchemaFetcher {
  def fetchSchema(schemaId: String): Task[Schema]
}

object SchemaFetcherLive {
  val layer: ZLayer[Any, Throwable, SchemaFetcher] = ZLayer {
    for {
      // Create the Ref and the Map:
      schemaCacheRef <- Ref.make(Map.empty[String, Schema])
    } yield SchemaFetcherLive(schemaCacheRef)
  }
}

case class SchemaFetcherLive(
  schemaCache: Ref[Map[String, Schema]]
) extends SchemaFetcher {
  def fetchSchema(schemaId: String): Task[Schema] = {
    val fetchFromRegistry: Task[Schema] = ???
    // Use memoizeBy to make fetchFromRegistry more efficient!
    fetchFromRegistry.memoizeBy(schemaCache, schemaId)
  }
}
</code></div>
<p><b>Discussion</b></p>
<p>
Note how we're using the default immutable Map.
Because it is immutable, all threads can read from the map at the same time without synchronization.
We only need some synchronization using Ref, to atomically replace the map after a new element was added.
</p>
<p>
When two requests for the same key come in at roughly the same time, both are executed, and both lead to an update of the map.
This is not as advanced as e.g. <a href="https://zio.dev/ecosystem/officials/zio-cache/">zio-cache</a>, which detects multiple simultaneous requests for the same key.
In the presented use case this is not a problem and very unlikely to happen often anyway.
</p>
<p>
Can we improve further?
Yes, we can!
If you look at method <tt>fetchSchema</tt> in the example, you see that a <tt>fetchFromRegistry</tt> ZIO is constructed, but we do not use it when the value is already present.
And even worse, the value already being present is the common case!
This is not very efficient.
If efficiency is a problem, another API is needed.
Zio-cache does not have this problem.
In zio-cache the cache is aware of how to look up new values (it is a <a href="https://github.com/google/guava/wiki/CachesExplained#from-a-cacheloader">loading cache</a>).
So here is a trade off: efficiency with a more complex API, or highly readable code.
</p>
<p><b>Using zio-cache</b></p>
<p>
For completeness, here is (almost) the same example using zio-cache:
</p>
<div class="codeblock"><code class="scala">import org.apache.avro.Schema
import zio._
import zio.cache.{Cache, Lookup}
trait SchemaFetcher {
  def fetchSchema(schemaId: String): Task[Schema]
}

object ZioCacheSchemaFetcherLive {
  val layer: ZLayer[SomeService, Throwable, SchemaFetcher] = ZLayer {
    for {
      someService <- ZIO.service[SomeService]
      // the fetching logic can use someService:
      fetchFromRegistry: String => Task[Schema] = ???
      // create the cache:
      cache <- Cache.make(
                 capacity = 1000,
                 timeToLive = Duration.Infinity,
                 lookup = Lookup(fetchFromRegistry)
               )
    } yield ZioCacheSchemaFetcherLive(cache)
  }
}

case class ZioCacheSchemaFetcherLive(
  cache: Cache[String, Throwable, Schema]
) extends SchemaFetcher {
  def fetchSchema(schemaId: String): Task[Schema] = {
    // use the loader cache:
    cache.get(schemaId)
  }
}</code></div>
<p>
We now need a reference to <tt>fetchFromRegistry</tt> while constructing the layer.
This complicates the code a bit; we can no longer define <tt>fetchFromRegistry</tt> in the case class.
In the example we pull in a <tt>SomeService</tt> so that we can put the definition of <tt>fetchFromRegistry</tt> into the for comprehension and stick to Service Pattern 2.0.
Perhaps we should completely move it to another service so that we can write <tt>lookup = Lookup(someService.fetchFromRegistry)</tt>.
That, I'll leave as an exercise to the reader.
</p>
<p><b>Conclusion</b></p>
<p>
For simple use cases like fetching Avro schemas, this article presents an appropriately lightweight way to do memoization.
If you need more features such as eviction and detection of concurrent invocations, I recommend zio-cache.
</p>
<h4>Update 2024-01-24</h4>
<p>Here is a version of <tt>memoizeBy</tt>, here called <tt>cachedBy</tt>, that only fetches a value once, even when two fibers request it concurrently.
The second fiber is semantically blocked until the first fiber has produced the value.</p>
<div class="codeblock"><code class="scala">import zio._
object ZioCaching {
  implicit class ZioCachedBy[R, E, A](zio: ZIO[R, E, A]) {
    def cachedBy[B](cacheRef: Ref[Map[B, Promise[E, A]]], key: B): ZIO[R, E, A] = {
      for {
        newPromise <- Promise.make[E, A]
        actualPromise <- cacheRef.modify { cache =>
                           cache.get(key) match {
                             case Some(existingPromise) => (existingPromise, cache)
                             case None                  => (newPromise, cache + (key -> newPromise))
                           }
                         }
        _ <- ZIO.when(actualPromise eq newPromise) {
               zio.intoPromise(newPromise)
             }
        value <- actualPromise.await
      } yield value
    }
  }
}</code></div>
<h2>Zigzag bytes (2022-06-08)</h2>
<p>I was playing around with a goofy idea for which I needed <a href="https://en.wikipedia.org/wiki/Variable-length_quantity#Zigzag_encoding">zigzag encoding</a> for <i>bytes</i>. Zigzag encoding is often used in combination with variable length encoding in things like Avro, Thrift and Protobuf.</p>
<p>In zigzag encoded integers, the least significant bit is used for sign. To convert from regular encoding (2-complement) to zigzag (and back) you can use the following Scala code:</p>
<div class="codeblock"><code class="scala">def i32ToZigZag(n: Int): Int = (n << 1) ^ (n >> 31)
def zigZagToI32(n: Int): Int = (n >>> 1) ^ - (n & 1)
def i64ToZigZag(n: Long): Long = (n << 1) ^ (n >> 63)
def zigZagToI64(n: Long): Long = (n >>> 1) ^ - (n & 1)
</code></div>
<p>Translate this to Java and the expressions after the <tt>=</tt> look exactly the same.</p>
<p>Using these bit shifting tricks for bytes is a whole lot more difficult. The problem is that Scala (like Java) does not support bit operations on <tt>Byte</tt>s. They always convert them to an <tt>Int</tt> first.</p>
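<p>A quick illustration of the problem:</p>
<div class="codeblock"><code class="scala">val n: Byte = -1
val shifted = n << 1 // n is widened first; the result is the Int -2, not a Byte
</code></div>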
<p>After a lot of fiddling, I settled on the following:</p>
<div class="codeblock"><code class="scala">private def b(i: Int): Byte = (i & 0xff).toByte
def i8ToZigZag(n: Byte): Byte = (b(n << 1) ^ (n >> 7)).toByte
def zigZagToI8(n: Byte): Byte = b(((n & 0xff) >>> 1) ^ (256 - (n & 1)))
</code></div>
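<p>A quick sanity check (a sketch) that the byte variants round-trip and zigzag as expected:</p>
<div class="codeblock"><code class="scala">(Byte.MinValue.toInt to Byte.MaxValue.toInt).map(_.toByte).foreach { n =>
  assert(zigZagToI8(i8ToZigZag(n)) == n)
}
assert(i8ToZigZag(0) == 0)  // 0 encodes to 0
assert(i8ToZigZag(-1) == 1) // -1 encodes to 1
assert(i8ToZigZag(1) == 2)  // 1 encodes to 2
</code></div>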
<p>Translated to Java it should look like this (not tested!):</p>
<div class="codeblock"><code class="java">private byte b(int i) { return (byte)(i & 0xff); }
public byte i8ToZigZag(byte n) { return (byte)(b(n << 1) ^ (n >> 7)); }
public byte zigZagToI8(byte n) { return b(((n & 0xff) >>> 1) ^ (256 - (n & 1))); }
</code></div>
<p>Is there a better way to do this?</p>
<h2>Upgrading Libreoffice with Homebrew (2022-03-12)</h2>
<p><b>Update 2023-06-06:</b> brew now asks for your password so it can install everything directly. Much better!!</p>
<p>The text below is no longer applicable and is only kept as a reference.</p>
<hr/>
<p>Reminder to self: this is the procedure to upgrade Libreoffice with Homebrew:
<ol>
<li><tt>brew update</tt></li>
<li><tt>brew upgrade</tt></li>
<li><tt>open -a /Applications/LibreOffice.app</tt></li>
<li>Quit the application</li>
<li><tt>brew reinstall libreoffice-language-pack</tt>, enter your password</li>
<li><tt>open "/usr/local/Caskroom/libreoffice-language-pack/$(cd /usr/local/Caskroom/libreoffice-language-pack; ls -1 | sort -rV | head)/LibreOffice Language Pack.app"</tt>, click 'Ok', click 'Ok'</li>
</ol>
</p>
<p>Please anyone, please make this simpler...</p>
<p><b>Update 2021-03-24</b></p>
<p>Here is a script to remove most of the manual toil:</p>
<div class="codeblock"><code class="shell">#!/bin/bash
echo "Initiating Libre Office upgrade"
brew update
brew upgrade libreoffice
open -g -a /Applications/LibreOffice.app
echo "Wait until the application completed startup (it is started in the background)"
read -p "Press enter to quit LibreOffice"
osascript -e 'quit app "LibreOffice"'
APP=$(brew reinstall libreoffice-language-pack | tee /dev/tty | grep "LibreOffice Language Pack.app" | xargs)
open -a "$APP"
echo "The language pack installer has been opened."</code></div>
<h2>Having fun with Ordering in Scala (2022-01-26)</h2>
<p><b>Challenge:</b> sort a list of objects by name, but some names have priority. If these names appear, they should be ordered by the position they have in the priority list.</p>
<p>For example:</p>
<div class="codeblock"><code class="scala">val priorityList = Seq("Willow", "James", "Ezra")
val input = Seq("Olivier", "Charlotte", "Willow", "Declan", "Aurora", "Ezra")
val ordered = ???
assert(ordered == Seq("Willow", "Ezra", "Aurora", "Charlotte", "Declan", "Olivier"))
</code></div>
<p>Challenge accepted.</p>
<p>Luckily Scala has strong support for sorting in the standard library. All sequences have a <tt>sorted</tt> method which accepts an <tt>Ordering</tt>. The ordering is an implicit parameter which means that normally we don't need to provide it; it will default to the natural ordering of the items. However, we are going to provide this parameter explicitly. Let's build an <tt>Ordering</tt>!</p>
<p>The challenge explains we have 2 orderings:
<ol>
<li>first order by the priority list</li>
<li>failing that, order by alphabet</li>
</ol>
</p>
<p>Let's focus on the first ordering. The idea is to assign an integer 'priority-value' to each possible string that is based on the position in the priority list. If the string is not in the list, we use some high integer. The first ordering will simply order by this priority-value. Lower numbers go before higher numbers, just like the natural ordering of integers.</p>
<div class="codeblock"><code class="scala">// attempt 1
val priorityValue = priorityList.indexOf(name)
</code></div>
<p>This works well for any name on the priority list. E.g. <tt>Willow</tt> gets <tt>0</tt> and <tt>Ezra</tt> gets <tt>2</tt>. Unfortunately, all the other names get priority value <tt>-1</tt> which orders them even <em>before</em> <tt>Willow</tt>. We need to convert the <tt>-1</tt> to something higher.<br>
Since I like to program without <tt>if</tt> statements whenever possible, I looked at math for a solution. Modulus can do the trick:</p>
<div class="codeblock"><code class="scala">// attempt 2
val priorityValue = priorityList.indexOf(name) % priorityList.size
</code></div>
<p>Oops, wrong modulus implementation: <tt>-1 % 3 == -1</tt>. Let's use <tt>floorMod</tt>:</p>
<div class="codeblock"><code class="scala">// attempt 3
val priorityValue = Math.floorMod(priorityList.indexOf(name), priorityList.size)
</code></div>
<p>Now <tt>-1</tt> gets converted into <tt>priorityList.size - 1</tt>, which unfortunately collides with the priority-value of the last entry of the priority list. Since we don't really care what the higher number is, as long as it is higher than any real index, we can simply use <tt>Int.MaxValue</tt> as the modulus; <tt>-1</tt> then becomes <tt>Int.MaxValue - 1</tt>:</p>
<div class="codeblock"><code class="scala">val priorityValue = Math.floorMod(priorityList.indexOf(name), Int.MaxValue)
</code></div>
<p>Now we wrap that in an <tt>Ordering</tt>:</p>
<div class="codeblock"><code class="scala">val priorityOrdering: Ordering[String] =
Ordering.by(name => Math.floorMod(priorityList.indexOf(name), Int.MaxValue))
</code></div>
<p>Unfortunately Scala can't infer the type here. We either need to annotate the value directly, or add a type to the <tt>name</tt> parameter.</p>
<p>Now we need to add the second ordering. We can simply use <tt>Ordering.String</tt> from the library. We combine the orderings with <tt>orElse</tt>, available since Scala 2.13. The second ordering is used when <tt>priorityOrdering</tt> can't decide because the priority-value is the same.<br>
Note that for the special case where we compare a priority value with itself, e.g. <tt>Willow</tt> with <tt>Willow</tt>, the second ordering is also applied. This is okay though, the outcome doesn't change because these values are the same for the alphabetic ordering also.</p>
<p>Here is the complete code:</p>
<div class="codeblock"><code class="scala">val priorityList = Seq("Willow", "James", "Ezra")
val priorityOrdering: Ordering[String] =
Ordering.by(name => Math.floorMod(priorityList.indexOf(name), Int.MaxValue))
val combinedOrdering: Ordering[String] =
priorityOrdering.orElse(Ordering.String)
val input = Seq("Olivier", "Charlotte", "Willow", "Declan", "Aurora", "Ezra")
val ordered = input.sorted(combinedOrdering)
assert(ordered == Seq("Willow", "Ezra", "Aurora", "Charlotte", "Declan", "Olivier"))
</code></div>
<p>We're almost there. The challenge was to work on any object. Let's wrap it up a bit and also make it work for any type of priority value:</p>
<div class="codeblock"><code class="scala">def priorityOrdering[A, B : Ordering](priorityList: Seq[B], by: A => B): Ordering[A] = {
def priorityValue(b: B): Int = Math.floorMod(priorityList.indexOf(b), Int.MaxValue)
Ordering.by[A, Int](a => priorityValue(by(a))).orElseBy(by)
}
</code></div>
<p>We can use it like this:</p>
<div class="codeblock"><code class="scala">val ordered = input.sorted(priorityOrdering[String, String](priorityList, identity))
</code></div>
<p>Or like this. Here we sort persons by birthdate, ordering today before the other days:</p>
<div class="codeblock"><code class="scala">case class Person(name: String, birthdate: MonthDay)
val persons: Seq[Person] = ???
val priorityDates = Seq(MonthDay.now())
persons.sorted(priorityOrdering(priorityDates, (_: Person).birthdate))
</code></div>
<p><b>Some remarks</b></p>
<p>Note that in all these cases type inference is quite awful. The compiler has problems finding the correct types, even though all the information is available.</p>
<p>You should also know that you can't use this approach if you are stuck on Scala 2.12 or earlier since <tt>Ordering.orElse</tt> is not available there.</p>
<p><b>Alternative</b></p>
<p>You can side-step all the type derivation problems by using the <tt>sortBy</tt> method. Give it a function that returns a tuple of <tt>Int</tt>s, <tt>String</tt>s, or anything for which an <tt>Ordering</tt> is already defined. The sequence is then sorted on the first value of the tuple, then on the second value, etc.:</p>
<div class="codeblock"><code class="scala">def priorityValue(name: String): Int =
Math.floorMod(priorityList.indexOf(name), Int.MaxValue)
val ordered = input.sortBy(name => (priorityValue(name), name))
</code></div>
<p><b>Conclusion</b></p>
<p>Although I had fun learning all about <tt>Ordering</tt>, next time I'll avoid it and go directly for <tt>sortBy</tt>.</p>
<h2>From Adoptopenjdk to Temurin on a Mac using Homebrew (2022-01-10)</h2>
<p>Adoptopenjdk joined the Eclipse foundation and renamed their JDK to Temurin. Here are instructions on how to migrate on Macs with Homebrew.</p>
<p>The following instructions remove any Adoptopenjdk JDKs you may still have:</p>
<div class="codeblock"><code class="shell">brew remove adoptopenjdk/openjdk/adoptopenjdk8
brew remove adoptopenjdk/openjdk/adoptopenjdk11
brew untap AdoptOpenJDK/openjdk
brew remove adoptopenjdk8
brew remove adoptopenjdk11
brew remove adoptopenjdk
</code></div>
<p>Use <tt>/usr/libexec/java_home -V</tt> to get an overview of any other JDK you may still have. Just delete what you don't need any more.</p>
<p>Then install Temurin 8, 11 and 17. The first command (<tt>brew tap …</tt>) is only needed in case you need Temurin 8 or 11:</p>
<div class="codeblock"><code class="shell">brew tap homebrew/cask-versions
brew install --cask temurin8
brew install --cask temurin11
brew install --cask temurin
</code></div>
<p>Bonus: execute the following to define aliases that let you easily switch between Java versions:</p>
<div class="codeblock"><code class="shell">cat <<-EOF >> ~/.zshrc
# Aliases for switching java version
alias java17="export JAVA_HOME=\$(/usr/libexec/java_home -v 17)"
alias java11="export JAVA_HOME=\$(/usr/libexec/java_home -v 11)"
alias java8="export JAVA_HOME=\$(/usr/libexec/java_home -v 1.8)"
java11
EOF
</code></div>
<p>Are you looking for more power? For example you need to test for many more JDKs? Then maybe <a href="https://sdkman.io/">Sdkman</a> is something for you.</p>
<h2>Customizing the Jitsi Meet UI in a Docker deployment (2021-12-19)</h2>
<p>
I manage a Jitsi instance for a small for-benefit organization. I wanted
to make some changes to the UI to make it visually belong to the organization.
Unfortunately, Jitsi doesn't make it easy to do this. Upon every upgrade
<a href="https://github.com/jitsi/jitsi-meet/issues/7354"
>your changes are gone</a
>. This post describes a workaround for Jitsi deployments that use Docker.
</p>
<p>
Although the details can be hairy, the idea is quite simple. We are going
to put another layer over the provided Docker image called 'web'. The additional layer
contains all the changes we need. When Jitsi publishes an update, we just
apply the changes again as part of the deployment process.
</p>
<p>
Our starting point is the <tt>docker-compose.yaml</tt> provided by
<a href="https://docs.easyjitsi.com/docs/docker">Jitsi</a>. Make all
the changes as instructed. However, before we make any changes to the UI,
you should make sure your Jitsi instance is working.
</p>
<p>It's working? Congratulations! Start with a small change to your <tt>docker-compose.yaml</tt>.<br>Replace:</p>
<div class="codeblock"><code class="yaml"> web:
image: jitsi/web:latest</code>
</div>
<p>with:</p>
<div class="codeblock"><code class="yaml"> web:
build: ./jitsi-web</code>
</div>
<p>This sets you up for building your own Docker image.</p>
<p>Create the <tt>jitsi-web</tt> directory, and put all the artwork
you want to override in it. You should end up with a directory
structure like this (more details follow):</p>
<div class="separator" style="clear: both;"><a href="https://www.grons.nl/day2daystuff/20211219_jitsi-docker-overrides.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="Directory structure" border="0" width="320" data-original-height="642" data-original-width="658" src="https://www.grons.nl/day2daystuff/20211219_jitsi-docker-overrides.png"/></a></div>
<p>The <tt>Dockerfile</tt> in the <tt>jitsi-web</tt> initially has just one line like this:</p>
<div class="codeblock"><code class="plaintext">FROM jitsi/web</code></div>
<p>Build the image and deploy it with:</p>
<div class="codeblock"><code class="shell">docker-compose build --pull
docker-compose up -d
</code></div>
<p>Make sure that Jitsi is still working.</p>
<p>Now it is your turn to get creative. With some <tt>RUN</tt>
instructions you can change any file in the base image.</p>
<p>To get you started, I'll show what is in my <tt>Dockerfile</tt>. Details are discussed directly below:</p>
<div class="codeblock"><code class="plaintext">FROM jitsi/web
# Add wasm mime type
# https://community.jitsi.org/t/loading-wasm-webassembly-file-on-jitsimeetjs/68071/3
RUN sed -i '/}/i \
application/wasm wasm;' /etc/nginx/mime.types
# Replace/add some images
COPY --chown=root:root overrides /usr/share/jitsi-meet/
RUN sed -i "s|\(// \)\?defaultLanguage:.*|defaultLanguage: 'nl',|" /defaults/config.js; \
sed -e 's/welcome-background.png/welcome-background.jpg/' \
-e 's|.deep-linking-mobile .header{width:100%;height:70px;background-color:#f1f2f5;text-align:center}|.deep-linking-mobile .header{width:100%;height:70px;background-color:#003867;text-align:center}|' \
-e 's|.deep-linking-mobile .header .logo{margin-top:15px;margin-left:auto;margin-right:auto;height:40px}|.deep-linking-mobile .header .logo{margin-top:10px;margin-left:auto;margin-right:auto;height:50px}|' \
-i /usr/share/jitsi-meet/css/all.css; \
sed -e 's|"headerTitle": "Jitsi Meet"|"headerTitle": "Mijn Organisatie"|' \
-e 's|"headerSubtitle": "Veilige vergaderingen van hoge kwaliteit"|"headerSubtitle": "Wij vergaderen online!"|' \
-i /usr/share/jitsi-meet/lang/main-nl.json; \
sed -e 's|C().createElement("h1",{className:"header-text-title"},t("welcomepage.headerTitle"))|C().createElement("h1",{className:"header-text-title"},C().createElement("img",{src:"images/logo-deep-linking.png",alt:"Mijn Organisatie",height:100}))|' \
-e 's|"headerTitle":"Jitsi Meet"|"headerTitle":"Mijn Organisatie"|' \
-e 's|"headerSubtitle":"Secure and high quality meetings"|"headerSubtitle":"Wij vergaderen online!"|' \
-i /usr/share/jitsi-meet/libs/app.bundle.min.js; \
sed -e "s|\bAPP_NAME: .*|APP_NAME: 'Mijn Organisatie Jitsi',|" \
-e "s|\bPROVIDER_NAME: .*|PROVIDER_NAME: 'Mijn Organisatie Cloud',|" \
-e "s|\bDEFAULT_REMOTE_DISPLAY_NAME: .*|DEFAULT_REMOTE_DISPLAY_NAME: 'Gespreksgenoot',|" \
-e "s|\bDEFAULT_LOCAL_DISPLAY_NAME: .*|DEFAULT_LOCAL_DISPLAY_NAME: 'Ik',|" \
-e "s|\bGENERATE_ROOMNAMES_ON_WELCOME_PAGE: .*|GENERATE_ROOMNAMES_ON_WELCOME_PAGE: false,|" \
-i /defaults/interface_config.js
</code></div>
<p>The first <tt>RUN</tt> instruction adds a line to the Nginx configuration which enables clients to download Wasm files.
Unfortunately this is not yet fixed in Jitsi docker itself (checked in December 2021, but YMMV).</p>
<p>The <tt>COPY</tt> instruction copies your images over the existing stuff. Feel free to add more as needed.</p>
<p>The second <tt>RUN</tt> instruction is where the magic happens. This changes existing files. Let's go through them one by one.</p>
<p>The first file that gets changed is <tt>/defaults/config.js</tt>, where we
set the default language to Dutch.
</p>
<p>The next file that gets changed is <tt>/usr/share/jitsi-meet/css/all.css</tt>.
Normally Jitsi uses a <tt>png</tt> background image on the welcome page but I needed to use a <tt>jpg</tt>
image. The first line takes care of that.
Note that there is no <tt>welcome-background.jpg</tt> image in the base image, but I added it in the
<tt>overrides/images</tt> directory.<br>
The next 2 changes for this file are some small color and layout changes to the welcome page for mobile browsers.
</p>
<p>The next file that gets changed is <tt>/usr/share/jitsi-meet/lang/main-nl.json</tt>. There are many more files
in this directory, one for each language.
</p>
<p>The next one, file <tt>/usr/share/jitsi-meet/libs/app.bundle.min.js</tt> is tricky. This file contains
a fully compiled React application in minified Javascript. The first change you see here replaces the header text with
a header image. The next two lines replace the default titles with the Dutch version. For some reason
Jitsi initially renders the page in English and then re-renders it in the correct locale. On slow devices
this can take quite some time. I found this quite disturbing, especially for the texts that make your first
impression. By changing some default texts, most of my users (who are Dutch) will see fewer flapping texts.<br>
This is the file that is most sensitive to changes in the base image. Make sure your tweaks still work after an
upgrade.
</p>
<p>Finally, in <tt>/defaults/interface_config.js</tt> some more settings are tweaked.</p>
<b>Some more tips</b>
<p>Don't worry if you break something. Just fix your changes and re-deploy. A re-deploy is very quick.</p>
<p>Finding out what to change can be pretty hard. Sometimes it helps to extract the file
from the image to see what it contains. First find the file you want to change by opening a
shell in the base image:</p>
<div class="codeblock"><code class="shell">docker-compose exec web bash</code></div>
<p>Extract the file for more detailed inspection with something like this:</p>
<div class="codeblock"><code class="shell">docker-compose exec web cat /usr/share/jitsi-meet/libs/app.bundle.min.js > app.bundle.min.js</code>
</div>
<p><b>Jitsi updates</b></p>
<p>When you see new images appear at <a href="https://hub.docker.com/u/jitsi">Jitsi on docker hub</a>
you can deploy them as follows:</p>
<div class="codeblock"><code class="shell"># Pulls the images that we're not changing (e.g. prosody, jicofo and jvb):
docker-compose pull
# Rebuild the 'web' image, checking for a new base image:
docker-compose build --pull
# Deploy changes:
docker-compose up -d
# Remove old images:
docker image prune
</code></div>
<p>Most of the things that were tweaked here were pretty stable over the last years. But I advise you to check anyway.</p>
<p>That's it, go creative!</p>
<h2>Akka graceful shutdown - continued (2021-12-08)</h2>
<p><a href="https://day-to-day-stuff.blogspot.com/2020/04/akka-http-graceful-shutdown.html">Some time ago</a> I wrote on how to gracefully shut down Akka HTTP servers, crucial to prevent aborted requests during re-deployments or in elastic (cloud) environments where instances come and go. Look at the <a href="https://day-to-day-stuff.blogspot.com/2020/04/akka-http-graceful-shutdown.html">previous post</a> for more details on how graceful shutdown works and some common caveats in setting it up.</p>
<p>This post refreshes how this works for newer Akka versions, and it gives some tips on how to speed up a shutdown.</p>
<b>Coordinated shutdown</b>
<p>Newer Akka versions have more extensive support for graceful shutdown in the form of <a href="https://doc.akka.io/docs/akka/current/coordinated-shutdown.html">coordinated shutdown</a>. Here we show an example that uses coordinated shutdown to configure a graceful shutdown.</p>
<div class="codeblock"><code class="scala">
import akka.actor.{ActorSystem, CoordinatedShutdown}
import akka.http.scaladsl.Http
import scala.concurrent.duration._
import scala.util.{Failure, Success}
implicit val system: ActorSystem = ???
val logger = ???
val routes: Route = ???
val interface: String = "0.0.0.0"
val port: Int = 80
val shutdownDeadline: FiniteDuration = 5.seconds
Http()
.newServerAt(interface, port)
.bind(routes)
.map(_.addToCoordinatedShutdown(httpShutdownTimeout)) // â that simple!
.foreach { server =>
logger.info(s"HTTP service listening on: ${server.localAddress}")
server.whenTerminationSignalIssued.onComplete { _ =>
logger.info("Shutdown of HTTP service initiated")
}
server.whenTerminated.onComplete {
case Success(_) => logger.info("Shutdown of HTTP endpoint completed")
case Failure(_) => logger.error("Shutdown of HTTP endpoint failed")
}
}
</code></div>
<p>The important line is where we use <tt>addToCoordinatedShutdown</tt>.
What follows is just logging so we know what's going on.</p>
<b>Shutting down more components</b>
<p>You probably have more parts that would benefit from a proper shutdown, e.g. a database connection pool.
Here is an example on how to hook into the coordinated shutdown:
</p>
<div class="codeblock"><code class="scala">// Add this code _before_ construction of the HTTP server
CoordinatedShutdown(system).addTask(
CoordinatedShutdown.PhaseBeforeClusterShutdown,
"database connection pool shutdown"
) { () =>
val dbPoolShutdown: Future[Done] = shutdownDatabase()
dbPoolShutdown.onComplete {
case Success(_) =>
logger.info("Shutdown of database connection pool completed")
case Failure(_) =>
logger.error("Shutdown of database connection pool failed")
}
logger.info("Shutdown of database connection pool was initiated")
dbPoolShutdown
}
</code></div>
<p>Coordinated shutdown consists of multiple phases. Which phases exist is configurable but the default phases are fine for this post.</p>
<p>Our code runs in the phase called <tt>before-cluster-shutdown</tt>.
This phase runs after phase <tt>service-unbind</tt> in which the HTTP service shuts down.</p>
<p><b>Tips to speed up shutdown</b></p>
<p>The default phases need to complete in 10 seconds. If this is challenging for your system, here are 2 tips that might help.</p>
<p>First of all, you need to make sure that any blocking/synchronous code is wrapped in a <tt>blocking</tt>
construct. This will signal to the Akka execution context that it needs to extend its threadpool. This
is especially relevant if you have many shutdown tasks but it is good practice anyway.
For example:</p>
<div class="codeblock"><code class="scala">import akka.Done
import scala.concurrent.blocking
def shutdownDatabase: Future[Done] = {
Future {
blocking {
database.close() // blocking code
logger.info("Database connection closed")
}
Done
}
}
</code></div>
<p>The second thing you could do is only shut down on a best-effort basis. Closing the connection to a
read-only system is probably not essential.<br>
The trick is to use coordinated shutdown as the initiator, but then immediately report completion. For example:</p>
<div class="codeblock"><code class="scala">import akka.Done
import scala.concurrent.blocking
def bestEffortShutdownDatabase: Future[Done] = {
// Starts db close in a Future, but ignore the result
Future {
blocking {
database.close()
}
}
Future.successful(Done)
}
</code></div>
<p>Now, when closing the database takes too long, Akka won't complain about this in the logs.</p>
<p>For more details see Akka's <a href="https://doc.akka.io/docs/akka/current/coordinated-shutdown.html">coordinated shutdown</a> documentation.</p>
<h2>Avoiding Scala's Option.fold (2020-07-07)</h2>
<p>Scala's <tt>Option.fold</tt> is a bit nasty to use sometimes because type inference is not always great. Here is a stupid toy example:</p>
<div class="codeblock"><code class="scala">val someValue: Option[String] = ???
val processed = someValue.fold[Either[Error, UserName]](
  Left(Error("No user name given"))
)(
  value => Right(UserName(value))
)
</code></div>
<p>The <tt>[Either[Error, UserName]]</tt> on <tt>fold</tt> is necessary otherwise the scala compiler can not derive the type.</p>
<p>Here is a really small trick to avoid <tt>Option.fold</tt> when you need to convert an <tt>Option</tt> to an <tt>Either</tt>:</p>
<div class="codeblock"><code class="scala">val someValue: Option[String] = ???
val processed = someValue
  .toRight(left = Error("No user name given"))
  .map(value => UserName(value))
</code></div>
<p>Much nicer!</p>
<h2>Traefik v2 enable HSTS, Docker and nextcloud (2020-04-15)</h2>
<p>It took me days to figure out how to configure Traefik v2. Here it is for posterity.</p>
<p>This is a docker-compose.yaml fragment to append to a service section:</p>
<div class="codeblock"><code class="yaml"> labels:
- "traefik.enable=true"
- "traefik.http.routers.service.rule=Host(`www.example.com`)"
- "traefik.http.routers.service.entrypoints=websecure"
- "traefik.http.routers.service.tls.certresolver=myresolver"
- "traefik.http.middlewares.servicests.headers.stsincludesubdomains=false"
- "traefik.http.middlewares.servicests.headers.stspreload=true"
- "traefik.http.middlewares.servicests.headers.stsseconds=31536000"
- "traefik.http.middlewares.servicests.headers.isdevelopment=false"
- "traefik.http.routers.service.middlewares=servicests"
</code></div>
<p>It will:
<ul>
<li>tell Traefik to direct traffic for <tt>www.example.com</tt> to this container,</li>
<li>on the <tt>websecure</tt> entrypoint (this is configured statically),</li>
<li>using the <tt>myresolver</tt> (for Acme, resolver also configured statically),</li>
<li>configure middleware to add HSTS headers,</li>
<li>enable the middleware.</li>
</ul>
</p>
<b>Nextcloud</b>
<p>Here is a slightly more complex example for a nextcloud deployment which includes the recommended redirects.</p>
<div class="codeblock"><code class="yaml"> labels:
- "traefik.enable=true"
- "traefik.http.routers.nextcloud.rule=Host(`nextcloud.example.com`)"
- "traefik.http.routers.nextcloud.entrypoints=websecure"
- "traefik.http.routers.nextcloud.tls.certresolver=myresolver"
- "traefik.http.middlewares.nextcloudredir.redirectregex.permanent=true"
- "traefik.http.middlewares.nextcloudredir.redirectregex.regex=https://(.*)/.well-known/(card|cal)dav"
- "traefik.http.middlewares.nextcloudredir.redirectregex.replacement=https://$$1/remote.php/dav/"
- "traefik.http.middlewares.nextcloudsts.headers.stsincludesubdomains=false"
- "traefik.http.middlewares.nextcloudsts.headers.stspreload=true"
- "traefik.http.middlewares.nextcloudsts.headers.stsseconds=31536000"
- "traefik.http.middlewares.nextcloudsts.headers.isdevelopment=false"
- "traefik.http.routers.nextcloud.middlewares=nextcloudredir,nextcloudsts"
</code></div>Erik van Oostenhttp://www.blogger.com/profile/15976519439979651010noreply@blogger.com4tag:blogger.com,1999:blog-27876765.post-91682011202510159502020-04-10T17:03:00.001+02:002021-12-08T14:40:04.522+01:00Akka-http graceful shutdown<b>Why?</b>
<p>By default, when you restart a service, the old instance is simply killed. This means that all current requests are aborted; the caller will be left with a read timeout. We can do better!</p>
<b>What?</b>
<p>A graceful shutdown looks as follows:</p>
<ol>
<li>The scheduler (Kubernetes, Nomad, etc.) sends a signal (usually SIGINT) to the service.</li>
<li>The service gets the signal and closes all server ports; it can no longer receive new requests. This is picked up very quickly by the load-balancer, which will stop sending new requests.</li>
<li>All requests-in-progress complete one by one.</li>
<li>When all requests are completed, or on a timeout, the service terminates.</li>
</ol>
<b>Caveats</b>
<p>Getting the signal to your service is unfortunately not always trivial. I have seen the following problems:</p>
<ul>
<li>The Nomad scheduler by default does not send a SIGINT signal to the service. You will have to configure this.</li>
<li>When the service runs in a Docker container, by default the init process (with PID 1) will ignore the signal. Back when every Unix installation had control over the entire computer this made lots of sense; in a container, not so much. This may be fixed in newer Docker versions. Otherwise you will have to use a special init process such as <a href="https://github.com/krallin/tini">tini</a>.</li>
</ul>
<b>Akka-HTTP</b>
<p>Akka-http has excellent support for graceful shutdown. Unfortunately, the documentation is not very clear about it. Here is an example that can be used as a template:</p>
<p><b>Update 2021-12-08:</b> For newer Akka versions, please use the template in the <a href="https://day-to-day-stuff.blogspot.com/2021/12/akka-graceful-shutdown-continued.html">follow-up article</a>.</p>
<p>Just for reference, here is the old template:</p>
<div class="codeblock"><code class="scala">
import akka.actor.ActorSystem
import akka.http.scaladsl.Http
import akka.http.scaladsl.server._
import akka.stream.ActorMaterializer
import scala.concurrent.duration._
import scala.util.Failure
// The actor system, materializer (needed on older Akka versions)
// and dispatcher that the template below relies on.
implicit val system: ActorSystem = ActorSystem("service")
implicit val materializer: ActorMaterializer = ActorMaterializer()
import system.dispatcher
val logger = ???
val route: Route = ???
val interface: String = "0.0.0.0"
val port: Int = 80
val shutdownDeadline: FiniteDuration = 30.seconds
// Don't use this, see follow-up article instead!
Http()
.bindAndHandle(route, interface, port)
.map { binding =>
logger.info(
"HTTP service listening on: " +
s"http://${binding.localAddress.getHostName}:${binding.localAddress.getPort}/"
)
sys.addShutdownHook {
binding
.terminate(hardDeadline = shutdownDeadline)
.onComplete { _ =>
system.terminate()
logger.info("Termination completed")
}
logger.info("Received termination signal")
}
}
.onComplete {
case Failure(ex) =>
logger.error("server binding error:", ex)
system.terminate()
sys.exit(1)
case _ =>
}
</code></div>
Erik van Oostenhttp://www.blogger.com/profile/15976519439979651010noreply@blogger.com0tag:blogger.com,1999:blog-27876765.post-84202442427967265842020-03-03T10:56:00.000+01:002020-03-05T15:06:03.486+01:00Push Gauges<p>A colleague was complaining to me that <a href="https://micrometer.io/">Micrometer</a> gauges didn't work the way he expected. This led to some interesting work.</p>
<b>What is a gauge?</b>
<p>In science a gauge is a device for making measurements. In computer systems a gauge is very similar: a 'metric' which tracks something in your system over time. For example, you could track the number of items in a job queue. Libraries like Micrometer and <a href="https://metrics.dropwizard.io/">Dropwizard metrics</a> make it easy to define gauges. Since the measurement in itself is not useful, those libraries also make it easy to send the measurements to a metric system such as <a href="https://github.com/graphite-project/graphite-web">Graphite</a> or <a href="https://prometheus.io/">Prometheus</a>. These systems are used for visualization and generating alerts.</p>
<p>Gauges are typically defined with a callback function that does the measurement. For example, using <a href="https://github.com/erikvanoosten/metrics-scala">metrics-scala</a>, the Scala API for Dropwizard metrics, it looks like:</p>
<div class="codeblock"><code class="scala">class JobQueue extends DefaultInstrumented {
private val waitingJobsQueue = ???
// Defines a gauge
metrics.gauge("queue.size") {
// This code block is the callback which does a 'measurement'.
waitingJobsQueue.size
}
}</code></div>
<p>Please note that the metric library determines when the callback function is invoked. For example, once every minute.</p>
<b>What is a push gauge?</b>
<p>My colleague had something else in mind. He didn't have access to the value all the time, but only when something was being processed. More like this:</p>
<div class="codeblock"><code class="scala">class ExternalCacheUpdater extends DefaultInstrumented {
def updateExternalCache(): Unit = {
val items = fetchItemsFromDatabase()
pushItemsToExternalCache(items)
gauge.push(items.size) // Pushes a new measurement to the gauge.
}
}</code></div>
<p>In the example the application becomes responsible for pushing new measurements. The push gauge simply keeps track of the last value and reports that whenever the metrics library needs it. So under the covers the push gauge behaves like a normal gauge.</p>
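<p>To make this concrete, here is a minimal sketch of the idea (hypothetical code, not the actual metrics-scala implementation): the application pushes measurements, and the metrics library polls the gauge like any other.</p>
<div class="codeblock"><code class="scala">import java.util.concurrent.atomic.AtomicReference

// Sketch only: a push gauge simply remembers the last pushed value.
class PushGauge[A](initialValue: A) {
  private val last = new AtomicReference[A](initialValue)

  // Called by the application whenever a new measurement is available.
  def push(value: A): Unit = last.set(value)

  // Called by the metrics library, for example once every minute.
  def getValue: A = last.get()
}</code></div>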
<p>Push gauges like in this example are now made possible in this <a href="https://github.com/erikvanoosten/metrics-scala/pull/233">pull-request for metrics-scala</a>. The only thing that was still missing was the definition of the push gauge:</p>
<div class="codeblock"><code class="scala">class ExternalCacheUpdater extends DefaultInstrumented {
// Defines a push gauge
private val gauge = metrics.pushGauge[Int]("cached.items", 0)
def updateExternalCache(): Unit = // as above
}
</code></div>
<b>Push gauge with timeout</b>
<p>In some situations it may be misleading to report a very old measurement as the 'current' value. If the external cache in our example evicts items after 10 minutes, then the push gauge should not report measurements from more than 10 minutes ago. This is solved with a push gauge with timeout:</p>
<div class="codeblock"><code class="scala">class ExternalCache extends DefaultInstrumented {
// Defines a push gauge with timeout
private val gauge = metrics.pushGaugeWithTimeout[Int]("cached.items", 0, 10.minutes)
def updateExternalCache(): Unit = // as above
}
</code></div>
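<p>As a sketch of how the timeout could work (again hypothetical code, not the actual implementation): remember when the last value was pushed and fall back to the default when that is too long ago.</p>
<div class="codeblock"><code class="scala">import java.util.concurrent.atomic.AtomicReference
import scala.concurrent.duration._

// Sketch only: report the default value when the last push is too old.
class PushGaugeWithTimeout[A](default: A, timeout: FiniteDuration) {
  private val last = new AtomicReference[Option[(A, Long)]](None)

  def push(value: A): Unit = last.set(Some((value, System.nanoTime())))

  def getValue: A = last.get() match {
    case Some((value, pushedAt)) if System.nanoTime() - pushedAt <= timeout.toNanos => value
    case _ => default
  }
}</code></div>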
<b>Feedback wanted!</b>
<p>I have not seen this concept before in any metric library in the JVM ecosystem. Therefore I would like to collect as much feedback as possible before shipping this as a new feature of metrics-scala. If you have any ideas, comments or whatever, please leave a comment on <a href="https://github.com/erikvanoosten/metrics-scala/pull/233">the push-gauges pull-request</a> or drop me an email!</p>
<p><span style="font-weight:bold;">Update 2020-03-05</span>: The code examples have been updated to reflect changes in the pull request.</p>Erik van Oostenhttp://www.blogger.com/profile/15976519439979651010noreply@blogger.com0tag:blogger.com,1999:blog-27876765.post-15002828921243121292017-07-12T11:05:00.004+02:002017-07-12T11:05:59.571+02:00Continuation parsers and encoders<p>
I finally got around to writing about my hack project of last year. It was an exploration of what can be done with continuation parsers and encoders in order to build a very fast, single-copy, asynchronous Thrift implementation.
</p>
<p>
Continuation parsers and encoders try to decode (read)/encode (write) their data <i>directly</i> from/to a network buffer. When a buffer has been fully read/written, they ask for the next network buffer to continue.
</p>
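<p>As a sketch of the idea (assumed types, not the actual thrift-stream API): a parser either completes with a value, or suspends itself as a continuation that asks for the next buffer.</p>
<div class="codeblock"><code class="scala">import java.nio.ByteBuffer

// Sketch only: the outcome of parsing as much as the current buffer allows.
sealed trait ParseResult[+A]
// Parsing finished; `remaining` holds any unconsumed bytes.
final case class Done[A](value: A, remaining: ByteBuffer) extends ParseResult[A]
// The buffer was exhausted mid-value; call `continue` with the next buffer.
final case class NeedMoreData[A](continue: ByteBuffer => ParseResult[A]) extends ParseResult[A]

// Example: parse a single big-endian Int that may be split over buffers.
def parseInt(buffer: ByteBuffer, acc: Array[Byte] = Array.empty): ParseResult[Int] =
  if (buffer.remaining() >= 4 - acc.length) {
    val bytes = new Array[Byte](4 - acc.length)
    buffer.get(bytes)
    Done(ByteBuffer.wrap(acc ++ bytes).getInt, buffer)
  } else {
    val bytes = new Array[Byte](buffer.remaining())
    buffer.get(bytes)
    NeedMoreData(next => parseInt(next, acc ++ bytes))
  }</code></div>
<p>Note that no intermediate assembly buffer is needed: the parser consumes each network buffer in place, which is what makes the single-copy property possible.</p>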
<p>
For more information see the <a href="https://github.com/erikvanoosten/thrift-stream/blob/master/README.md">thrift-stream repository</a> on GitHub.
</p>
Erik van Oostenhttp://www.blogger.com/profile/15976519439979651010noreply@blogger.com0tag:blogger.com,1999:blog-27876765.post-31867658774683348532017-04-18T11:50:00.003+02:002017-04-21T10:16:56.518+02:00Bash history backup<p>I like my bash history, and I proudly have this in my <tt>.bash_profile</tt>:
<div class="codeblock"><code class="sh"># Increase bash history size, append instead of overwrite history on exit
export HISTSIZE=10000000
export HISTCONTROL=erasedups
shopt -s histappend</code></div>
</p>
<p>
However, after reading about <a href="https://undertitled.com/2017/04/12/historian-because-please-stop-deleting-my-bash-history.html">Historian</a> I realised I had no backup. Instead of installing Historian, I decided to take a simpler approach. Here it is:
<div class="codeblock"><code class="sh">#!/bin/bash
set -euo pipefail
IFS=$'\n\t'
#
# Makes a daily backup of your .bash_history, keeps the last 2 backups.
#
BACKUPDIR="${HOME}/.bash_history_backup"
mkdir -p "${BACKUPDIR}"
cp "${HOME}/.bash_history" "${BACKUPDIR}/bash_history_$(date +%Y%m%d)"
for h in $(ls -1r "${BACKUPDIR}"/bash_history_* | tail -n +3); do rm "$h"; done</code></div>
</p>
<p>
Put the above contents in a file somewhere, e.g. in <tt>~/bin/bash_history_backup.sh</tt> and activate it with (note: the link name must not contain a dot, otherwise <tt>run-parts</tt> will silently skip it):
<div class="codeblock"><code class="sh">sudo ln -s ~/bin/bash_history_backup.sh /etc/cron.daily/bash_history_backup</code></div>
or add the following line with the more cumbersome route through <tt>crontab -e</tt>:
<div class="codeblock"><code class="sh">0 0 * * * $HOME/bin/bash_history_backup.sh</code></div>
</p>
<p><b>Update 2017-04-22:</b> The script actually works now :)</p>
<p><b>Update 2017-04-23:</b> The script actually actually works now :)</p>Erik van Oostenhttp://www.blogger.com/profile/15976519439979651010noreply@blogger.com0tag:blogger.com,1999:blog-27876765.post-33247579709704803002016-09-13T10:51:00.000+02:002017-02-05T07:54:10.915+01:00Exploring Rkt on Ubuntu<p>I have been using docker on my home server since 0.4 in 2013. For me the most attractive property of docker is that it provides a way to decrease the amount of stuff one has to install on a server. I have only one server, but it has many different tasks of which some need to be rock solid (my family email) and others are experimental. Containers provide a nice way to clean up experiments. Unfortunately, docker has never been stable. I have had many fights with docker during upgrades and I have never fully understood how docker interacts with the iptables setup from my firewall (Shorewall).</p>
<p>Today, I was trying to run <a href="https://github.com/joeybaker/docker-syncthing">docker-syncthing</a>. Unfortunately, the container went into 'restarting' without any indication of what was wrong. Inspired by <a href="https://jaxenter.com/can-docker-be-ousted-128806.html">"Can docker be ousted"</a> and <a href="https://medium.com/@bob_48171/an-ode-to-boring-creating-open-and-stable-container-world-4a7a39971443#.ocm6bwyz2">boring and stable containers</a> I decided to give up and try something new: <a href="https://coreos.com/rkt/">CoreOS' rkt</a>.</p>
<h3>Installation</h3>
<p>The installation instructions are a bit hidden. <a href="https://coreos.com/rkt/docs/latest/distributions.html">This page</a> refers to the script <tt>install-rkt.sh</tt>. Unfortunately, no link was given. A search on github finally gave <a href="https://github.com/coreos/rkt/blob/master/scripts/install-rkt.sh">the answer</a>. However, in the end, I liked these instructions better: <a href="http://askubuntu.com/a/796148/201096">ask ubuntu: Is it possible to install rkt in Ubuntu?</a>.</p>
<p>Because I wanted to run Syncthing and I only had a Docker recipe, I also needed the tool <a href="https://github.com/appc/docker2aci">docker2aci</a>. The instructions there are clear except that you need to install golang first (docs are fixed now):</p>
<div class="codeblock"><code class="bash">$ sudo apt-get install golang
$ git clone git://github.com/appc/docker2aci
$ cd docker2aci
$ ./build.sh
</code></div>
<p>The <tt>docker2aci</tt> binary is located in the <tt>bin</tt> folder. Put it somewhere so you can find it back.</p>
<h3>Building and converting a docker image</h3>
<p>After building the docker image:</p>
<div class="codeblock"><code class="bash"># docker build -t syncthing .
</code></div>
<p>I had the following:</p>
<div class="codeblock"><code class="bash"># docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
syncthing latest 8ea0931f1196 29 hours ago 197.1 MB
</code></div>
<p>The next step is to <em>fetch</em> the image into the rkt image repository. The <tt>rkt fetch</tt> command can fetch an image directly from a docker repository. However, I found no way to fetch directly from the local docker image repository. This is the workaround:</p>
<div class="codeblock"><code class="bash"># docker save -o syncthing-docker-image.tar syncthing
# docker2aci syncthing-docker-image.tar
# rkt --insecure-options=image fetch syncthing-latest.aci
# rkt image list
ID NAME SIZE IMPORT TIME LAST USED
sha512-8161ad07a42e syncthing:latest 168MiB 7 hours ago 7 hours ago
# rkt image cat-manifest syncthing:latest | less
</code></div>
<p></p>
<h3>Running the image</h3>
<p>Now comes the time to run the image:</p>
<div class="codeblock"><code class="bash"># rkt --insecure-options=image run --net=host \
--dns=$(awk '/nameserver/ {print $2}' < /etc/resolv.conf) \
--volume=volume-srv-config,kind=host,source=/media/nas/syncthing/config,readOnly=false \
--volume=volume-srv-data,kind=host,source=/media/nas/syncthing/data,readOnly=false \
syncthing
</code></div>
<p>Creating that command actually took longer than I expected. Here are the highlights:<ul>
<li>Option <tt>--dns=...</tt> copies the nameserver from your local <tt>/etc/resolv.conf</tt> to the same file inside the container. This makes DNS work inside the container as well. Depending on your DNS setup, you may need to pass more <tt>--dns*</tt> options.</li>
<li>The <tt>--net=host</tt> gives you the easiest access to the network. If you want a bit more security, you will have to dive deeper.</li>
<li>I also tried to add the options <tt>--user=nobody --group=nogroup</tt>. However, that resulted in my container not starting up at all with weird error messages.</li></ul></p>
<p>You now have something like this:</p>
<div class="codeblock"><code class="bash"># rkt list
UUID APP IMAGE NAME STATE CREATED STARTED NETWORKS
17baa16c syncthing syncthing:latest running 1 minute ago 1 minute ago</code></div>
<p></p>
<h3>Inspecting the container</h3>
<p>Inspecting a running container is easy. With <tt>rkt enter</tt> you can directly open a shell in the container:</p>
<div class="codeblock"><code class="bash"># rkt enter 17baa16c
enter: no command specified, assuming "/bin/bash"
root@rkt-5e9ad759-82b4-4b27-b03a-b6b5074b2ac2:/#
</code></div>
<p></p>
<h3>Cleanup</h3>
<p>Every time you start a new container, the old one stays around. With <tt>rkt gc</tt> you clean up containers that stopped some time ago (more than half an hour?). This command should be run from cron:</p>
<div class="codeblock"><code class="bash"># echo -e '#!/bin/sh\nexec rkt gc' > /etc/cron.daily/rkt-gc
# chmod +x /etc/cron.daily/rkt-gc
</code></div>
<p></p>
<h3>Networking</h3>
<p>As a Shorewall user you need to add a rule to open the ports for the syncthing application in <tt>/etc/shorewall/rules</tt> and that's it. This might get a lot more hairy when you are not using host networking.</p>
<h3>Automating startup</h3>
<p>Modern Ubuntu releases use systemd to start applications. <a href="https://coreos.com/rkt/docs/latest/using-rkt-with-systemd.html">Rkt's systemd manual</a> seems well written, but can be used as an introduction at best. Its suggestion to use <tt>systemd-run</tt> failed for me:</p>
<div class="codeblock"><code class="bash"># systemd-run --slice=machine /usr/bin/rkt --insecure-options=image run --net=host --dns=192.168.1.1 --volume=volume-srv-config,kind=host,source=/media/nas/syncthing/config,readOnly=false --volume=volume-srv-data,kind=host,source=/media/nas/syncthing/data,readOnly=false syncthing
Failed to start transient service unit: Cannot set property ExecStart, or unknown property.
</code></div>
<p>Also, it fails to mention where to put the systemd unit file. After lots of reading (in particular the <a href="https://www.digitalocean.com/community/tutorials/how-to-configure-a-linux-service-to-start-automatically-after-a-crash-or-reboot-part-2-reference#systemd-introduction">systemd section of this manual</a>) I created <tt>/etc/systemd/system/rkt-syncthing.service</tt> with the following content:</p>
<div class="codeblock"><code class="bash">[Unit]
Description=Rkt syncthing
Requires=remote-fs.target
After=remote-fs.target
[Service]
Slice=machine.slice
ExecStart=/usr/bin/rkt --insecure-options=image run --net=host --dns=192.168.1.1 --volume=volume-srv-config,kind=host,source=/media/nas/syncthing/config,readOnly=false --volume=volume-srv-data,kind=host,source=/media/nas/syncthing/data,readOnly=false syncthing
KillMode=mixed
Restart=always
[Install]
WantedBy=multi-user.target
</code></div>
<p>And finally:</p>
<div class="codeblock"><code class="bash"># systemctl daemon-reload
# systemctl enable rkt-syncthing.service
# systemctl start rkt-syncthing.service
</code></div>
<p></p>
<h3>Ending thoughts</h3>
<p>Although I got far in a few hours, there are still a few open problems.</p>
<ol>
<li>None of the above software was available as an Ubuntu package. This means I will spend more time keeping everything up to date.</li>
<li>Rkt's documentation is okay but not amazing. For example, hyperlinks are missing in crucial places (see above), and the manual page for <tt>rkt-export</tt> does not tell you how to indicate which container you want to export.</li>
<li>Rkt containers run from an image and store changes as a layer on top. As soon as the container exits, you cannot start it again with those changes. This means that everything that needs to be persisted must be external to the container. For other changes you can run <tt>rkt export</tt> to create a new image, or rebuild your image from scratch.</li>
<li>Having to use Docker to build an image for rkt is a bit weird. The next step is to create the image directly with <a href="https://github.com/appc/acbuild">acbuild</a>.</li>
</ol>
<h3>Updates</h3>
<p><b>2017-02-05</b> Fixed commands for enabling rkt garbage collection via cron.</p>Erik van Oostenhttp://www.blogger.com/profile/15976519439979651010noreply@blogger.com0tag:blogger.com,1999:blog-27876765.post-44562631052606259502016-03-03T21:19:00.000+01:002016-03-03T21:19:32.696+01:00Don't call your state 'state'<p>In the OO world you are frowned upon if you call something <i>object</i>. It's time to extend this principle: don't call your state <i>state</i>. This post is about why this is a bad idea and what you can do about it.</p>
<p>Recently we sat down to discuss the new data model for our messaging system at <a href="http://www.ebayclassifiedsgroup.com/">eBay's Classifieds Group</a>. One of the things we inherited from the past is the entity called <tt>conversation</tt> with a field called <tt>state</tt>. Possible values were <tt>Ok</tt>, <tt>On hold</tt> and <tt>Blocked</tt>.</p>
<p><b>So what was the problem?</b><br/>A field called 'state' almost always has a very intuitive meaning. Unfortunately, the word is so vague that the meaning can easily warp, depending on the problem at hand. I noticed this in a couple of projects: the state field started to collect more and more possible values. With more values came increasingly difficult state transitions. This led to code that was far messier than necessary.</p>
<p>For example, in our conversation entity we could introduce the state <tt>Closed</tt> to indicate that a participant wants to stop the conversation. Then we continue by adding the state <tt>Archived</tt> to indicate that the conversation should be hidden until a new message arrives.</p>
<p><b>What can we do?</b><br/>The key observation is that each state value represents multiple behaviors. Think about it: what behavior is needed in each state? How do these behaviors change for each state? These questions will lead you to <i>multiple fields</i> that together represent the entire state of your entities.</p>
<p>Within a couple of minutes we found three behaviors we wanted to have for our conversations: a conversation is either visible or not (field <tt>visibility</tt> with values <tt>Displayed</tt> and <tt>Hidden</tt>), it will accept new messages or not (field <tt>acceptNew</tt> with values <tt>Accept</tt> and <tt>Reject</tt>), and we want to notify the recipient of a new message or not (field <tt>notifyOnNew</tt> with values <tt>Notify</tt> and <tt>Mute</tt>). Not only did our code become easier to extend and reason about, as a bonus we found a feature that would have been really hard with the old model: muting a conversation. A sketch of this model follows below.</p>
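<p>As a quick sketch in Scala (invented types, following the field names above), the conversation's state then becomes a product of small, explicit behaviors:</p>
<div class="codeblock"><code class="scala">sealed trait Visibility
case object Displayed extends Visibility
case object Hidden extends Visibility

sealed trait AcceptNew
case object Accept extends AcceptNew
case object Reject extends AcceptNew

sealed trait NotifyOnNew
case object Notify extends NotifyOnNew
case object Mute extends NotifyOnNew

case class Conversation(
  visibility: Visibility,
  acceptNew: AcceptNew,
  notifyOnNew: NotifyOnNew
)

// The bonus feature is now a one-field change:
def mute(conversation: Conversation): Conversation =
  conversation.copy(notifyOnNew = Mute)</code></div>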
<p><b>Conclusion</b><br/>Don't call your state 'state'; instead, think about the behaviors each state represents and model those.</p>Erik van Oostenhttp://www.blogger.com/profile/15976519439979651010noreply@blogger.com2tag:blogger.com,1999:blog-27876765.post-13444393872349925332016-02-14T14:08:00.001+01:002016-02-14T14:09:12.806+01:00Generate a certificate signing request (CSR) as a 1 liner<p>As I run my own (secure) web and mail server I frequently have to get certificates. You could run with a self-signed certificate, but that is not ideal; desktop browsers are very noisy about them, and mobile browsers are outright hostile. Installing certificates on a mobile phone is no fun (though, hats off to <a href="https://play.google.com/store/apps/details?id=at.bitfire.cadroid">CAdroid</a>).</p>
<p>Now that <a href="https://letsencrypt.org/">Letsencrypt</a> is live, I wanted to try it out. However, Letsencrypt does not make it easy for you to keep using the same private key. This is necessary as Android's HTTP client (the one I needed, anyway) pins your certificates. Luckily Letsencrypt allows you to use a CSR that is generated outside of their tools.</p>
<p>Letsencrypt certificates are only valid for a short time (90 days), so automation is key. Unfortunately, openssl does not make it easy to fully automate creating CSRs, especially when you need 'Alternative Names', a requirement from Letsencrypt.</p>
<p>Luckily I found a solution from <a href="http://blog.endpoint.com/2014/10/openssl-csr-with-alternative-names-one.html?showComment=1442955981110#c4884976064836690024">Andrew Leahy</a>. Here it is:</p>
<div class="codeblock"><code class="sh">openssl req -new -nodes -sha256 -key private-key.pem \
-subj "/C=US/ST=CA/O=Acme, Inc./OU=Acme Example/CN=example.com" \
-reqexts SAN \
-config <(cat /etc/ssl/openssl.cnf <(printf "[SAN]\nsubjectAltName=DNS:example.com,DNS:www.example.com")) \
-out domain.csr</code></div>
<p>This will generate a CSR in file <tt>domain.csr</tt> with the country (C), state (ST), organization (O), organizational unit (OU) and, most importantly, the common name (CN) set, plus two alternative names <tt>example.com</tt> and <tt>www.example.com</tt>. Thanks Andrew!</p>Erik van Oostenhttp://www.blogger.com/profile/15976519439979651010noreply@blogger.com0tag:blogger.com,1999:blog-27876765.post-47823168446603993522015-05-20T21:35:00.000+02:002015-05-20T21:35:00.988+02:005 problems in an hour<p><a href="https://twitter.com/svpino">@svpino</a> posted a nice <a href="https://blog.svpino.com/2015/05/07/five-programming-problems-every-software-engineer-should-be-able-to-solve-in-less-than-1-hour">challenge</a>: solve 5 problems within an hour or renounce your title of software engineer.</p>
<p>It took me 40 minutes, of which 10 were spent fighting a limitation of the Scala REPL.</p>
<p>Here are my solutions. They can all be pasted as is in the Scala REPL.</p>
<div class="codeblock"><code class="scala">
/////////////////////////////////////////
// Problem 1 (Yuck! This is not a Scala problem!)
def p1_for(xs: Seq[Int]): Int = { var s = 0; for (x <- xs) s += x; s }
def p1_while(xs: Seq[Int]): Int = { var s = 0; var i = 0; while (i < xs.length) {s += xs(i); i+=1}; s }
def p1_rec(xs: Seq[Int]): Int = if (xs.isEmpty) 0 else xs.head + p1_rec(xs.tail)
def p1_proper(xs: Seq[Int]): Int = xs.foldLeft(0)(_+_) // :)
scala> p1_for(Seq(1,2,3))
res0: Int = 6
scala> p1_while(Seq(1,2,3))
res1: Int = 6
scala> p1_rec(Seq(1,2,3))
res2: Int = 6
/////////////////////////////////////////
// Problem 2
def p2[A,B](a: Seq[A], b: Seq[B]): Seq[Any] = a.zip(b).flatMap { case (a,b) => Seq(a,b) }
scala> p2(Seq("a","b","c"), Seq(1,2,3))
res3: Seq[Any] = List(a, 1, b, 2, c, 3)
/////////////////////////////////////////
// Problem 3
// BigInt instead of Int: the larger Fibonacci numbers below do not fit in an Int.
def fibonacci: Iterator[BigInt] = Iterator.iterate((BigInt(0), BigInt(1))) { case (a, b) => (b, a + b) }.map(_._1)
scala> fibonacci.take(100).mkString(", ")
res79: String = 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181, 6765, 10946, 17711, 28657, 46368, 75025, 121393, 196418, 317811, 514229, 832040, 1346269, 2178309, 3524578, 5702887, 9227465, 14930352, 24157817, 39088169, 63245986, 102334155, 165580141, 267914296, 433494437, 701408733, 1134903170, 1836311903, 2971215073, 4807526976, 7778742049, 12586269025, 20365011074, 32951280099, 53316291173, 86267571272, 139583862445, 225851433717, 365435296162, 591286729879, 956722026041, 1548008755920, 2504730781961, 4052739537881, 6557470319842, 10610209857723, 17167680177565, 27777890035288, 44945570212853, 72723460248141, 117669030460994, 190392490709135, 308061521170129, 498454011879264, 806515533049393, 1304969544928657, 2111485077978050, 3416454622906707, 55...
/////////////////////////////////////////
// Problem 4
def p4(n: Seq[Int]): Int = n.map(_.toString).sorted.reverse.mkString("").toInt
scala> p4(Seq(50,2,1,9))
res73: Int = 95021
/////////////////////////////////////////
// Problem 5
sealed trait Expr {
def evaluate: Int
def show: String
}
case class Plus(e1: Expr, e2: Expr) extends Expr {
def evaluate: Int = e1.evaluate + e2.evaluate
def show: String = e1.show + " + " + e2.show
}
case class Min(e1: Expr, e2: Expr) extends Expr {
def evaluate: Int = e1.evaluate - e2.evaluate
def show: String = e1.show + " - " + e2.show
}
case class Literal(v: Int) extends Expr {
def evaluate: Int = v
def show: String = v.toString
}
def appendToLastLiteral(e: Expr, append: Int): Expr = e match {
case Plus(e1, e2) => Plus(e1, appendToLastLiteral(e2, append))
case Min(e1, e2) => Min(e1, appendToLastLiteral(e2, append))
case Literal(v) => Literal((v.toString + append.toString).toInt)
}
def loop(e: Expr, todo: Seq[Int]): Unit = {
if (todo.isEmpty) {
if (e.evaluate == 100) println(e.show)
} else {
loop(Plus(e, Literal(todo.head)), todo.tail)
loop(Min(e, Literal(todo.head)), todo.tail)
loop(appendToLastLiteral(e, todo.head), todo.tail)
}
}
scala> loop(Literal(1), Seq(2, 3, 4, 5, 6, 7, 8, 9))
1 + 2 + 3 - 4 + 5 + 6 + 78 + 9
1 + 2 + 34 - 5 + 67 - 8 + 9
1 + 23 - 4 + 5 + 6 + 78 - 9
1 + 23 - 4 + 56 + 7 + 8 + 9
12 + 3 + 4 + 5 - 6 - 7 + 89
12 + 3 - 4 + 5 + 67 + 8 + 9
12 - 3 - 4 + 5 - 6 + 7 + 89
123 + 4 - 5 + 67 - 89
123 + 45 - 67 + 8 - 9
123 - 4 - 5 - 6 - 7 + 8 - 9
123 - 45 - 67 + 89
</code></div>
<p>Thanks @svpino, this was fun!</p>Erik van Oostenhttp://www.blogger.com/profile/15976519439979651010noreply@blogger.com0tag:blogger.com,1999:blog-27876765.post-62891582537401050542013-11-01T10:29:00.001+01:002013-11-01T10:30:29.021+01:00Installing gnutar on Maverick<p>Unfortunately Apple decided to remove <tt>/usr/bin/gnutar</tt> from Maverick (Mac OS X 10.9). This is a pain because most of the tarring I do on my Mac is to transfer files to a GNU-based Linux (e.g. Debian/Ubuntu). Apple's bsd-tar is not compatible with gnu-tar.</p>
<p>This is my solution:
<div class="codeblock"><code>brew install gnu-tar
cd /usr/bin
sudo ln -s /usr/local/opt/gnu-tar/libexec/gnubin/tar gnutar</code></div>
</p>Erik van Oostenhttp://www.blogger.com/profile/15976519439979651010noreply@blogger.com3tag:blogger.com,1999:blog-27876765.post-88628331730541968932013-09-18T21:18:00.001+02:002013-09-18T21:18:36.152+02:00Configuring Postfix/Dovecot for Microsoft Windows Live Mail<p>Personal mail gets no love from Microsoft. In the last 10 years I have not seen their product change much. Notably, I see name changes (always a bad sign) and some visual changes. The actual implementation is still the same: it does not respect standards.
I run a Postfix/Dovecot installation for my family's mail. I have had many different email clients connect to it without major problems. With Microsoft Windows Live Mail, Outlook Express, or whatever it is called today, it just doesn't work. Anyway, here is what you can do:</p>
<p>(I am assuming you are using something like Ubuntu with Postfix for SMTP with TLS (actually STARTTLS) on port 25, and Dovecot with IMAPS on port 993.)</p>
<p>Open the file <tt>/etc/dovecot/dovecot.conf</tt>, and update the line with <tt>auth_mechanisms</tt> to the following. The trick
is that <tt>login</tt> has to come first:</p>
<div class="codeblock"><code>auth_mechanisms = login plain</code></div>
<p>Repeat this trick for Postfix in <tt>/etc/postfix/sasl/smtp.conf</tt>:</p>
<div class="codeblock"><code>mech_list: login plain</code></div>
<p>Restart Postfix and Dovecot, and you're good to go!</p>
Erik van Oostenhttp://www.blogger.com/profile/15976519439979651010noreply@blogger.com0