Friday, November 1, 2013

Installing gnutar on Mavericks

Unfortunately Apple decided to remove /usr/bin/gnutar from Mavericks (Mac OS X 10.9). This is a pain because most of the tarring I do on my Mac is to transfer files to a GNU-based Linux (e.g. Debian/Ubuntu). Apple's bsdtar is not fully compatible with GNU tar.

This is my solution:

brew install gnu-tar
cd /usr/bin
sudo ln -s /usr/local/opt/gnu-tar/libexec/gnubin/tar gnutar

Wednesday, September 18, 2013

Configuring Postfix/Dovecot for Microsoft Windows Live Mail

Personal mail gets no love from Microsoft. In the last 10 years I have not seen their product change much. Notably, I see name changes (always a bad sign) and some visual changes. The actual implementation is still the same: not respecting standards. I run a Postfix/Dovecot installation for my family mail. I have had many different email clients connect to it without major problems. With Microsoft Windows Live Mail, Outlook Express or whatever it is called today, it just doesn't work. Anyway, here is what you can do:

(I am assuming you are using something like Ubuntu with Postfix for SMTP with TLS (actually STARTTLS) on port 25, and Dovecot with IMAPS on port 993.)

Open the file /etc/dovecot/dovecot.conf, and update the line with auth_mechanisms to the following. The trick is that login has to come first:

auth_mechanisms = login plain

Repeat this trick for Postfix in /etc/postfix/sasl/smtp.conf:

mech_list: login plain

Restart Postfix and Dovecot, and you're good to go!

Monday, July 1, 2013

Installing thrift through homebrew

Currently thrift support in homebrew is broken; brew versions thrift returns nothing. Here is a list I got from a colleague:

$ brew versions thrift
0.9.0    git checkout 3b8bb74 /usr/local/Library/Formula/thrift.rb
0.8.0    git checkout e5475d9 /usr/local/Library/Formula/thrift.rb
0.7.0    git checkout 141ddb6 /usr/local/Library/Formula/thrift.rb
0.6.1    git checkout 54ff633 /usr/local/Library/Formula/thrift.rb
0.5.0    git checkout 0476235 /usr/local/Library/Formula/thrift.rb
0.4.0    git checkout 4523877 /usr/local/Library/Formula/thrift.rb
0.3.0    git checkout 67ec3c0 /usr/local/Library/Formula/thrift.rb
0.2.0    git checkout d0efd9e /usr/local/Library/Formula/thrift.rb
HEAD     git checkout c4decd7 /usr/local/Library/Formula/thrift.rb

With this information in mind, continue to stackoverflow for more information on how to install a specific version.

I had to do brew -v install thrift (with -v), otherwise installation just halted.

Sunday, June 30, 2013

Announcing Metrics-Scala 3.0.0

Metrics-scala 3.0.0 was just released and is available in Maven Central. This is the first release against Metrics version 3.0.0, and the first release where the code is maintained by me instead of being a line-for-line copy of Coda Hale's original.

A special thanks goes to @scullxbones who started the 3.0.0 branch and ported the tests to ScalaTest.

Changes:

  • Code is no longer a copy from Coda Hale's sources and is now maintained by me.
  • Depends on Metrics-core 3.0.0.
  • Ported tests from original to ScalaTest (@scullxbones).
  • Much more documentation.

Although the metrics-scala API is mostly source compatible, there are breaking API changes which are mostly caused by changes in the metrics-core library:

  • All code moved to the nl.grons.metrics.scala package (changed at Coda Hale's request).
  • The class Instrumented must now be created in your project by extending InstrumentedBuilder.
  • All configuration for histograms, meters, and timers is gone. These are now configured in the reporter.
  • Dropped method clear on Histogram and Timer.

More ideas and pull requests are welcome!

Friday, June 28, 2013

Fast directory transfer on Unix machines

Here is a little trick to transfer a big folder from one unix machine to another in 2 variations.

In this variation netcat is in listen mode on the target (execute in the given order):

on target> nc -l 19001 | lzop -d | tar x
on source> tar c [directory to copy] | lzop | nc [target] 19001

In the second variation netcat is in listen mode on the source system (again, execute in the given order):

on source> tar c [directory to copy] | lzop | nc -l 19001
on target> nc [source] 19001 | lzop -d | tar x

Make sure you have a decent network connection, 1 Gbit/s is fine.

Update 2015-11-18: I experimented with cpio and found that it is a lot faster than tar. I also added pipe viewer (pv) to get some sense of when a transfer is done.

This is using cpio with netcat in listen mode on the target and send mode on the source:

on target> nc -l 19001 | lzop -d | cpio -idm
on source> cd [directory to copy]; find . -depth -print0 | cpio -o0 \
             | pv -s $(du -ks . | awk '{print $1}')k \
             | lzop | nc [target] 19001

This is what it looks like on the source side:

Friday, April 5, 2013

Fixing code and binary incompatibilities for cross Scala version library development

Scala is a fantastic language that unfortunately has a tradition of having no binary compatibility between versions. The result is that library developers have to go through a lot of pain to release their software for multiple scala versions. Even though starting with scala 2.9 minor versions are binary compatible, with scala 2.10 the situation has worsened because there are now some code incompatibilities as well.

This post shows some techniques for library developers to build releases against multiple scala versions, taking care of binary and code incompatibilities.

SBT — Simple Build Tool

The only viable option I know to build cross scala versions is SBT (Simple Build Tool). I am going to assume you are somewhat familiar with SBT. The most important cross build settings in your build.sbt are (full version on Github):

scalaVersion := "2.10.1"

crossScalaVersions := Seq("2.9.1", "2.9.1-1", "2.9.2", "2.10.1")

crossVersion := CrossVersion.binary

Key scalaVersion sets the current scala version to use, key crossScalaVersions contains all scala versions to use during cross builds.

The last setting has the effect that the correct scala version is appended to the name of your artifact. ‘Correct’ in this case means the full version for scala versions 2.9.x and lower, or just the 2 highest numbers for 2.10.0 and later. So if you have a setting name := "libname", the generated artifacts will be named libname_2.9.1, libname_2.9.1-1, libname_2.9.2 and libname_2.10.

Kick off a cross build by prepending ‘+’ to your command. E.g. sbt +test.
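The naming rule can be sketched as a small function. This is only an illustration of the rule as described here, not SBT's own code; CrossNaming, artifactSuffix and artifactName are hypothetical names, and pre-release versions (such as release candidates) are deliberately ignored.

```scala
// Illustrative sketch of the artifact naming rule; not part of SBT.
object CrossNaming {

  // Full version for 2.9.x and lower, only "major.minor" for 2.10.0 and later.
  def artifactSuffix(scalaVersion: String): String = {
    val parts = scalaVersion.split("[.-]")
    val major = parts(0).toInt
    val minor = parts(1).toInt
    if (major > 2 || (major == 2 && minor >= 10)) major + "." + minor
    else scalaVersion
  }

  // The suffix is appended to the artifact name with an underscore.
  def artifactName(name: String, scalaVersion: String): String =
    name + "_" + artifactSuffix(scalaVersion)
}
```

For example, CrossNaming.artifactName("libname", "2.9.1-1") keeps the full version, while CrossNaming.artifactName("libname", "2.10.1") drops the last number.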

Code incompatibilities

Scala 2.10 brings some nasty code incompatibilities. The popular Akka library for example has partly moved into the main scala library. The consequence is that code for scala 2.9 needs to depend on Akka and import akka.dispatch.Future, while code for scala 2.10 needs no additional dependencies and imports scala.concurrent.Future.

Another example is the changes around concurrent maps. In 2.9 one needs to do new java.util.concurrent.ConcurrentHashMap[A, B](1024).asScala to get a scala.collection.mutable.ConcurrentMap. In Scala 2.10 you are better off with scala.collection.concurrent.TrieMap.empty to get a scala.collection.concurrent.Map. All interfaces stay the same while all names changed.

Dependency incompatibilities

To define dependencies based on the current scala version you can use the following trick:

libraryDependencies <++= (scalaVersion) { v: String =>
  if (v.startsWith("2.10"))
    Seq("com.yammer.metrics" % "metrics-core" % "2.1.5",
        "org.specs2" %% "specs2" % "1.13" % "test")
  else if (v.startsWith("2.9"))
    Seq("com.yammer.metrics" % "metrics-core" % "2.1.5",
        "com.typesafe.akka" % "akka-actor" % "2.0.5",
        "org.specs2" %% "specs2" % "1.12.3" % "test")
  else
    Seq()
}

Fixing code incompatibilities

If code needs to differ between scala versions, the easiest way is to have multiple source roots. E.g.:

libname/
  build.sbt
  src/
    main/
      scala/
      scala_2.9/
      scala_2.10/
    test/

Add the following to your build.sbt to make it possible:

// The following prepends src/main/scala_2.9 or src/main/scala_2.10 to the compile path.
unmanagedSourceDirectories in Compile <<= (unmanagedSourceDirectories in Compile, sourceDirectory in Compile, scalaVersion) { (sds: Seq[java.io.File], sd: java.io.File, v: String) =>
  val mainVersion = v.split("""\.""").take(2).mkString(".")
  val extra = new java.io.File(sd, "scala_" + mainVersion)
  (if (extra.exists) Seq(extra) else Seq()) ++ sds
}

Example code for 2.9 (full version on Github):

package nl.grons.sentries.cross

// Needed for the .asScala conversion below.
import scala.collection.JavaConverters._

object Concurrent {
  type Future[+A] = akka.dispatch.Future[A]
  val Future = akka.dispatch.Future
  val Await = akka.dispatch.Await

  type CMap[A, B] = scala.collection.mutable.ConcurrentMap[A, B]
  def defaultConcurrentMap[A,B](): CMap[A,B] =
    new java.util.concurrent.ConcurrentHashMap[A, B](1024).asScala
}

Example code for 2.10 (full version on Github):

package nl.grons.sentries.cross

object Concurrent {
  type Future[+A] = scala.concurrent.Future[A]
  val Future = scala.concurrent.Future
  val Await = scala.concurrent.Await

  type CMap[A, B] = scala.collection.concurrent.Map[A, B]
  def defaultConcurrentMap[A,B](): CMap[A,B] =
    scala.collection.concurrent.TrieMap.empty
}

The rest of the code can now use the type aliases and references from here. E.g. nl.grons.sentries.cross.Concurrent.Future refers to Akka for scala 2.9 and to the standard library for scala 2.10.
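As a rough sketch of what such shared code might look like, the example below only refers to Future, Await and CMap through the cross object. To keep it self-contained it repeats the 2.10 variant of Concurrent (without the package clause). WordLengths and its methods are hypothetical and not part of Sentries. The duration import is the 2.10 one; a 2.9 build would need its Akka counterpart, which is an assumption not covered in this post.

```scala
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global

// Copy of the 2.10 variant of the cross object, so this sketch compiles on its own.
object Concurrent {
  type Future[+A] = scala.concurrent.Future[A]
  val Future = scala.concurrent.Future
  val Await = scala.concurrent.Await

  type CMap[A, B] = scala.collection.concurrent.Map[A, B]
  def defaultConcurrentMap[A, B](): CMap[A, B] =
    scala.collection.concurrent.TrieMap.empty
}

// Hypothetical library code: it never names Akka or scala.concurrent directly.
object WordLengths {
  import Concurrent._

  private val cache: CMap[String, Int] = defaultConcurrentMap()

  // Computes the length asynchronously and remembers it in the concurrent map.
  def lengthOf(word: String): Future[Int] = Future {
    cache.putIfAbsent(word, word.length) // atomic insert when the key is missing
    cache(word)
  }

  // Blocking variant, mainly useful in tests.
  def lengthOfBlocking(word: String): Int =
    Await.result(lengthOf(word), 5.seconds)
}
```

Only the Concurrent object has to differ per scala version; WordLengths can live in the shared src/main/scala directory.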

Conclusions

With some hackery SBT allows you to define dependencies and source roots based on the current scala version. This allows you to overcome scala’s incompatibilities if you are a library developer who builds releases for multiple scala versions.

The techniques described in this post were developed for Sentries. The code is on Github.

Wednesday, February 6, 2013

Breaking the Circuit Breaker

The circuit breaker is this wonderful pattern to protect your application against resources that fail slowly. The idea is that you stop trying to use a resource when it has too many failures. Regular retries test the resource and will make the resource available again. The benefit is that your application can react quickly to a failed resource instead of hogging CPU, threads, network, etc. while you are waiting to find out the resource is unavailable.

So what's wrong?

It's the metaphor. In the classical description a circuit breaker has 3 states: the open state, the closed state and the half-open state. So what does it mean when the circuit breaker is open? When is a bridge open? When you can drive over it, or when you can sail through it? Only when you look at the first image you may see that a traditional open circuit breaker stops the flow of electricity. To us that translates to no usage of the resource. In the 'closed' state electricity flows, which translates to having access to our resource. Now read that again and see if you can remember that!

Then we have a half-open state? Again, look at the first image. For such a switch half-open is still open. (A half-open bridge lets no traffic through at all but that is another topic.) Why do we need the half-open state anyway? In the classical description we attempt to use the resource once while in this state. If it fails just once, we go back to the open state. This seems like a good idea, but let us think of modern networked applications. In such applications many requests are done simultaneously. So as soon as we switch to the half-open state for a retry, many, maybe hundreds of requests will immediately try to use the resource, even if it is still down. This is exactly what we were trying to prevent!

Stop!

Although the circuit breaker is a great invention, I think we need a new metaphor, or at least some new terminology.

No more half-open

The first thing we can do is get rid of the half-open state. Instead, when it's time to retry, we just let 1 client through to the resource. While that check is in progress we keep denying access to the resource for other clients; we stay in the same state. Only when the single check succeeds do we switch to the state in which we allow full access to the resource.

No more open

The second thing we need to do is to end the confusion on what it means to be 'open'. Instead I propose we call this state the broken state. No further explanation required. Good. In this state we do the regular retries.

Finally, to make things symmetric, I propose to rename the 'closed' state to flow state as all requests are granted.
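To make the two-state idea concrete, here is a minimal sketch. It is not the Sentries implementation; TwoStateBreaker, its parameters and BrokenException are hypothetical names, and counting consecutive failures is just one assumed way to decide that there are 'too many failures'. The clock is injectable so the behaviour can be tested without sleeping.

```scala
import java.util.concurrent.atomic.{AtomicInteger, AtomicReference}
import scala.util.{Failure, Try}

object TwoStateBreaker {
  final class BrokenException extends RuntimeException("resource is in the broken state")

  private sealed trait State
  // Flow state: all requests are granted, consecutive failures are counted.
  private final class FlowState extends State {
    val failures = new AtomicInteger(0)
  }
  // Broken state: requests are denied until retryAt, which never changes.
  private final class BrokenState(val retryAt: Long) extends State
}

final class TwoStateBreaker(failLimit: Int, retryDelayMillis: Long,
                            now: () => Long = () => System.currentTimeMillis()) {
  import TwoStateBreaker._

  private val state = new AtomicReference[State](new FlowState)

  def call[T](body: => T): Try[T] = state.get match {
    case flow: FlowState =>
      val result = Try(body)
      if (result.isSuccess) flow.failures.set(0)
      else if (flow.failures.incrementAndGet() >= failLimit)
        state.compareAndSet(flow, new BrokenState(now() + retryDelayMillis))
      result

    case broken: BrokenState =>
      val next = new BrokenState(now() + retryDelayMillis)
      // Only the one caller that wins the compareAndSet is let through;
      // everybody else keeps seeing a broken state in the meantime.
      if (now() >= broken.retryAt && state.compareAndSet(broken, next)) {
        val result = Try(body)
        if (result.isSuccess) state.compareAndSet(next, new FlowState)
        result
      } else Failure(new BrokenException)
  }
}
```

Note that there is no third state: while the single retry is in progress the breaker simply holds a fresh broken state with a later retry time, so concurrent callers are still denied.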

Metaphor

Above I proposed new terminology but I failed to provide a new metaphor. Unfortunately metaphors are hard to find and too easy to get wrong. Perhaps a good metaphor should be related to the fact that we are limiting the number of errors we tolerate from a resource. If you have an idea, please let me know in a comment. I hope you liked my little rant. Any comments are always welcome.

—   ❧   —

Postscript: Sentries and the circuit breaker

The Sentries library contains a highly optimized circuit breaker implementation for Scala programs. The ideas in this article were developed while writing Sentries. Feel free to have a look. As you can see there are only 2 states, the FlowState and the BrokenState. Note that the retryAt in BrokenState is a val; it cannot be changed after initialization. When it is time to retry we replace the broken state with a new instance (in method attemptResetBrokenState).