Dangers of unit testing undefined behavior

Recently I participated in an intracompany discussion about a concurrency defect. A piece of code looked something like this:

Thread1:

boolean keepRunning = true;
while(keepRunning) {
...
}

Thread2:

...
keepRunning = false;

The problem? keepRunning is a plain boolean without any synchronization. According to the Java Memory Model (JMM), a whole slew of things could go wrong with the above code, one of which is for Thread1 to run forever because it never sees Thread2’s update to ‘keepRunning’.
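For contrast, here is a minimal sketch of one way to make the flag visible across threads, assuming the loop only needs a simple stop signal. The class name StoppableWorker and the helper stopsWithin are hypothetical, made up for illustration; the original code was not shown in full. Declaring the field volatile is one of several options (a lock or an AtomicBoolean would also establish the needed happens-before relationship).

```java
public class StoppableWorker implements Runnable {
    // volatile: a write by Thread2 is guaranteed to become visible
    // to subsequent reads by Thread1 under the Java Memory Model.
    private volatile boolean keepRunning = true;

    // Only ever touched by the worker thread, so it needs no synchronization.
    private long iterations = 0;

    @Override
    public void run() {
        while (keepRunning) {
            iterations++; // stand-in for the real loop body
        }
    }

    // Called from another thread to request a stop.
    public void stop() {
        keepRunning = false;
    }

    // Hypothetical helper: starts a worker, asks it to stop, and reports
    // whether the thread actually terminated within the timeout.
    public static boolean stopsWithin(long timeoutMillis) throws InterruptedException {
        StoppableWorker worker = new StoppableWorker();
        Thread thread1 = new Thread(worker, "Thread1");
        thread1.start();
        Thread.sleep(50);             // let the loop spin for a moment
        worker.stop();                // plays the role of Thread2
        thread1.join(timeoutMillis);  // wait for the worker to notice
        return !thread1.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("worker stopped: " + stopsWithin(1000));
    }
}
```

Note that a passing run of this sketch proves little on its own. The version without volatile may well pass the same check on a given machine, which is exactly the trap described in this post.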

What’s interesting about this is that there was an integration test that inadvertently tested this scenario and had always passed, so the problem was never caught. Once the code started running on a production box with different hardware characteristics (a lot more cores/memory), it blew up.

This is one of those examples where unit testing doesn’t produce good results. It’s very dangerous to build an intuition about incorrect concurrent code by running simple unit tests. These one-off unit tests run for a short period of time on a box that potentially has few cores. The JIT doesn’t kick in, inlining doesn’t happen, the system isn’t under heavy load, and the hardware configuration could be favorable to not surfacing the error. As shown in this post (C and Java examples), just because incorrect concurrent code works the way the author expects doesn’t mean it will continue to work. I’m not sure if that was the author’s intention, but that’s what I got out of it.

This is why I am pessimistic about types and unit tests when it comes to catching interesting errors found in production. Unit tests/types are good for catching obvious things like “this method doesn’t accept arguments of this category” or “what happens when this method gets passed an empty string instead of a string that I expect?”. I have yet to see a language/test framework that can help with concurrency problems.

I know of two partial solutions to the concurrency problem.

  1. Try to avoid errors by construction, i.e. have good design that makes doing the wrong thing harder. Immutable data structures by default is a big first step in that direction.

  2. Feynman method. Think really hard and write code that doesn’t contain concurrency bugs. If that’s not possible, try to convince a friend or co-worker to think very hard with you.

The first method is really just a special case of the Feynman method.
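To make "avoid errors by construction" a bit more concrete, here is a rough sketch of an immutable value class in Java. The name WorkerSettings and its fields are hypothetical, chosen only for illustration, and List.copyOf assumes Java 10 or later. The idea is that once an instance is built, no thread can change it, so there is no shared mutable state to get wrong; "changing" a value means creating a new instance.

```java
import java.util.List;

public final class WorkerSettings {
    // All fields are final and set once in the constructor.
    private final String name;
    private final List<String> hosts;

    public WorkerSettings(String name, List<String> hosts) {
        this.name = name;
        // Defensive, unmodifiable copy: later changes to the caller's
        // list cannot leak into this instance.
        this.hosts = List.copyOf(hosts);
    }

    public String name() {
        return name;
    }

    public List<String> hosts() {
        return hosts; // already unmodifiable, safe to hand out
    }

    // Returns a new instance instead of mutating this one.
    public WorkerSettings withName(String newName) {
        return new WorkerSettings(newName, hosts);
    }
}
```

This does not remove the need to publish a reference to a new instance safely (for example through a volatile or final field), so it is a partial solution at best, in line with the two methods above.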