Oct 03, 2014
 


James O. Coplien started a discussion in May 2014 about the value of UnitTests and why most of them are waste. In August he added the 2nd installment.

In this context I often hear the argument that the majority of tests should be UnitTests rather than any other kind. Mike Cohn suggests this as well with the layout of his Testing Pyramid, because UnitTests can be executed faster. Robert C. Martin (Uncle Bob) is a strong advocate of this too – see one of his recent blog entries. Later on I will come back to his blog post and his statement that a UnitTest framework must not slow down the developers’ pace. But first to the “argument” that UnitTests can be executed faster than the others:

In the following text I talk only about End-To-End (E2E) tests, but I use them as a pars pro toto for all kinds of tests that check not just a single unit but complex business logic or complete features.

So shall we write UnitTests “only” because they are executed faster than E2E tests? This sounds to me like some kind of workaround. Because we cannot easily make the broader tests run fast, we test something that we can test faster? To exaggerate a little, isn’t this somehow similar to the old joke below?

[Image: SearchKeyUnderLight]

Shouldn’t we rather try to solve the primary problem: Why do the E2E tests take that long? Isn’t running automated tests with a slow execution time just normal software usage with a performance problem? What do we normally do when we have a performance problem? We make a detailed analysis with a profiler or a similar tool and search for the bottleneck(s). The area that consumes most of the time is optimized until we reach a point where we are satisfied.

During this optimization process, I think, one should consider the following questions as well: Do we really need that stack of framework upon framework, where each layer adds execution time on top of everything else? (As a side question, do you know what your CPU is doing within one second of your application’s running time? A typical PC CPU today can execute between 3,000,000,000 and 18,000,000,000 instructions per second depending on the number of cores, sometimes even more if it can parallelize certain ones. Well, I know that the OS needs its time too and that for a certain amount of time the CPU just sits there and waits for the RAM, but it is nevertheless a question that one should ask oneself.)

We might also improve the overall performance by using faster hardware, e.g. SSDs or RAM disks, or by parallelizing the tests over more machines once we have reached the point where no single bottleneck is left, only a flat profile of many functions that each need nearly the same amount of time. If the bottlenecks are inside a framework, we should look for the problems there. If instead the problem is caused by the protocol in use, we should consider a more efficient one. (In a future post I will describe in detail how we handled the slow functional test execution of our application in our company.)

A nice side effect of such an optimization is that the product also becomes faster for the end user, and its overall power consumption is probably lower, because operations finish sooner and the CPU can throttle down into a low-power mode earlier.

So when the performance problems are solved, we can put our focus back on tests that directly verify business value.

Finally, I want to follow up on Robert Martin’s statement: “Slow running tests represent a design flaw that reflects on the experience and professionalism of the team. A slow running test suite is either a freshman error, or just plain carelessness.” From my point of view this is too simplistic. I think any team has to balance the features and possibilities that a slower UnitTest framework offers against its compile and link time. I don’t know which framework the developers were using, but I know that one of the currently most powerful ones for C++ is GoogleTest. Because of its power, it probably compiles and links more slowly than others. But I personally do not want to miss its excellent reporting features and its support for templated test cases. Without them I would not get such detailed failure reports (hunting down a problem with a less detailed report would take longer), and in the long term I would have to write and maintain much more non-template test code.

In the end, only the features delivered to the customer (fully functional, of course, and without creating legacy code) and their development time count. If a team decides to accept longer UnitTest compile and link times because it can nevertheless deliver more value to the customer in the same time, then this is, from my point of view, a correct and professional decision.

Note: I do not question the value of UnitTests where they make sense.

Many thanks to Jim Coplien for the inspiring discussion over the last months and for reviewing this text!

  4 Responses to “Is UnitTesting testing under the streetlight?”

  1. Just 2 remarks:
    About slow running tests: I think this is much of a definition issue; you can write tests that are not exactly unit tests (both by execution time and by the covered functionality) using a unit test framework like GoogleTest. From my point of view, they make perfect sense as module or integration tests, you just can’t run them as often as “real” unit tests (e.g. it is not practical to run them after every code change). Technically, you can filter them out while running the “real” unit tests, and run them less often (for example, only after a check-in, or only in a nightly build, depending on the scope and execution time).
    About the value of tests: Of course at the end what counts are the features delivered to the customer and the development time – but don’t forget about the possibility to change existing features or add new features fast. This is a value in itself (though not as immediate as the other value). Uncle Bob has a lot to say on this account, and I mostly agree with him here.

  2. Felix Petriconi

    Probably I did not make the reason for my blog entry clear. It is often said that one should write unit tests because other tests, tests that verify a broader spectrum of the application, are slower. My point is that this is only a kind of excuse, or call it a workaround, that does not solve the real problem: Why are the broader tests that slow?

    • Ok – this was not entirely clear to me, indeed. I understood that you vote to make higher level tests faster to be able to run them more often and get faster feedback – I fully agree with that. That making tests faster may benefit the product itself is a nice side effect. I also agree that these tests matter because they directly test the business value (at least a part of it).
      OTOH, I still believe in Mike Cohn’s testing pyramid, because I see unit tests not as a faster replacement for end to end tests, but as having their own value (as I tried to write above). I actually haven’t met with the argument that you need unit tests only because higher level tests are too slow – this would not convince me, either.

  3. Decades ago Edsger Dijkstra wrote his infamous article “Go To Statement Considered Harmful”, which sparked the long-lasting discussion about the usage of the goto statement in code. Actually, the true name for the article is, or should have been, “Goto statement considered harmful if used in inappropriate ways”, which is the message behind the article. In analogy to this, I believe that Unit Tests are useful if used appropriately. Yet one needs to find out what “appropriately” means. Unit Tests have their own merits, and it feels wrong to me to say that one should favour UT over End-To-End Tests or vice versa. Different things need to be tested on different levels. So, for instance, in order to test an application’s use cases you would most likely want to write appropriate End-To-End Tests. But in order to test some application-agnostic components, say a custom String class, you will use Unit Tests, I guess. And when you use Unit Tests for your purpose, you will strive to write meaningful tests. What is meaningful? As guidance I like the notion of “Test behaviour, not functions”. What is behaviour? For a string class it will probably be things such as appending/concatenating strings, searching, substring operations, comparisons etc. Writing Unit Tests for trivial operations such as getters and setters is most likely not meaningful.
    To come back to your polemics, I don’t think UT is testing under the streetlight.
