Clojure at a Bank – Testing
May 21, 2013
This post is a continuation of my earlier ‘Clojure at a Bank’ posts. I’ve since left the bank and am working for a large newspaper company, fortunately for me still writing Clojure.
It’s an obvious point to make, that different projects can have very different testing demands. At the bank we managed a throughput of financial products so it was critical that we got no surprises. Prod deployments were often like moon-landings, staged well in advance with lots of people in mission control.
At the newspaper it’s a bit different. Whilst bugs are still not to be warmly welcomed with a red carpet, the direction of investment can be shifted away from pre-emptive regression testing to active monitoring, along with some A/B testing.
Though the contexts of projects can differ wildly, I can still mull over some very familiar debates and practices.
TDD can get a bad rap because it’s so often thoughtlessly applied. People used to hold up 90-100% coverage metrics like they meant something, their eyes glazing over from staring at the IDE bar going red/green all day. I know this because I’ve experienced that dogmatic mentality myself, churning out little unit-test classes because that’s how I thought software should be written. A more senior colleague once remarked to me, when surveying our plateau of tests, that ‘writing production code is hard’. He was right.
I think that most people would concur that having lots of unit-tests can be very expensive. Deciphering mock scaffolds is little fun, as is encountering pointless tests around wiring. Refactoring prod code can be gleeful, but refactoring other people’s tests is just a pain.
But none of this is an argument against TDD, rather just a recoil against how it’s dogmatically applied. The same goes for people holding up massive opinions about Scrum and puritanical XP, and frankly OO at times. Nearly everything sucks if it’s done by the book with no human contemplation and is not continuously improved.
Still, I think that adopting Clojure really does force a reappraisal of the testing situation.
Firstly, I think that everyone still does TDD; it’s just that you use the REPL instead of writing actual test files. Some scaffolding you might keep around for future use, but it has to earn its keep.
Secondly, immutability is another serious game changer, as you’d expect with Clojure. Testing functions at a command line REPL that have simple data-in, data-out contracts is trivial. Compare this to OO land where you’re passing mutable state around and the contrast becomes apparent. Code written in a style that is more likely to have side-effects does need a higher rigour of testing, and there you can empathise more with the compulsion to break everything down into very small units and leave tests behind around them, just so that you can be sure that their behaviour is pinned down.
There may be a problem that devs don’t write enough tests in Clojure-land, and strangely the reverse can also be true as devs over-compensate by adding tests of questionable value. I’ve been guilty of this by adding tests around the nitty-gritty kernel of a codebase that have never failed the build and that probably no one has ever read.
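To make the data-in, data-out point concrete, here is a minimal sketch of the kind of thing you might poke at from the REPL. The function name, the trade map and the basis-point fee are all made up for illustration; this is not code from the bank.

```clojure
;; Hypothetical pure function: takes a map, returns a new map.
;; No mutable state is involved, so there is nothing to set up or tear down.
(defn apply-fee
  "Returns a copy of `trade` with :fee and :net added.
  `fee-bps` is a fee in basis points (1 bp = 1/10000 of the notional)."
  [trade fee-bps]
  (let [fee (quot (* (:notional trade) fee-bps) 10000)]
    (assoc trade
           :fee fee
           :net (- (:notional trade) fee))))

;; A sample input you might keep lying around in the REPL session.
(def sample-trade {:id 1 :notional 10000})

;; 'Testing' at the REPL is just calling the function and eyeballing the result.
(apply-fee sample-trade 25)
;; sample-trade itself is unchanged afterwards, because maps are immutable.
```

Nothing here needs a mock or a fixture class. The input is a literal, the output is a value you can read, and the original map is still there to be reused for the next experiment.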
This is an area where I feel I’ve learnt some. My starting point is that I’ve seen projects basically killed by people going mad with ‘specification by example’ styled tests. Fit, Concordion, JBehave… DSL frameworks that encourage non-techies to write a boatload of tests that are later unmaintainable. I’ve seen occasions where sometimes it appears to work out, like when the DSL is written in Java all the way down, but most of the time it’s a large fail.
At the bank we had thousands of FIT tests and I still fantasise about going back and removing every last one of them. They mashed together the ideas of outside-in TDD, talking to the business, talking to the QAs, regression testing and system documentation. Out of these concerns, regression testing was the only persistent benefit. Their cost was huge, including slow run times and lots of duplicated HTML and Java to manage.
Our Clojure efforts led us into different territory. Working on a sub-system and with regression as our prime concern, we stored up a large catalogue of test input and output expectation datasets – ‘fixtures’. We then made the fixtures queryable and stored them in ElasticSearch. From the REPL you could easily pull back some choice fixtures and then send them through the codebase. Because the fixtures contained expected outputs, we had some nice HTML pass/fail reports produced for results. We got some of the benefits of FIT for a fraction of the ongoing cost.
This approach is often referred to as data-driven testing. The tests in our case were just frozen JSON documents. As I was leaving we played with extending this approach to cover the whole of the system. Even though our system managed a complicated life-cycle of financial products, it was still possible to watch a product going through it and make observations that you could then store as JSON fixture data. You could theoretically record everything that happened as a result of the FIT tests being run and then use this data to remove them.
I regret that we didn’t tackle the testing mess head on across the whole system before doing anything else major. But to be fair we couldn’t have done this without the learnings we got from the Clojure rewrite of a sub-system, and what we picked up around testing in particular.
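The shape of the fixture idea can be sketched in a few lines. This is an illustration under loud assumptions rather than what we actually ran: the fixtures are plain Clojure maps standing in for the JSON documents, an in-memory vector and a tag filter stand in for the ElasticSearch query, and `price-product` is a made-up function standing in for the codebase under test.

```clojure
;; Hypothetical stand-in for the code under test: data in, data out.
(defn price-product
  [{:keys [notional rate-bps]}]
  {:interest (quot (* notional rate-bps) 10000)})

;; Frozen fixtures: an input, the expected output, and some metadata to query on.
;; fx-3 carries a deliberately stale expectation so that the report shows a failure.
(def fixtures
  [{:id "fx-1" :tags #{:bond} :input {:notional 10000 :rate-bps 25} :expected {:interest 25}}
   {:id "fx-2" :tags #{:swap} :input {:notional 20000 :rate-bps 50} :expected {:interest 100}}
   {:id "fx-3" :tags #{:bond} :input {:notional 40000 :rate-bps 10} :expected {:interest 41}}])

(defn select-fixtures
  "Pulls back the fixtures carrying `tag` (a stand-in for a real search query)."
  [fixtures tag]
  (filter #(contains? (:tags %) tag) fixtures))

(defn run-fixture
  "Sends one fixture's input through `f` and compares the result with the expectation."
  [f {:keys [id input expected]}]
  (let [actual (f input)]
    {:id id :pass? (= expected actual) :expected expected :actual actual}))

(defn report
  "Runs every fixture through `f` and returns a vector of pass/fail result maps."
  [f fixtures]
  (mapv #(run-fixture f %) fixtures))

;; From the REPL: pull back some choice fixtures and send them through the code.
(report price-product (select-fixtures fixtures :bond))
```

The result is itself just data, a vector of maps with a `:pass?` flag, so turning it into an HTML pass/fail report is a separate rendering step that knows nothing about the system under test.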
I’m going to write a subsequent post about how we’ve gone about testing at the newspaper.