April 7, 2015 § 2 Comments
About a year ago I was approached by a former employer to help him build a very large, greenfields system in Java. It was a new online real estate portal that also needed a separate estate agents portal, a robust and scalable data processing pipeline, an internal administrative app to be used by a support team, and a multitude of data import and data management tools. The project had an unmovable deadline of around ten months for release and would be given a high profile launch in the UK. The website needed to be fast, robust and beautiful.
I initially turned the opportunity away as I no longer have a personal desire to code on large statically typed OO systems. This is an argument that will rage on for eons; static vs dynamic typing, OO vs functional, and I won’t cover it in detail again here. Suffice to say that having worked on both Java and Clojure projects with very smart people involved, I just think that Clojure offers a simpler approach to programming, primarily for its grounding of opinionated immutability and persistent data collections. In the case of this project, with the tight timescales involved, I genuinely thought that Java would be the riskier choice, given the propensity for developers to spend large amounts of their time iterating over an object model, which may or may not reward them later down the line. I politely made the case for Clojure in my response and then I waited, not investing too much hope in what could happen.
My former employer came back and said that Clojure looked a solid option. They had researched it and had become enthused about the potential benefits of functional programming and the pragmatism of Clojure. They had some deep concerns about being able to staff the project quickly, and this is where JUXT came in. We landed five seasoned Clojure craftspeople on day one and a sixth soon after. Along with a solutions architect the client already had in place, we knuckled down and got to work. The client would successfully hire more Clojurians from the community as the project went on.
The architecture needed to be lean. The domain is more complex than you’d think, and so we needed to be able to make mistakes and recover. We went about building the system in an organic, bottom-up way, avoiding top-level abstractions until they were justified. We also didn’t go in for the microservices approach, deciding early on to keep our codebase together, relatively unfactored, for longer. Clojure code is more naturally decoupled in not having objects or static types to bind code together, and so there’s less of a need to split it up preemptively. Keeping code together for longer is a conservative approach, as you don’t have to make the reverse journey if the split goes wrong.
For our technologies we made use of Ring, Compojure, Bidi, and http-kit for our Clojure web-apps, mixing in Friend and Liberator where appropriate. There is some debate as to whether Liberator is a good fit for general web applications vs RESTful services, but otherwise the use of this stack was pretty seamless.
We used Elasticsearch as the data repository for the front-end. I personally see ES as an old ally now, having used it on two Clojure projects prior to this one. Performance wise it’s extremely quick owing to a large in-memory cache it uses, and its search API is immensely powerful. Newer features such as ‘aggregations’ solve difficult problems for you. Because ES is JSON based it works seamlessly with a Clojure stack where the act of deserialising JSON via REST is trivial.
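To illustrate that last point, pulling search hits back into plain Clojure data really is a couple of forms; a sketch using clj-http, with a placeholder index name and local URL:

```clojure
(require '[clj-http.client :as http])

;; Hit a (hypothetical) Elasticsearch index and get Clojure maps back.
;; The JSON deserialisation is handled entirely by the `:as :json` option.
(def hits
  (-> (http/get "http://localhost:9200/articles/_search"
                {:as :json
                 :query-params {"q" "clojure"}})
      :body :hits :hits))

;; Each hit is now just a map, e.g. (map (comp :title :_source) hits)
```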
On the back-end we used tried and trusted Postgres. No complaints here: when it was midnight close to the release date and we needed to get a backlog of data through the system, knowing how to fiddle an index to boost performance made us appreciate the well-beaten track. My colleague Martin Trojer did some sterling work building up a migration library to manage Postgres (and other data stores), and he also laid down some test fixture code for blowing away data in your local db prior to test runs.
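The fixture idea is simple enough to sketch; the db coordinates and table names below are invented, but the shape is a standard `clojure.test` fixture over `clojure.java.jdbc`:

```clojure
(require '[clojure.java.jdbc :as jdbc]
         '[clojure.test :refer [use-fixtures]])

;; Local test database -- coordinates are placeholders.
(def db {:dbtype "postgresql" :dbname "app_test" :user "dev"})

(defn clean-db-fixture
  "Blow away data before each test run so tests start from a known state."
  [f]
  (doseq [table ["listings" "agents"]]          ; hypothetical tables
    (jdbc/execute! db [(str "TRUNCATE " table " CASCADE")]))
  (f))

(use-fixtures :each clean-db-fixture)
```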
Not every technology choice we tried would bring us sunbeams. We tried Storm early on as a contender for managing our asynchronous data processing pipeline. Storm is a great bit of kit, but it seems better suited to manipulating more straightforward numerical data, rather than bouncing large documents around a topology. When you have to frequently hit various data stores and want to update some central state, then you’re kind of going against the Storm ideal of having a collection of lightweight processing units wired up as an atomic job. Also there’s no denying that Storm is non-trivial to set up in the cloud, and in general it’s heavy to get up and running from the REPL. Onyx is a newer kid on the block that looks worth investigating.
We ended up making heavy use of Clojure’s core.async to manage our data processing pipeline, using external message queues to share the work across distributed processes. I couldn’t be happier with that approach, as the result is a lightweight, easily testable collection of functions wired up in a declarative, straightforward manner.
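The shape of that approach can be sketched in a few lines. This is illustrative only: the stage names and document format are invented, and the `in` channel stands in for whatever feeds messages from the external queue.

```clojure
(require '[clojure.core.async :as a])

;; Each stage is a plain function from document map to document map,
;; so it can be unit-tested (or REPL-tested) in complete isolation.
(defn enrich   [doc] (assoc doc :enriched? true))
(defn validate [doc] (assoc doc :valid? (contains? doc :id)))

(defn start-pipeline
  "Declaratively wire the stages together with channels. Messages
  arrive on `in` (fed by an external queue consumer) and processed
  documents land on `out`."
  [in out]
  (let [enriched (a/chan 16)]
    (a/pipeline 4 enriched (map enrich) in)     ; 4 workers per stage
    (a/pipeline 4 out (map validate) enriched)))
```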
At one stage in the project JUXT conducted its own delivery assurance review as is standard practice for software consultancies. We brought in an external colleague who essentially reviewed our practices and choice of technologies. He pushed us to go leaner and simpler still, and as a result we collectively killed a few of our ‘darlings’ that were causing more of an overhead than they were worth. Not every decision we made paid immediate dividends, but what was more important was the shared team humility that would later lead to architectural agility.
Having a bunch of senior Clojure developers bring their own hard-won ideals to the table was undoubtedly a supremely joyful element of the project. For example I hadn’t jumped in with both feet into Prismatic Schema, fearing a lingering association with the kind of upfront modeling design you see so often on Java/.NET projects. Now though I think Schema is great and it was extremely useful for us, giving us vital contractual guarantees in our data-processing pipeline and a model for data coercions. We get the part of the cake back from OO that we enjoyed, the declarative schema definitions, and in this case they can be landed in organically rather than being introduced from the get-go.
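A flavour of what that looked like for us, as a sketch (the schema and field names here are made up, not our real model):

```clojure
(require '[schema.core :as s]
         '[schema.coerce :as coerce])

;; Declarative schema for a pipeline message -- landed in organically,
;; once the shape of the data had settled.
(def Listing
  {:id       s/Int
   :postcode s/Str
   (s/optional-key :price) s/Int})

;; Coerce loosely-typed input (e.g. freshly parsed JSON, where numbers
;; may arrive as strings) at the pipeline boundary, failing loudly there
;; rather than deep inside the processing code.
(def parse-listing
  (coerce/coercer Listing coerce/json-coercion-matcher))

(parse-listing {:id 1 :postcode "N1 9AG"})
```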
Then someone came along and showed me Swagger, giving us a beautiful web UI for free on top of a previously headless restful service integrating with Schema. We wanted to be lean and pragmatic to be delivery focused, but now we were also enjoying our technology.
It was a fun project, but because of the amount of fixed scope and a non-negotiable deadline (TV ads), it was the most challenging I’ve had so far. In particular, with the need to ramp up development capacity it was difficult to get everybody singing from the same hymn sheet. Faced with the blank canvas nature of the greenfields and the introduction of new technologies there is a tendency to ‘manage by democracy’, which can lead to the ‘big architectural debate’ anti-pattern, which if left unchecked can ultimately impede delivery. I was guilty of this at times, and looking to the future I like the approach of ‘managing as a meritocracy’.
Another challenge was ensuring a shared team ownership. Lisp programming is arguably an individualistic pursuit, intensified in our case by developers sprinting forward to build up the system ready for launch. The cohesiveness of the team was consequently stretched at times, and this is something to be addressed after the dust has truly settled and IT operations enter into long term strategic mode.
In the end we delivered the project on time and on budget, the technology standing up to the demands placed upon it. The business has a lean platform for which to go forward. I feel we made it because of the individuals involved, a fantastic client, but also because of Clojure.
Unlike on recent projects this didn’t feel like a regular ‘Clojure project’ to me, where I’ve largely seen Clojure brought in to disrupt and to enthuse existing development teams, usually involving some training and emotive selling of the benefits. On this project Clojure was the accepted choice from day one, and with the deadline in mind the emphasis was on getting the job done. In this sense Clojure didn’t feel like a new technology or a risky choice, it just felt like another mainstream language we use at work. That feeling of cutting-edge freshness only really transpired when we occasionally got bitten by the immaturity of a library, or when the less-frequent but enjoyable debates about idiomatic code usage patterns occurred.
We took Clojure for granted, but it’s still a reason we got over the line. People expect that you can get up and running faster with dynamic languages than with some of the mainstream languages such as Java and .NET, and I think this is true. But for me the real win with choosing an immutable functional language such as Clojure is around the 2/3 mark in a project. Just when a large traditional OO system starts to buckle under its own gravity and the business rules become harder to reach through the applied refactorings and introduction of design patterns, this is when a Clojure code-base will shine in comparison. With the business rules being freely accessible and ready to be worked with, that deadline seems just a little bit more achievable.
March 17, 2015 § Leave a Comment
I have to acknowledge a blunter truth about my relationship with TDD. That is to say I don’t do much TDD because a) I’m impatient, and b) I like to code impulsively. I strive to have the problem space booted up in my head and then walk around it arbitrarily, as Paul Graham has written about in his ‘the head’ article. When solving a problem I may start out with a vague idea of how the design will play out, but really I just want to try out a few rapid ideas, weighing up the various pros and cons as I go. I may use some tests, I may not.
My thinking is that TDD – as applied by the masses – compromises one’s ability to code intuitively against the problem space in their head. By formally marshalling the art of coding into a series of strict red/green chequerboard moves, your overall ability to think laterally about the problem is reduced.
At this point I must readily concede that the TDD conversation can’t be had without considering the technologies and languages involved. If you are doing battle with a traditionally large OO code-base, then TDD may be appropriate for a host of reasons. Primarily, it does keep you on the rails.
Being honest, can we admit that hacking on large OO systems can be a huge amount of fun? You get to lose yourself in a giant Sudoku. Design patterns, generics, some magical AOP code, what’s not to like if solving complex problems is your thing? The trick for many is not to get lured in, however tedious that may be.
I do therefore hold respect for TDD in certain contexts. It’s like a puritanical response against the will of developers to get stuck into battles that we should probably be avoiding. TDD helps to keep the mind clear, to keep it focused. There’s a world of dark and dangerous refactorings we could be doing if we let our inner impulses off the leash. Beginners may find TDD a useful tool to take incremental steps, and this indeed is a strong point in its favour, as discipline aids learning.
I just think that if you need TDD all the time to engage with a code-base, then it’s a smell of something being overly complex.
There is then the other argument that TDD’d unit-tests are really just the same thing as REPL sessions, so therefore it’s unfair to attack them; that they are really just fulfilling the role of the safely walled-in playground. My view is that it sure doesn’t feel this way when I’m wrestling mock frameworks and spending hours of my life wiring stuff up, unless the playground is of the variety found in the Terminator 2 opening sequence. There’s a reason BeanShell sadly didn’t get widespread traction; you need a simpler language to make the REPL a pleasant experience.
My main overarching concern isn’t just the intuitive development approach though, as that’s largely horses for courses and subjective to the individual. My main worry is about aesthetics, and the reason I’m pondering this is because Eleanor McHugh brought it up on a recent panel debate on TDD, prompted by DHH’s article “TDD is dead. Long live testing”.
It’s surely a worthy goal for code to be beautiful, to be aesthetically pleasing to the eye. Given developers will be spending years at a time patrolling a single code-base, it pays to have the code elegant and prideful looking. Beauty in code has a high correlation with simplicity, and it’s usually evident when a developer has thoroughly understood the complexities of the problem and has pinned down the right level of abstraction.
When a codebase has been dogmatically TDD’d, I think it ends up looking a little sad, as though the chunks of production code are being unfairly surrounded by unit-test SWAT teams. Everything is in a state of lockdown, where you are restrained from making serious changes because the effort required to work with someone else’s tests is just too high. You then also have to suffer the moral quandary of considering whether to delete someone else’s test code; this isn’t trivial, and so useless code has a habit of sticking around.
I don’t believe rigid TDD as applied by the masses is a good thing for aesthetics in code. I think it’s rather the opposite case, as going-through-the-motions unit-tests tend to take on opposing aesthetics. Through the eyes of a TDDist, one sees beauty in ensuring that every edge case is accounted for, and that every discrete piece of logic has a test. A TDD’d codebase can therefore be a sterile place. Well tested and functional, just lacking that human touch of grace and simplicity.
July 7, 2014 § 1 Comment
My last blog post covered building a newspaper website with Clojure. I thought it’d be useful to go deeper and write about the problems we suffered when the newly built Clojure cluster occasionally went down. We referred to the issues causing these outages as Cluster Killers.
Cluster Killer #1 – Avout on Zookeeper
Avout is a distributed STM, giving you Atoms and Refs whereby the state is managed in a centralised store such as ZooKeeper. I’m a fan of the ideas behind Avout and we used it to store all our config. Config changes in ZooKeeper would be delivered out to cluster nodes on the fly and the relevant STM watchers fired accordingly.
Our bug stemmed from our underlying network having sporadic, transient issues. To make Avout work, the refs and atoms need to be backed by data in ZooKeeper, which in turn meant that each web-app node in our cluster needed a persistent ZooKeeper connection to keep a watch on the data. This is OK, but occasionally the persistent connection would be dropped and so each node would then need to re-establish the connection. This is still OK, and reconnection is handled automatically by the Java ZooKeeper client. So what was the problem?
The root-cause issue is tricky to explain and it took us a while to diagnose and to successfully reproduce. ZooKeeper has the concept of watchers that watch for state changes. When disconnection occurs between client and server the client-side watchers are fired but not removed. When this happens Avout re-adds the watchers. Hence if you have 5 refs backed by 5 watchers, then after a disconnect this becomes 10 watchers, then 20 etc. Eventually we had over 100K watchers on our nodes and so occasionally they self-destructed. In the middle of the night and at Christmas.
I’m aware that I should probably have fixed the problem in Avout rather than write this post. We tried switching to Atoms in Avout rather than Refs and ran into a completely different bug, and so we ended up dropping the technology altogether and rolling our own less ambitious ZK/watcher wrapper code.
Personally I wouldn’t advise using Avout again until all the GitHub issues have been fixed and there are reports of others successfully using it in production. It looks like this will happen now as the library has gained a contributor, and so I’m hopeful.
Cluster Killer #2 – Using Riemann out the box
Riemann has achieved cult status in the Clojuresphere and deservedly so. In the past we’ve been tempted to faff around with tangential event handling logic directly in application code, but Riemann gives us a sensible place where we can put this fluidic configuration. E.g. I want to email out an alert only when X happens Y amount of times, but only at a constrained rate of Z per hour.
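A sketch of that kind of rule in Riemann’s stream DSL (the service name, addresses and thresholds here are placeholders, not our real config):

```clojure
;; riemann.config -- evaluated by the Riemann server itself.
(def email (mailer {:from "riemann@example.com"}))

(streams
  (where (and (service "import-job") (state "error"))
    ;; Send at most 5 alert mails per hour; further errors in the
    ;; window are batched up into a single summary mail.
    (rollup 5 3600
      (email "team@example.com"))))
```

The catch, as described below, is that every one of these stateful rules holds events in the server’s memory.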
Riemann is a handy tool to have, but on our project it became a victim of its own success. We initially used it for dispatching metrics to graphite and for alerting, yet after a time other teams wanted in on the action. We took our eye off the ball as we welcomed Clojure in through the back door onto other projects. Soon our Riemann config became enriched with rules keeping so much state that the Riemann server suffered an out-of-memory.
OK, surely the Riemann server going down is not a Cluster Killer? Well, by default the Riemann Clojure client uses TCP connections and communicates synchronously with the server, waiting for acks. It’s up to the users of the client to ensure that they handle the back-pressure building up inside the application when the Riemann client/server pipe becomes blocked.
We hadn’t deeply considered this in our rush to exploit Riemann funkage, and so when our Riemann server went down all our Clojure apps talking to Riemann went down too.
By all means use the Riemann Clojure client, but wrap it with fault tolerant code.
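A minimal defensive wrapper might look like this; the function name and host are ours for illustration, but the pattern of dereferencing the ack with a timeout and swallowing failures is the important part:

```clojure
(require '[riemann.client :as r])

(def client (r/tcp-client {:host "riemann.internal"}))  ; placeholder host

(defn send-event-safely!
  "Metrics are best-effort: never let a sick Riemann server block or
  crash the calling application."
  [event]
  (try
    ;; send-event returns a deref-able ack; cap how long we wait for it.
    (deref (r/send-event client event) 1000 ::timed-out)
    (catch Exception e
      ;; Log and carry on -- the app must outlive its monitoring.
      (println "riemann send failed:" (.getMessage e))
      ::failed)))
```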
Cluster Killer #3 and #4 – Caching
We were building a newspaper website. As is common we didn’t serve all realtime traffic directly to the users as that would be challenging in the extreme. Instead we made use of cloud caching providers. Occasionally we tweaked the caching timeout rules so that we could serve more traffic dynamically (as was the ultimate goal), but often the wrong tweak in the wrong place could put massively more load on the server pool and the cluster would struggle. I figured that caching configuration was like walking a tightrope; equal parts art and science.
On a related note we were using Mustache for our template rendering. Mustache templates need to be compiled before being used for actual HTML rendering. At one point a config change went live that set the compiled Mustache template cache time to zero. The CPUs went mental.
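What went wrong, in miniature: compiling is the expensive step, so the compiled form must be reused across renders. A sketch of the caching shape, where `compile-template` and `render-compiled` stand in for whatever your Mustache library provides:

```clojure
;; Placeholders for the library's real compile/render functions.
(declare compile-template render-compiled)

(def template-cache (atom {}))

(defn render-cached
  "Compile each template once and reuse the compiled form. A cache
  timeout of zero means every single render pays the compile cost."
  [path data]
  (let [compiled (or (@template-cache path)
                     (let [c (compile-template (slurp path))]
                       (swap! template-cache assoc path c)
                       c))]
    (render-compiled compiled data)))
```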
What Cluster Killers have taught me
Firstly, shit will always happen. It was embarrassing when our new shit did happen, as we built the system to stop the old shit from happening. Worse is that our new shit was shit that people weren’t used to seeing.
I’ve now got more respect for all kinds of monitoring. We had our own diagnostic framework that fired tracer bullets into the live environment and used Riemann to alert on errors. We were quite happy with this, as it meant that many problems were seen only by us, and we could act. We also had fantastic Ops personnel who delivered a Nagios/Graphite solution that gave us all manner of metrics on CPUs, heap-sizes etc, for which I was grateful. We got skilled up on using AWK and friends to slice up log files into something meaningful, and profiling the JVM was essential.
Performance testing is key, as is A/B testing. I hear about companies (e.g. uSwitch) doing great things in this space that I would hope to emulate at some stage.
Lastly I’m now a touch more conservative about using bleeding edge technology choices. Best to spend time browsing the project pages on github before adding to your lein deps.
Any other Cluster Killer stories people want to share?
March 16, 2014 § Leave a Comment
This screencast demonstrates how you can keep track of hosts/IPs that you frequently remote-REPL into using CIDER.
February 27, 2014 § Leave a Comment
This screencast demonstrates some recent updates to Emacs Cider around working with multiple REPLs. This covers what’s in the relevant section in the Cider README.
Hope it’s useful.
February 24, 2014 § 3 Comments
Let’s be frank about it; the MailOnline isn’t to everyone’s taste. As the world’s biggest newspaper website it is a guilty pastime for many. It has some decent editorial content but it can also be distressingly shallow.
I worked there for a year. I was lured by the opportunity to rebuild the old website system in Clojure. Whilst some in my circle have been furthering mankind over at the Guardian, I’ve been working for alternative forces.
There is a thread of arrogance in my desire to join a big media company and petition the building of a new system. The developers I found there humbled me in being some of the most personable and most talented I’ve had the pleasure of working with. Not to mention diverse.
When I first joined this project Fred George had been extolling the virtues of Programmer Anarchy. I don’t want to get bogged down in discussing the particulars of this software methodology, suffice to say that it is not a software methodology. Programmer Anarchy was Fred’s way of putting developers more in charge where a command and control regime had previously existed. It was a means to shake up an establishment which has since successfully led to durable self organising teams leveraging modern technologies.
I arrived as “anarchy” was rife and the continents had yet to form. The rewriting of the public facing newspaper website had yet to begin and so there was a clear opportunity to put a stake in the ground.
The reality was that I was fortunate. The incumbent CTO had already manifested a data pipeline where all the data needed for the website went from the old system straight into an ElasticSearch instance. He knew that the only way to begin slaying the old publishing/editing system was to free up the data. Future developers could now build a new Eden by exploiting this easily accessible NoSQL store.
Before Clojure (BC) the CTO had the idea that competing teams would trial a Node vs Ruby vs X solution to work out how best to exploit the data for building the next generation website. I was joined on my first day by Antonio Terreno and we soon started hacking on Clojure to see what we could do.
After a couple of weeks we presented a prototype of the website back to the wider team. During the presentation we demonstrated how easy it is to hit ElasticSearch from the REPL and to mash Clojure data structures against HTML templates to get the desired output. We showed how you could easily arrange sequences of articles using functions such as partition-by to get table layouts. With website content being broken up into simple data structures along with Clojure’s stupendously powerful sequence library, it all becomes very easy.
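For a flavour of what that looks like, a toy sketch (the article maps are invented stand-ins for what came out of ElasticSearch):

```clojure
(def articles
  [{:title "A" :channel :news}  {:title "B" :channel :news}
   {:title "C" :channel :sport} {:title "D" :channel :sport}])

;; Group consecutive articles into per-channel blocks for the template:
(partition-by :channel articles)
;; => (({:title "A" ...} {:title "B" ...}) ({:title "C" ...} {:title "D" ...}))

;; Or chop a flat sequence into rows of two for a table layout:
(partition-all 2 articles)
```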
After the presentation there wasn’t much appetite to build similar efforts in Ruby or Node*, and so Clojure was declared the winner.
There was nothing onerous about what we had achieved. We used some functional programming to get the data into a form that could be easily rendered, and we were organised about how we pulled back data from ES to get good performance.
Yet fundamentally we hadn’t written much code. I raised this sheepishly with the CTO and his response was: “that’s how I know it’s the right solution”. Clojure is the winner here.
A year and 20k LOC later, the website is predominantly Clojure based. The home pages, main channel pages, and articles are served through Clojure, along with a mobile version of each. The old platform will linger on serving legacy parts of the website that need to be updated, but its days are numbered. Killing off the various dusty corners of the website requires business stakeholder input as well as bullish developers.
The bulk of the work was (and still is) around having a production ready platform; monitoring, diagnostics, regression testing, metrics gathering, performance tuning and configuration management**. We have a variety of back-end services which interact with systems such as Facebook and Twitter, and we’ve significantly upgraded the data-pipe that carries data from the old system into the new. We also have a significant web-application that proxies the main web application, manipulating the HTML using Enlive so that it can be fed into a CMS for editors to use.
All this is really only phase one. There is a pent up demand from the business for new functionality and many separate sub-systems that need to be rewritten.
For me, the real win has been how the wider development team has embraced Clojure. By the end of my year there are devs hacking on elisp to personalise their development environments. We have vim warriors doing their vim warrior thing. There have been numerous Clojure open source contributions and involvements on the London Clojure community scene. We’ve been able to spawn Clojurians in various countries outside the UK. We had a couple of presentations at the Berlin EuroClojure and I had a world beating hangover afterwards for the plane ride home.
I’m sad to move on but I leave behind a strong team who have fully embraced Clojure and love the technology. My immediate future is working for JUXT full time (juxt.pro).
* Node.js is a dominant choice for various self-contained public facing web applications.
** I plan to do a couple of future posts about the technical particulars.
June 5, 2013 § 10 Comments
I got some comments from Renzo on a previous post I wrote about Clojure and testing, and some feedback on a recent talk I gave, via email and talk down the pub after. I ponder some of these points here.
I’ve made the point that using Clojure really does force a reappraisal of the TDD situation, especially if you’re coming from an OO background. I still think this is the case, but at the same time I acknowledge that there’s nothing uniquely special about Clojure that makes this so.
Renzo suggests that some people are fine without tests because ‘they can easily grasp a problem in their head and move on’. I agree with this but I also think that the language of choice does have a direct bearing. Put bluntly, I think that most people using Clojure will write fewer unit-tests compared to traditional OO programming.
It’s all about feedback.
If you were to create a shopping list of things you really want for your development experience then what would you put at the top? I think most people would put ‘rapid development’ at first place. I.e. you want to make a change somewhere in your codebase, hit F5 in your browser or wherever, and then to see the changes appear immediately.
And to be honest I think this aspect of development – rapid feedback – is absolutely key. If you can just change code and see the results immediately then it really does lessen the value of writing actual test classes to gauge the results of intended changes. This may not be completely healthy, and one could imagine more cases of sloppy hacking as a result, but I’d be surprised if there wasn’t a correlation between fast feedback of changes and the number of unit-test classes written. For example I’d be interested to know if Java devs using the Play web framework end up writing fewer unit-tests as a result of moving from Spring/Struts etc. I’d expect so.
Clojure is nothing special for giving rapid feedback on changes; just glance across at the dynamic scripting languages such as Ruby, Python and PHP. But if you’re coming from the enterprise Java world, where I’ve spent some years, then this is one of the first things that really knocks you on the head.
I think the next item on the list of ‘top development wants’ will often be an interactive command prompt; the REPL, for reasons that are well documented.
I gave a recent talk and one person said that it feels strange for him as a Python developer that I and others are talking up a tool that so many have taken for granted for decades, like we’re raving about this thing called the wheel.
This is a fair point, but it doesn’t take away the amazing difference for devs coming from the traditional statically typed OO landscape, where the REPL is not so much employed.
And I think this does have a direct effect on the TDD process. Primarily, you have a place to explore and to play with your code. Put the other way, if you don’t have a REPL and you don’t have ultra-fast feedback on changes, then what do you do? You’d probably want to write lots of persistent unit-tests, or main methods, just so that you can pop into the codebase and run small parts of it.
FP and Immutability
Immutability makes for a saner code-base. I make the point that if you work with a language that’s opinionated in favour of propagating state-changes, then it would be perfectly reasonable to want more tests in place, to pin desired behaviour down.
FP and dynamic languages lead to a lot less code. There’s less ceremony, less modeling. Because you’re managing less code you do less large scale refactorings. Stuff doesn’t change as much once it’s been written. The part of traditional TDD that leaves behind test-classes for regression purposes is less needed.
It’s my current opinion that what you’re left with from TDD, once you have amazingly fast feedback and a REPL, is regression testing.
This is still needed of course, but there’s huge value in treating it as a separate concern from other forms of testing. I’ve been guilty of seeing it as all wrapped up in TDD, rather than as something distinct warranting its own attention.
In some respects deciding how many regression tests to apply becomes a bit like ‘how much performance tuning do we need’. It depends on the business context, and it’s probably better done not prematurely but with consideration and measurement.
There is an argument that unit-tests serve a purpose as live documentation. I would just question this. Although some FP code can be difficult at first glance, it can still be written expressively, using vars with names longer than one character. And there’s always comments with examples.
This is my view on why devs using Clojure seem to write fewer unit-tests. I wouldn’t go to the next level of dogma and dare say that devs shouldn’t ever write unit-tests. In fact I’d expect any unit-tests on a Clojure project to be of high quality and to really have a purpose. I’m also a bit reluctant to judge TDD as a whole. I think TDD for masses of devs including myself has taught us how to effectively refactor, and through some definition of TDD we’re probably still doing it, whether it’s through the REPL or just by hitting F5.
May 21, 2013 § 6 Comments
This post is a continuation of my earlier ‘Clojure at a Bank’ posts. I’ve since left the bank and am working for a large newspaper company, fortunately for me still writing Clojure.
It’s an obvious point to make, that different projects can have very different testing demands. At the bank we managed a throughput of financial products so it was critical that we got no surprises. Prod deployments were often like moon-landings, staged well in advance with lots of people in mission control.
At the newspaper it’s a bit different. Whilst bugs are still not to be warmly welcomed with a red carpet, the direction of investment can be shifted away from pre-emptive regression testing to active monitoring, along with some A/B testing.
Though the contexts of projects can differ wildly, I can still mull over some very familiar debates and practices.
TDD can get a bad rap because it’s so often thoughtlessly applied. People used to hold up 90-100% coverage metrics like they meant something, their eyes glazing over from staring at the IDE bar going red/green all day. I know this because I’ve experienced that dogmatic mentality myself, churning out little unit-test classes because that’s how I thought software should be written. A more senior colleague once espoused to me, when surveying our plateau of tests, that ‘writing production code is hard’. He was right.
I think that most people would concur that having lots of unit-tests can be very expensive. Deciphering mock scaffolds is little fun, as is encountering pointless tests around wiring. Refactoring prod code can be gleeful, but refactoring other people's tests is just a pain.
But none of this is an argument against TDD, rather just a recoil against how it's dogmatically applied. The same goes for people holding up massive opinions about Scrum and puritanical XP, and frankly OO at times. Nearly everything sucks if it's done by the book with no human contemplation and is not continuously improved.
Still, I think that adopting Clojure really does force a reappraisal of the testing situation.
Firstly, I think that everyone still does TDD; it's just that you use the REPL instead of writing actual test files. Some scaffolding you might keep around for future use, but it has to earn its keep.
Secondly, immutability is another serious game changer, as you'd expect with Clojure. Testing functions at a command line REPL that have simple data-in, data-out contracts is trivial. Compare this to OO land where you're passing mutable state around and the contrast becomes apparent. Code written in a style that is more likely to have side-effects does need a higher rigour of testing, where you can empathize more with the compulsion to break everything down into very small units and leave tests behind around them, just so that you can be sure that their behaviour is pinned down.
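To illustrate (with invented names), a pure function with a data-in, data-out contract needs no mocks or scaffolding, just a REPL:

```clojure
;; a hypothetical pure function: data in, data out
(defn apply-discount
  "Returns the order with a discounted total."
  [order rate]
  (update-in order [:total] #(* % (- 1 rate))))

;; exercised straight at the REPL
(apply-discount {:id 1 :total 100.0} 0.1)
;; => {:id 1, :total 90.0}
```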
There may be a problem that devs don't write enough tests in Clojure-land, and strangely the reverse can be also true as devs over-compensate by adding tests of questionable value. I've been guilty of this by adding tests around the nitty-gritty kernel of a codebase that have never failed the build and that no-one has probably ever read.
This is an area where I feel I've learnt some. My starting point is that I've seen projects basically killed by people going mad with 'specification by example' styled tests. Fit, Concordion, JBehave… DSL frameworks that encourage non-techies to write a boatload of tests that are later unmaintainable. I've seen occasions where it appears to work out, like when the DSL is written in Java all the way down, but most of the time it's a large fail.
At the bank we had thousands of FIT tests and I still fantasise about going back and removing every last one of them. They mashed together the ideas of outside-in TDD, talking to the business, talking to the QAs, regression testing and system documentation. Out of these concerns regression testing was the only persistent benefit. The cost of them was huge, including slow run times and lots of duplicated HTML and Java to manage.
Our Clojure efforts led us into different territory. Working on a sub-system and with regression as our prime concern, we stored up a large catalogue of test input and output expectation datasets – ‘fixtures’. We then made the fixtures queryable and stored them in ElasticSearch. From the REPL you could easily pull back some choice fixtures and then send them through the codebase. Because the fixtures contained expected outputs, we had some nice HTML pass/fail reports produced for results. We got some of the benefits of FIT for a fraction of the ongoing cost.
This approach is often referred to as data-driven testing. The tests in our case were just frozen JSON documents. As I was leaving we played with extending this approach to cover the whole of the system. Even though our system managed a complicated life-cycle of financial products, it was still possible to watch a product going through it and make observations that you could then store as JSON fixture data. You could theoretically record everything that happened as a result of the FIT tests being run and then use this data to remove them.
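A minimal sketch of the idea, assuming the cheshire JSON library and a hypothetical function under test; our real fixtures and runner were richer than this:

```clojure
(require '[cheshire.core :as json])

(defn run-fixture
  "Parses a frozen fixture holding :input and :expected, runs the
   function under test and compares the result with the expectation."
  [fixture-json f]
  (let [{:keys [input expected]} (json/parse-string fixture-json true)
        actual (f input)]
    {:pass?    (= expected actual)
     :expected expected
     :actual   actual}))

;; from the REPL: pull a batch of fixtures back from the store and map
;; run-fixture over them, e.g. (names hypothetical)
;; (map #(run-fixture % process-product) (fetch-fixtures es-client query))
```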
I regret that we didn't tackle the testing mess head on across the whole system before doing anything else major. But to be fair we couldn't have done this without the learnings we got from the Clojure rewrite of a sub-system, and what we picked up around testing in particular.
I’m going to write a subsequent post about how we’ve gone about testing at the newspaper.
May 2, 2013 § 4 Comments
On the last couple of Clojure projects I've been on, the team has invested a bit of time in pimping the REPL. Here are some of the things we've got happening when a custom bootstrap REPL namespace is loaded:
Dev of the day
Because it’s nice to make people feel special, show the dev of the day:
This can be pulled from Git:
This code needs the command line tool figlet installed.
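Something along these lines, assuming git and the figlet binary are on the PATH (names invented):

```clojure
(require '[clojure.java.shell :refer [sh]]
         '[clojure.string :as str])

(defn pick-for-day
  "Deterministic daily pick, so everyone sees the same dev all day."
  [coll day]
  (nth coll (mod day (count coll))))

(defn dev-of-the-day []
  (let [committers (->> (:out (sh "git" "shortlog" "-sn" "HEAD"))
                        str/split-lines
                        (map #(str/replace % #"^\s*\d+\s+" ""))
                        (remove str/blank?)
                        vec)
        day (.getDayOfYear (java.time.LocalDate/now))]
    ;; render the lucky dev's name in big ASCII letters
    (println (:out (sh "figlet" (pick-for-day committers day))))))
```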
Print the last few Git commits, making the project feel alive with some recent activity:
May as well show the Git branch whilst you're at it:
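A sketch using clojure.java.shell, assuming git is available:

```clojure
(require '[clojure.java.shell :refer [sh]])

;; the last five commits, one line each
(print (:out (sh "git" "log" "--oneline" "-5")))

;; and the current branch
(print (:out (sh "git" "rev-parse" "--abbrev-ref" "HEAD")))
```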
A dev on our team wanted to draw a picture. Rather than using something like Omnigraph he stuck to the principle of always using Emacs for everything, no matter how painful or irrational. So he fired up artist-mode and started sketching. In one of our projects we've now got a nice ASCII sequence diagram explaining the flow to the wandering REPL dev. The ASCII itself is stored as a txt file, slurped in and displayed upon bootstrap.
We once had a sequence of project quotes stored in a namespace and the REPL displayed one of them using rand-nth. It's a nice way to start the day being confronted with messages like 'The Horror.. The Horror!', or quotes from Macbeth, Forrest Gump and various gangster movies. If you get sick of the same quotes then add a new one.
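The mechanism is as simple as it sounds (quotes illustrative):

```clojure
;; a vector of project quotes, one shown per bootstrap
(def quotes
  ["The Horror.. The Horror!"
   "Out, damned spot!"
   "Life is like a box of chocolates."])

(println (rand-nth quotes))
```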
Since my current project is for a newspaper company we've got a random headline showing up. You could in theory sit down and read the news (a flavour of it, anyway) by repeatedly recompiling the REPL namespace.
- If you’ve got a web-app, launch it.
- Display what is running, ports etc.
- Use stuff – (:use [clojure.pprint] [clojure.repl])
- Config – show what’s in ZooKeeper or whatever.
Useful REPL functions: Our bootstrap code has a bunch of handy functions for making life easier for the dev. For example, pre-loaded queries against some data-store, or helper functions for data-structures you're always munging. We have a separate repl-helpers NS that contains them, and we list the contents using ns-publics.
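A sketch of the listing; our real bootstrap points it at the project's repl-helpers namespace, but clojure.set stands in here so the snippet is self-contained:

```clojure
(require 'clojure.set)

(defn list-helpers
  "Prints a namespace's public vars alongside their docstrings."
  [ns-sym]
  (doseq [[sym v] (sort-by key (ns-publics ns-sym))]
    (println sym "-" (or (:doc (meta v)) ""))))

(list-helpers 'clojure.set)
```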
I’m interested in what other people/teams have done to make life more interesting in the REPL. What’s cool and fun?
February 9, 2013 § 1 Comment
I read Paul Ingles’ post “From Callbacks to Sequences”, which led me to Christophe Grand’s handy pipe function. The pipe function gives you two things wrapped in a vector: a sequence to pass on for downstream consumption, and a function for feeding the pipe, which in turn adds more to that sequence. I’ll reproduce the code here, with a slight alteration to limit the underlying queue so as to avoid OOM:
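In outline it looks like this (a reconstruction along the lines of Christophe Grand's original, with the queue capacity made explicit so producers block rather than exhaust memory):

```clojure
(defn pipe
  "Returns [seq feed]: a lazy seq draining a bounded queue, and a
   function that feeds it. Call (feed x) to append, (feed) to close."
  [size]
  (let [q   (java.util.concurrent.LinkedBlockingQueue. (int size))
        EOQ (Object.)   ; end-of-queue sentinel
        NIL (Object.)   ; stand-in for nil values on the queue
        s   (fn drain []
              (lazy-seq
                (let [x (.take q)]
                  (when-not (identical? EOQ x)
                    (cons (when-not (identical? NIL x) x)
                          (drain))))))]
    [(s)
     (fn feed
       ([]  (.put q EOQ))
       ([x] (.put q (if (nil? x) NIL x))))]))
```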
This is great if you want to have a thread pushing stuff into the pipe using the ‘feeder’ function, and then to pass around a lazy sequence representing whatever is in the pipe. Nice.
I had a situation which was a bit different:
- I have a sequence and I want to do some processing on it to return another lazy sequence, exactly like what the pipe function gives you.
- But actually I want to do some parallel processing of the sequence, so pmap would seem like a better choice.
- In addition, I want to be able to govern how many threads are used during the pmap operation, and I would also like to explicitly set some kind of internal buffering so that downstream consumers will nearly always have stuff to work on.
We used the pipe function to write this.
This basically uses the pipe function, with n threads feeding the pipe, which is then closed once all the threads are done. The closing is performed by a supervisor thread using a countdown latch.
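As a self-contained sketch (names and arity are illustrative, and the eager work queue is a simplification; the real version fed the work in lazily):

```clojure
;; a queue-backed pipe, as described earlier
(defn pipe [size]
  (let [q   (java.util.concurrent.LinkedBlockingQueue. (int size))
        EOQ (Object.)
        NIL (Object.)
        s   (fn drain []
              (lazy-seq
                (let [x (.take q)]
                  (when-not (identical? EOQ x)
                    (cons (when-not (identical? NIL x) x) (drain))))))]
    [(s)
     (fn feed
       ([]  (.put q EOQ))
       ([x] (.put q (if (nil? x) NIL x))))]))

(defn pipe-seq
  "Maps f over coll using n threads, returning a lazy seq backed by a
   bounded queue of queue-size. Assumes coll holds no nils or falses."
  [f n queue-size coll]
  (let [[out feed] (pipe queue-size)
        latch (java.util.concurrent.CountDownLatch. n)
        work  (java.util.concurrent.LinkedBlockingQueue.
               ^java.util.Collection (vec coll))]
    ;; n worker threads drain the work queue, feeding results into the pipe
    (dotimes [_ n]
      (future
        (loop []
          (if-let [x (.poll work)]
            (do (feed (f x)) (recur))
            (.countDown latch)))))
    ;; the supervisor closes the pipe once every worker has counted down
    (future (.await latch) (feed))
    out))
```

Note that results arrive in completion order rather than input order, since the workers race to feed the pipe.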
Here’s an example creating a pipe-seq and consuming it:
The above takes ten seconds – ten threads consuming the seq.
We’ve got some code chaining up a few calls to pipe-seq, the consumer function being fanned out to different virtual thread-pools of different sizes, depending on the type of operation being performed.
One consideration of this approach is that the queues are kept as an internal detail of the pipe-seq function. If you want visibility over the queues this could be an issue and this approach may not be for you.
Feedback welcome, particularly if there’s a better way.