I’ve posted recently about a team at an investment bank wanting to make the transition from Java to Clojure. Here I want to write about some of the issues around our Clojure code being ‘immature’. Before I do though it’s only fair I state up front that not all of our early code was terrible, Clojure is indeed a pragmatic language where you can write decent and understandable code relatively easily. Still..
Comments and LOC
Most of the devs on our team have a TDD/DDD/BDD background with half having once plied a trade as XP consultants. Our approach to writing beautiful Java code was to make it flow and to tell a story. Expressive names for classes, methods and variables, each chosen to convey clarity and meaning to the fortunate reader.
Therefore when we jumped both feet into Clojure we unconsciously brought with us the belief that comments just weren’t needed. Add in to the mix that we gave our args the shortest possible names – most of the time just single characters – one could argue that we purposefully went about trying to obfuscate what we were writing.
Then to make our code more fugly, we executed the common newbie sin of not really knowing what’s in Clojure.core but churning out bucket-loads of FP code anyway. For example we had a brave early attempt to get around assoc-in not being able to work with nested vectors as well as maps when it actually could (assoc-in m [:key1 0 :bar]). This led to some funky code existing in functions with interesting sounding names like ‘weave-in-vectors’ – the choice of naming being a sure fire smell that there must be a more idiomatic way of doing things. Then there’s the little stuff: ‘if’ and ‘let’ vs ‘if-let’, ‘let’ as its own form rather than embedded into a ‘for’ as ‘:let’. Then there’s zipmap, juxt, mapcat, group-by… a considerable list of helper functions that avoids us having to write our own cruft.
I also have to own up to having a personal fetish for a low LOC wanting it to compare ever so favourably to the old Java stack that preceded it. The cost was that some people wondered wtf some of my code did but at least there were few lines of it. There’s got to be some prize for that, right?
We matured past these issues by communicating amongst ourselves as we found better ways of doing things and thankfully we had a team where criticism was generally well received. Clojure itself is an opinionated language and when you’re coding in a more idiomatic way the pieces of the language tend to fit more easily together. Idiomatic = more graceful/simplistic. Don’t say to a colleague: “Your code is shit”, do say: “There’s a more idiomatic way of doing this”. StackOverFlow and blog posts are full of examples of how to write more idiomatic code for particular use-cases and the Clojure Google Groups are good also.
In Java/.NET development we now have extremely powerful tools that help you to navigate your way around a large code-base – i.e. Eclipse/Intellij for Java. As the amount of class files inexorably grows it never really seems to matter and you just get used to it. (Here’s a controversial Recursivity blog post entitled “IDEs Are a Language Smell”).
In FP a single namespace will nearly always contain much more logic than compared to the average OO class. Since you’ll be spending more focused time in fewer files this then creates a need for namespaces to be presentable. Comments at the top can be helpful, they should have fewer public functions, and they should be split up if they get too large – we’ve occasionally split out the cruft from a namespace into an accompanying ns-utils.clj to make the main one clean. We’ve also reapplied various bits of DDD and OO to model namespaces around business domain concepts and to keep them well encapsulated.
Then there’s (:require) vs (:use). (:require) is much better as each dependency usage is clearly marked with a prefix so you can clearly see where dependencies are used. This is kind of obvious but in the early days we used (:use) in most places – without only – and now we’re having to go back over and correct. Note that we did play around with using lein-slamhound for optimising our namespace declarations but then we found that the kind of namespaces you typically want to use this on need restructuring anyway.
Macros, Protocols, Defrecords
Having fun with macros is a right of passage. Some people passionately detest them whilst others enjoy using complicated solutions to solve complicated problems. I’ve learned that if I’m going to build a macro then it helps to keep it minimal and to delegate out the logic into a separate function relatively quickly. We once had some special code eval’ing deftest forms to generate tests based on some data that we had saved up. The idea was that the auto-generated tests would then play nicely with lein-test and consequently our build server. The trouble is that arguments to macros have to be serialized and you’re limited in by this, not to mention that the code can become that much harder to follow. By looking under the hood at what the deftest macro actually did – basically registering test functions – we replaced this little mini-framework of macros and evals without much fuss and it gave better performance in return (we stopped reloading the same data twice, once to register the tests and once to run them). Macros are helpful and powerful but they come with a cost.
Protocols are like having an extremely awkward member of the team around; someone who can get stuff done with an air of awesomeness yet at the same time you’re wondering if there is just isn’t a simpler way. They have some quirks setting them up and it’s added complexity, but on the whole I have to say that Protocols have been good for us. We had some hairy areas of the codebase where we were doing concurrent operations passing around a lot of functions, partially built up or otherwise. Introducing Protocols allowed us to pass a family of related functions around together as one thing with some immutable state. In a different area of the code-base they also laid down some hard interface definitions. We didn’t strictly need them for this purpose but enforcing a little of compiler-checked OO felt like a good thing.
Defrecords were primarily introduced for performance reasons. Where we were creating lots of little map instances we switched to using simple Defrecords instead. I would also argue that they forced us to think a bit harder about our modeling of data and use of simple data structures, leading to cleaner code in some areas.
In my opinion figuring how to write Clojure code that is more idiomatic and simplistic is what makes Clojure fun, along with the fact that Clojure is blistering pragmatic. The learning never stops.