Clojure at a Real Estate Portal
April 7, 2015
About a year ago I was approached by a former employer to help him build a very large greenfield system in Java. It was a new online real estate portal that also needed a separate estate agents’ portal, a robust and scalable data processing pipeline, an internal administrative app to be used by a support team, and a multitude of data import and data management tools. The project had an immovable deadline of around ten months for release and would be given a high-profile launch in the UK. The website needed to be fast, robust and beautiful.
I initially turned the opportunity away, as I no longer want to work on large statically typed OO systems. This is an argument that will rage on for eons: static vs dynamic typing, OO vs functional, and I won’t cover it in detail again here. Suffice to say that having worked on both Java and Clojure projects with very smart people involved, I just think that Clojure offers a simpler approach to programming, primarily because of its opinionated grounding in immutability and persistent data structures. In the case of this project, with the tight timescales involved, I genuinely thought that Java would be the riskier choice, given the propensity for developers to spend large amounts of their time iterating over an object model, which may or may not reward them further down the line. I politely made the case for Clojure in my response and then I waited, not investing too much hope in what might happen.
My former employer came back and said that Clojure looked a solid option. They had researched it and had become enthused about the potential benefits of functional programming and the pragmatism of Clojure. They had some deep concerns about being able to staff the project quickly, and this is where JUXT came in. We landed five seasoned Clojure craftspeople on day one and a sixth soon after. Along with a solutions architect the client already had in place, we knuckled down and got to work. The client would successfully hire more Clojurians from the community as the project went on.
The architecture needed to be lean. The domain is more complex than you’d think, and so we needed to be able to make mistakes and recover. We went about building the system in an organic, bottom-up way, avoiding top-level abstractions until they were justified. We also didn’t go in for the microservices approach, deciding early on to keep our codebase in one piece for longer rather than factoring it out into separate services. Clojure code is more naturally decoupled in not having objects or static types to bind code together, and so there’s less of a need to split it up preemptively. Keeping code together for longer is a conservative approach, as you don’t have to make the reverse journey if the split goes wrong.
For our technologies we made use of Ring, Compojure, Bidi, and http-kit for our Clojure web-apps, mixing in Friend and Liberator where appropriate. There is some debate as to whether Liberator is a good fit for general web applications vs RESTful services, but otherwise the use of this stack was pretty seamless.
We used Elasticsearch as the data repository for the front-end. I personally see ES as an old ally now, having used it on two Clojure projects prior to this one. Performance-wise it’s extremely quick, owing to the large in-memory cache it uses, and its search API is immensely powerful. Newer features such as ‘aggregations’ solve difficult problems for you. Because ES is JSON-based it works seamlessly with a Clojure stack, where the act of deserialising JSON received over REST is trivial.
On the back-end we used tried and trusted Postgres. No complaints here. When it was midnight close to the release date and we needed to get a backlog of data through the system, knowing how to tweak an index to boost performance made us appreciate the well-beaten track. My colleague Martin Trojer did some sterling work building up a migration library to manage Postgres (and other data stores), and he also laid down some test fixture code for blowing away data in your local db prior to test runs.
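To give a flavour of the fixture idea, here is a minimal sketch using only clojure.test. It is not Martin’s code: the local-db atom is a hypothetical stand-in for a local Postgres database, and in a real project the reset step would truncate tables over a database connection instead.

```clojure
(ns fixtures.sketch
  (:require [clojure.test :refer [deftest is use-fixtures run-tests]]))

;; Hypothetical stand-in for a local database: an atom holding rows.
(def local-db (atom []))

(defn clean-db-fixture
  "Empties the store before each test, so no test sees leftover rows."
  [test-fn]
  (reset! local-db [])
  (test-fn))

;; :each runs the fixture around every deftest in this namespace.
(use-fixtures :each clean-db-fixture)

(deftest insert-starts-from-empty
  (swap! local-db conj {:id 1})
  (is (= 1 (count @local-db))))

(deftest other-test-also-starts-from-empty
  (is (empty? @local-db)))
```

Calling (run-tests 'fixtures.sketch) from the REPL runs both tests. Each one starts from an empty store whatever order they run in.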
Not every technology choice we tried would bring us sunbeams. We tried Storm early on as a contender for managing our asynchronous data processing pipeline. Storm is a great bit of kit, but it seems better suited to manipulating more straightforward numerical data than to bouncing large documents around a topology. When you have to frequently hit various data stores and want to update some central state, you’re kind of going against the Storm ideal of having a collection of lightweight processing units wired up as an atomic job. Also there’s no denying that Storm is non-trivial to set up in the cloud, and in general it’s heavy to get up and running from the REPL. Onyx is a newer kid on the block that ought to be worth investigating.
We ended up making heavy use of Clojure’s core.async to manage our data processing pipeline, using external message queues to share the work across distributed processes. I couldn’t be happier with that approach, as the result is a lightweight, easily testable collection of functions wired up in a declarative, straightforward manner.
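As an illustration of the style, here is a minimal sketch of functions wired together with core.async channels. It is not our production code: the listing fields, the two processing steps and the price threshold are all hypothetical, and it runs in a single process, whereas the real system used external message queues between processes. It assumes the org.clojure/core.async library is on the classpath.

```clojure
(ns pipeline.sketch
  (:require [clojure.core.async :as a]
            [clojure.string :as str]))

;; Hypothetical processing steps: plain functions from listing map to listing map.
(defn normalise-postcode [listing]
  (update listing :postcode #(-> % str/trim str/upper-case)))

(defn add-price-band [listing]
  ;; The 250000 threshold is made up for the example.
  (assoc listing :price-band (if (< (:price listing) 250000) :low :high)))

(defn run-pipeline
  "Pushes listings through both steps and returns the results as a vector."
  [listings]
  (let [in  (a/chan 10)
        mid (a/chan 10)
        out (a/chan 10)]
    ;; Each stage applies a transducer with up to 4 parallel workers.
    ;; pipeline preserves input order and closes its output channel
    ;; once its input channel is closed and drained.
    (a/pipeline 4 mid (map normalise-postcode) in)
    (a/pipeline 4 out (map add-price-band) mid)
    ;; Feed the input channel, then close it to signal the end of the work.
    (a/go
      (doseq [listing listings]
        (a/>! in listing))
      (a/close! in))
    ;; Block until every result has been collected.
    (a/<!! (a/into [] out))))
```

Because each step is an ordinary function, it can be unit tested without any channels, and the wiring in run-pipeline stays a short, declarative description of the flow.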
At one stage in the project JUXT conducted its own delivery assurance review as is standard practice for software consultancies. We brought in an external colleague who essentially reviewed our practices and choice of technologies. He pushed us to go leaner and simpler still, and as a result we collectively killed a few of our ‘darlings’ that were causing more of an overhead than they were worth. Not every decision we made paid immediate dividends, but what was more important was the shared team humility that would later lead to architectural agility.
Having a bunch of senior Clojure developers bring their own hard-won ideals to the table was undoubtedly a supremely joyful element of the project. For example, I hadn’t jumped into Prismatic Schema with both feet, fearing a lingering association with the kind of upfront modelling you see so often on Java/.NET projects. Now though I think Schema is great, and it was extremely useful for us, giving us vital contractual guarantees in our data-processing pipeline and a model for data coercions. We get back the part of the cake from OO that we enjoyed, the declarative schema definitions, and in this case they can be landed organically rather than being introduced from the get-go.
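For readers who haven’t seen Schema, here is a minimal sketch of validation and coercion. The Listing shape is hypothetical and far smaller than anything we used, and the example assumes the prismatic/schema library is on the classpath.

```clojure
(ns listing.schema-sketch
  (:require [schema.core :as s]
            [schema.coerce :as coerce]
            [schema.utils :as su]))

;; A hypothetical listing shape, declared as plain data.
(def Listing
  {:id       s/Str
   :price    s/Int
   :bedrooms s/Int
   (s/optional-key :description) s/Str})

(defn valid-listing?
  "s/check returns nil when the data conforms, and a description of the
   problems otherwise. (s/validate would throw instead.)"
  [m]
  (nil? (s/check Listing m)))

;; A coercer built from the same schema. string-coercion-matcher turns
;; string values into numbers where the schema asks for s/Int.
(def parse-listing
  (coerce/coercer Listing coerce/string-coercion-matcher))

(defn parse-or-nil
  "Returns the coerced listing, or nil if it still doesn't match the schema."
  [m]
  (let [result (parse-listing m)]
    (when-not (su/error? result)
      result)))
```

Given {:id "abc" :price "250000" :bedrooms "3"}, as might arrive from a form or a CSV import, parse-or-nil hands back a map in which :price and :bedrooms are integers, while a map with a missing :id comes back as nil. The same schema definition serves as both the contract and the coercion model.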
Then someone came along and showed me Swagger, which integrates with Schema and gave us a beautiful web UI for free on top of a previously headless RESTful service. We wanted to be lean, pragmatic and delivery-focused, but now we were also enjoying our technology.
It was a fun project, but because of the amount of fixed scope and a non-negotiable deadline (TV ads), it was the most challenging I’ve had so far. In particular, with the need to ramp up development capacity, it was difficult to get everybody singing from the same hymn sheet. Faced with the blank-canvas nature of a greenfield project and the introduction of new technologies, there is a tendency to ‘manage by democracy’, which can lead to the ‘big architectural debate’ anti-pattern, which if left unchecked can ultimately impede delivery. I was guilty of this at times, and looking to the future I like the approach of ‘managing as a meritocracy’.
Another challenge was ensuring shared team ownership. Lisp programming is arguably an individualistic pursuit, intensified in our case by developers sprinting forward to build up the system ready for launch. The cohesiveness of the team was consequently stretched at times, and this is something to be addressed after the dust has truly settled and IT operations enter a long-term strategic mode.
In the end we delivered the project on time and on budget, the technology standing up to the demands placed upon it. The business has a lean platform on which to go forward. I feel we made it because of the individuals involved, a fantastic client, but also because of Clojure.
Unlike on recent projects this didn’t feel like a regular ‘Clojure project’ to me, where I’ve largely seen Clojure brought in to disrupt and to enthuse existing development teams, usually involving some training and emotive selling of the benefits. On this project Clojure was the accepted choice from day one, and with the deadline in mind the emphasis was on getting the job done. In this sense Clojure didn’t feel like a new technology or a risky choice; it just felt like another mainstream language we use at work. That feeling of cutting-edge freshness only really surfaced when we occasionally got bitten by the immaturity of a library, or during the less frequent but enjoyable debates about idiomatic code usage patterns.
We took Clojure for granted, but it’s still a reason we got over the line. People expect that you can get up and running faster with dynamic languages than with some of the mainstream languages such as Java and .NET, and I think this is true. But for me the real win with choosing an immutable functional language such as Clojure comes around the 2/3 mark in a project. Just when a large traditional OO system starts to buckle under its own gravity, and the business rules become harder to reach through the applied refactorings and introduced design patterns, this is when a Clojure code-base will shine in comparison. With the business rules being freely accessible and ready to be worked with, that deadline seems just a little bit more achievable.