Clojure at a Real Estate Portal

April 7, 2015 § 12 Comments

About a year ago I was approached by a former employer to help him build a very large, greenfield system in Java. It was a new online real estate portal that also needed a separate estate agents portal, a robust and scalable data processing pipeline, an internal administrative app to be used by a support team, and a multitude of data import and data management tools. The project had an unmovable deadline of around ten months for release and would be given a high-profile launch in the UK. The website needed to be fast, robust and beautiful.

I initially turned the opportunity away as I no longer have any personal wish to code on large statically typed OO systems. This is an argument that will rage on for eons (static vs dynamic typing, OO vs functional), and I won’t cover it in detail again here. Suffice to say that having worked on both Java and Clojure projects with very smart people involved, I just think that Clojure offers a simpler approach to programming, primarily because of its grounding in opinionated immutability and persistent data collections. In the case of this project, with the tight timescales involved, I genuinely thought that Java would be the riskier choice, given the propensity for developers to spend large amounts of their time iterating over an object model, which may or may not reward them later down the line. I politely made the case for Clojure in my response and then I waited, not investing too much hope in what could happen.

My former employer came back and said that Clojure looked like a solid option. They had researched it and had become enthused about the potential benefits of functional programming and the pragmatism of Clojure. They had some deep concerns about being able to staff the project quickly, and this is where JUXT came in. We brought in five seasoned Clojure craftspeople on day one and a sixth soon after. Along with a solutions architect the client already had in place, we knuckled down and got to work. The client would successfully hire more Clojurians from the community as the project went on.

The architecture needed to be lean. The domain is more complex than you’d think, and so we needed to be able to make mistakes and recover. We went about building the system in an organic, bottom-up way, avoiding top-level abstractions until they were justified. We also didn’t go in for the microservices approach, deciding early on to try to keep our codebase in one piece for longer. Clojure code is more naturally decoupled in not having objects or static types to bind code together, and so there’s less of a need to split it up preemptively. Keeping code together for longer is a conservative approach, as you don’t have to make the reverse journey if the split goes wrong.

For our technologies we made use of Ring, Compojure, Bidi, and http-kit for our Clojure web-apps, mixing in Friend and Liberator where appropriate. There is some debate as to whether Liberator is a good fit for general web applications vs RESTful services, but otherwise the use of this stack was pretty seamless.

We used Elasticsearch as the data repository for the front-end. I personally see ES as an old ally now, having used it on two Clojure projects prior to this one. Performance-wise it’s extremely quick owing to the large in-memory cache it uses, and its search API is immensely powerful. Newer features such as ‘aggregations’ solve difficult problems for you. Because ES is JSON-based it works seamlessly with a Clojure stack, where the act of deserialising JSON via REST is trivial.

On the back-end we used tried and trusted Postgres. No complaints here: when it was midnight close to the release date and we needed to get a backlog of data through the system, knowing how to tweak an index to boost performance made us appreciate the well-beaten track. My colleague Martin Trojer did some sterling work around building up a migration library to manage Postgres (and other data stores), and he also laid down some test fixture code for blowing away data in your local db prior to test runs.

Not every technology choice we tried would bring us sunbeams. We tried Storm early on as a contender for managing our asynchronous data processing pipeline. Storm is a great bit of kit, but it seems better suited to manipulating more straightforward numerical data, rather than bouncing large documents around a topology. When you have to frequently hit various data stores and want to update some central state, then you’re kind of going against the Storm ideal of having a collection of lightweight processing units wired up as an atomic job. Also there’s no denying that Storm is non-trivial to set up in the cloud, and in general it’s heavy to get up and running from the REPL. Onyx is a newer kid on the block that ought to be worth investigating.

We ended up making heavy use of Clojure’s core.async to manage our data processing pipeline, using external message queues to share the work across distributed processes. I couldn’t be happier with that approach, as the result is a lightweight, easily testable collection of functions wired up in a declarative, straightforward manner.
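
A minimal sketch of this style, not the project's actual code: the step functions (`normalise-postcode`, `valid-listing?`), the listing keys and the `run-pipeline` helper are all hypothetical, and an in-process channel stands in for the external message queue.

```clojure
(ns example.pipeline
  (:require [clojure.core.async :as async]
            [clojure.string :as str]))

;; Hypothetical processing steps. They are plain functions,
;; so each one can be unit tested without any channels.
(defn normalise-postcode [listing]
  (update listing :postcode (comp str/upper-case str/trim)))

(defn valid-listing? [listing]
  (pos? (:price listing 0)))

;; The pipeline is declared as a composition of transducers.
(def process-listing
  (comp (map normalise-postcode)
        (filter valid-listing?)))

(defn run-pipeline
  "Pushes listings through process-listing on `parallelism` workers
  and returns the surviving listings as a vector (blocking)."
  [listings parallelism]
  (let [in  (async/to-chan listings) ; stand-in for an external queue consumer
        out (async/chan 16)]
    ;; pipeline closes `out` once `in` is drained.
    (async/pipeline parallelism out process-listing in)
    (async/<!! (async/into [] out))))
```

Calling `(run-pipeline listings 4)` with a collection of listing maps returns the normalised listings that pass the validity check. Because the steps are ordinary functions composed into one transducer, adding or removing a step is a one-line change.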

At one stage in the project JUXT conducted its own delivery assurance review as is standard practice for software consultancies. We brought in an external colleague who essentially reviewed our practices and choice of technologies. He pushed us to go leaner and simpler still, and as a result we collectively killed a few of our ‘darlings’ that were causing more of an overhead than they were worth. Not every decision we made paid immediate dividends, but what was more important was the shared team humility that would later lead to architectural agility.

Having a bunch of senior Clojure developers bring their own hard-won ideals to the table was undoubtedly a supremely joyful element of the project. For example, I hadn’t jumped into Prismatic Schema with both feet, fearing a lingering association with the kind of upfront modelling you see so often on Java/.NET projects. Now, though, I think Schema is great and it was extremely useful for us, giving us vital contractual guarantees in our data-processing pipeline and a model for data coercions. We get back the part of the cake from OO that we enjoyed, the declarative schema definitions, and in this case they can be landed in organically rather than being introduced from the get-go.
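
As a rough illustration of declarative schemas plus coercion, here is a small sketch using Prismatic Schema. The `Listing` schema and its keys are made up for the example and are not taken from the project.

```clojure
(ns example.listing-schema
  (:require [schema.core :as s]
            [schema.coerce :as coerce]
            [schema.utils :as su]))

;; Hypothetical shape of a listing as it moves through a pipeline.
(s/defschema Listing
  {:id       s/Str
   :price    s/Int
   :bedrooms s/Int})

;; A coercer that turns string input (e.g. from a CSV import or query
;; params) into the types the schema asks for, then validates the result.
(def coerce-listing
  (coerce/coercer Listing coerce/string-coercion-matcher))

(defn parse-listing
  "Returns the coerced listing, or nil if it cannot satisfy the schema."
  [raw]
  (let [result (coerce-listing raw)]
    (when-not (su/error? result)
      result)))
```

Here `(parse-listing {:id "abc" :price "250000" :bedrooms "3"})` yields a map with integer `:price` and `:bedrooms`, while input that cannot be coerced yields nil. The schema can start this small and grow as the domain becomes clearer.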

Then someone came along and showed me Swagger, giving us a beautiful web UI for free on top of a previously headless RESTful service integrating with Schema. We wanted to be lean and pragmatic to be delivery focused, but now we were also enjoying our technology.

It was a fun project, but because of the amount of fixed scope and a non-negotiable deadline (TV ads), it was the most challenging I’ve had so far. In particular, with the need to ramp up development capacity it was difficult to get everybody singing from the same hymn sheet. Faced with the blank canvas of a greenfield project and the introduction of new technologies, there is a tendency to ‘manage by democracy’, which can lead to the ‘big architectural debate’ anti-pattern, which if left unchecked can ultimately impede delivery. I was guilty of this at times, and looking to the future I like the approach of ‘managing as a meritocracy’.

Another challenge was ensuring a shared team ownership. Lisp programming is arguably an individualistic pursuit, intensified in our case by developers sprinting forward to build up the system ready for launch. The cohesiveness of the team was consequently stretched at times, and this is something to be addressed after the dust has truly settled and IT operations enter into long term strategic mode.

In the end we delivered the project on time and on budget, the technology standing up to the demands placed upon it. The business has a lean platform on which to go forward. I feel we made it because of the individuals involved, a fantastic client, but also because of Clojure.

Unlike on recent projects this didn’t feel like a regular ‘Clojure project’ to me, where I’ve largely seen Clojure brought in to disrupt and to enthuse existing development teams, usually involving some training and emotive selling of the benefits. On this project Clojure was the accepted choice from day one, and with the deadline in mind the emphasis was on getting the job done. In this sense Clojure didn’t feel like a new technology or a risky choice; it just felt like another mainstream language we use at work. That feeling of cutting-edge freshness only really surfaced when we occasionally got bitten by the immaturity of a library, or when the less frequent but enjoyable debates about idiomatic code usage patterns occurred.

We took Clojure for granted, but it’s still a reason we got over the line. People expect that you can get up and running faster with dynamic languages than with some of the mainstream languages such as Java and .NET, and I think this is true. But for me the real win with choosing an immutable functional language such as Clojure is around the 2/3 mark in a project. Just when a large traditional OO system starts to buckle under its own gravity and the business rules become harder to reach through the applied refactorings and introduction of design patterns, this is when a Clojure code-base will shine in comparison. With the business rules being freely accessible and ready to be worked with, that deadline seems just a little bit more achievable.

  • http://scholanoctis.com Oliver Godby

    Great article, Jon – thanks for sharing your experiences and congratulations on the on-time, on-budget status. I am off to look at a bunch of the things you talk about above, Swagger in particular sounds very interesting… 😉

  • Ryan

    I used Swagger in Scala before, which was quite promising. Not sure the Clojure Swagger is similar though.

  • shay

    Hi,
    You mentioned using Elasticsearch for the front end, but is it not a server tech? How would you use it in a ClojureScript SPA web app?

  • jonpither

    Hi Shay,

    By “front-end” I meant the entire web application (client + server) that generated the website. On this project we called this the “Front end” as there were various upstream, back-office data processing systems.

    For a ClojureScript SPA it should be trivial to call ES HTTP REST endpoints – I’ve called them often enough from the server-side, there should be no difference in ClojureScript (using cljs-http).

    • shay

      Great tx

  • bahadirio

    One of the greatest benefits of using Clojure is the ease of refactoring. In other languages refactoring gets messy most of the time, and the resulting tendency not to touch the code slows the project down. The core.async abstractions are a big winner as well, especially when the overall architecture gets more distributed.

  • http://leonid.shevtsov.me Leonid Shevtsov

    Hi Jon, did you use some kind of abstraction library over plain SQL for Postgres? How did you organise your database access layer? Thanks.

    • jonpither

      Hi Leonid, we used HoneySQL. No magic for a DB layer, just a db package with relevant namespaces in it pertaining to the different domain pieces.

      • jonpither

        PS – I like HoneySQL, I think having SQL as CLJ datastructures is a good thing, but this is being actively debated (see YesQL)

  • Drew

    Great post! It seems odd that you went with bidi and swagger, but left yada out in favor of liberator. Would you mind explaining what fueled that decision?

    • jonpither

      Hi Drew. The main thrust of development was in 2014, so Yada wasn’t available then. Personally, I think for websites Liberator isn’t really justified (and for places where it might be, I would use Yada, once it’s post-alpha).

You are currently reading Clojure at a Real Estate Portal at Pithering About.