Clojure at a Newspaper

February 24, 2014 § 3 Comments

Let’s be frank about it; the MailOnline isn’t to everyone’s taste. As the world’s biggest newspaper website it is a guilty pastime for many. It has some decent editorial content but it can also be distressingly shallow.

I worked there for a year. I was lured by the opportunity to rebuild the old website system in Clojure. Whilst some in my circle have been furthering mankind over at the Guardian, I’ve been working for alternative forces.

There is a thread of arrogance in my desire to join a big media company and petition the building of a new system. The developers I found there humbled me in being some of the most personable and most talented I’ve had the pleasure of working with. Not to mention diverse.

When I first joined this project Fred George had been extolling the virtues of Programmer Anarchy. I don’t want to get bogged down in discussing the particulars of this software methodology; suffice to say that it is not a software methodology. Programmer Anarchy was Fred’s way of putting developers more in charge where a command and control regime had previously existed. It was a means to shake up an establishment, and it has since successfully led to durable self-organising teams leveraging modern technologies.

I arrived as “anarchy” was rife and the continents had yet to form. The rewriting of the public facing newspaper website had yet to begin and so there was a clear opportunity to put a stake in the ground.

The reality was that I was fortunate. The incumbent CTO had already manifested a data pipeline where all the data needed for the website went from the old system straight into an ElasticSearch instance. He knew that the only way to begin slaying the old publishing/editing system was to free up the data. Future developers could now build a new Eden by exploiting this easily accessible NoSQL store.

Before Clojure (BC) the CTO had the idea that competing teams would trial a Node vs Ruby vs X solution to see which could best appropriate the data for building the next generation website. I was joined on my first day by Antonio Terreno and we soon started hacking on Clojure to see what we could do.

After a couple of weeks we presented a prototype of the website back to the wider team. During the presentation we demonstrated how easy it is to hit ElasticSearch from the REPL and to mash Clojure data structures against HTML templates to get the desired output. We showed how you could easily arrange sequences of articles using functions such as partition-by to get table layouts. With website content broken up into simple data structures, along with Clojure’s stupendously powerful sequence library, it all becomes very easy.
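
To give a flavour of the partition-by idea, here is a minimal sketch rather than the code we actually shipped. The article maps, the :channel and :headline keys, and the function names are all assumptions made up for illustration. It splits a sequence of articles into rows wherever the channel changes and renders each row as a table row.

```clojure
(require '[clojure.string :as str])

;; Hypothetical article data: the keys and values are illustrative only,
;; not the real document schema held in ElasticSearch.
(def articles
  [{:channel "news"  :headline "A"}
   {:channel "news"  :headline "B"}
   {:channel "sport" :headline "C"}
   {:channel "news"  :headline "D"}])

(defn rows-by-channel
  "Splits articles into runs of consecutive articles sharing a channel."
  [articles]
  (partition-by :channel articles))

(defn render-row
  "Renders one run of articles as a single table row."
  [row]
  (str "<tr>"
       (str/join (map #(str "<td>" (:headline %) "</td>") row))
       "</tr>"))

(defn render-table
  "Renders all articles as a table with one row per run."
  [articles]
  (str "<table>"
       (str/join (map render-row (rows-by-channel articles)))
       "</table>"))
```

Note that partition-by only groups neighbouring items, so the final "news" article above lands in a row of its own; if you wanted fixed-width rows instead, partition-all would be the function to reach for.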

After the presentation there wasn’t much appetite to build similar efforts in Ruby or Node*, and so Clojure was declared as the winner.

There was nothing onerous about what we had achieved. We used some functional programming to get the data into a form that could be easily rendered, and we were organised about how we pulled back data from ES so as to get good performance.

Yet fundamentally we hadn’t written much code. I raised this sheepishly with the CTO and his response was: “that’s how I know it’s the right solution”. Clojure is the winner here.

A year and 20k LOC later, the website is predominantly Clojure based. The home pages, main channel pages, and articles are served through Clojure, along with a mobile version of each. The old platform will linger on serving legacy parts of the website that need to be updated, but its days are numbered. Killing off the various dusty corners of the website requires business stakeholder input as well as bullish developers.

The bulk of the work was (and still is) around having a production-ready platform: monitoring, diagnostics, regression testing, metrics gathering, performance tuning and configuration management**. We have a variety of back-end services which interact with systems such as Facebook and Twitter, and we’ve significantly upgraded the data-pipe that carries data from the old system into the new. We also have a significant web application that proxies the main web application, manipulating the HTML using Enlive so that it can be fed into a CMS for editors to use.

All this is really only phase one. There is a pent-up demand from the business for new functionality and many separate sub-systems that need to be rewritten.

For me, the real win has been how the wider development team has embraced Clojure. By the end of my year there are devs hacking on elisp to make their development environments more personable. We have vim warriors doing their vim warrior thing. There have been numerous Clojure open source contributions and involvements on the London Clojure community scene. We’ve been able to spawn Clojurians in various countries outside the UK. We had a couple of presentations at the Berlin EuroClojure and I had a world-beating hangover afterwards for the plane ride home.

I’m sad to move on but I leave behind a strong team who have fully embraced Clojure and love the technology. My immediate future is working for JUXT full time (juxt.pro).

* Node.js is a dominant choice for various self-contained public facing web applications.
** I plan to do a couple of future posts about the technical particulars.

§ 3 Responses to Clojure at a Newspaper

  • A german says:

    Hi,

    I’m interested in how you interfaced with Elasticsearch from Clojure. Did you use Elastisch? How did it perform/scale?

    Cheers

  • Jon Pither says:

    Hi,

    We do use Elastisch. We use the native client which is ~5x faster than the RESTful one.

    Performance with ES is great. Since ES has a large memory cache it’s comparable to Redis. The main performance bottleneck with ES will nearly always be deserialising the data (i.e. keywordizing the map data).

    One thing we did which was “out there” was to serialise a subsection of data in a large document into a single binary blob field, and then we stored that one field directly in the index itself. Then when we fetched data back we could ask for just that field. This gave us a performance boost.

    We only used this approach for one specialised case but it shows you have options.

    Overall though ES is great.
