Clojure at a Bank – Freeing the Rules

November 29, 2012

I’ve written previous posts about a team at an investment bank making the switch from Java to Clojure. In this post I’d like to focus on the business rules being moved in the process.

Rules/Memes

I’ve found that business rules at large institutions tend to resemble viral memes and genes. This is to say that they are regularly duplicated amongst systems, manual or automated, and most will persist long after the people maintaining them move on. The hardy ones manage to jump from dying systems into new ones.

In our case the rules are migrating from a large monolithic Java stack into a new Clojure code base. We wanted to give them a reformed existence that would be more streamlined, free of elaborately crafted OO structures, and where they would not be hemmed in by overly enthusiastic and rigidly defined tests.

Tags rather than Types

One of the first things we did differently was to use tags rather than types for modeling the business domain. In the old world a financial product would immediately be given its own Java class, and its similarities with other products would be expressed through inheritance and interfaces, perhaps with some mixin-styled object composition thrown in. This approach has failed our system. Sure, with the right people, time and foresight, most systems in most languages can be beautiful, but in my opinion the general approach of modeling the problem with static types opens the door wide to epic waste and codebases with limited options.

For our Clojure codebase we instead simply treated products and trades as maps of immutable data and we built up a corresponding set of keywords – ‘tags’ – to describe what we know about them.

Imagine that we’re processing orders for cups of coffee. Without breaking the problem domain down to the nth degree, we can start by introspecting and tagging whatever data comes our way with business-significant terms such as :hot, :milky, and :has-froth. Predicates to drive business rules become trivial to write, e.g. (when (superset? tags #{:sprinkles :caffeine :large}) (do-something)), with superset? coming from clojure.set. The code matching up data to tags then becomes an all-important kernel of the system.
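A minimal sketch of the tagging idea, assuming a made-up order map and made-up predicates for each tag (none of these names come from the real system):

```clojure
(require '[clojure.set :as set])

;; Hypothetical tagging rules: each tag is paired with a predicate over the order map.
(def tag-rules
  {:hot       #(= :hot (:temperature %))
   :milky     #(contains? #{:whole :skimmed :soy} (:milk %))
   :has-froth #(boolean (:froth? %))
   :large     #(= :large (:size %))})

(defn tags-for
  "Returns the set of tags whose predicate holds for the given order."
  [order]
  (set (for [[tag pred] tag-rules
             :when (pred order)]
         tag)))

(defn matches?
  "True when the order's tags include every required tag."
  [tags required]
  (set/superset? tags required))

;; Illustrative input data only.
(def order {:temperature :hot :milk :whole :froth? true :size :small})

(tags-for order)                            ;; => #{:hot :milky :has-froth}
(matches? (tags-for order) #{:hot :milky})  ;; => true
(matches? (tags-for order) #{:hot :large})  ;; => false
```

Keeping the tag predicates in one map means the "kernel" that turns data into tags stays in a single, inspectable place.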

CSS styled JSON building

We used tags for a range of things, one of the prime cases being to build up JSON for sending to a downstream system. We used tags to determine a) JSON schemas and b) JSON values.

Taking values first, imagine that we’ve built up a JSON structure but with the values omitted, e.g.:

{:beverage {:type _ :temperature _ }}

A simple DSL for defining values may then look like:

(values/add :type #{:cappucino} "Coffee")

and:

(values/add :temperature #{:cappucino} "Hot")

Here the rules look a lot like CSS. We use path selectors such as :type and :temperature, and instead of matching on HTML class attributes we simply match on tags: #{:cappucino}.
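One way such a values DSL could be backed is sketched below. It assumes an in-memory registry held in an atom and a first-match lookup; the real values/add may well work differently.

```clojure
;; Hypothetical registry: a vector of {:path :tags :value} maps held in an atom.
(def rules (atom []))

(defn add
  "Registers a value rule: `path` selects the JSON key, and every tag in
  `tags` must be present on the data for the rule to apply."
  [path tags value]
  (swap! rules conj {:path path :tags tags :value value}))

(add :type        #{:cappucino}   "Coffee")
(add :temperature #{:cappucino}   "Hot")
(add :temperature #{:frappuccino} "Iced")

(defn value-for
  "Returns the value of the first rule for `path` whose tags are all
  contained in `tags`, or nil when no rule matches."
  [path tags]
  (->> @rules
       (filter #(and (= path (:path %))
                     (every? tags (:tags %))))
       first
       :value))

(value-for :temperature #{:cappucino :large})  ;; => "Hot"
(value-for :type #{:frappuccino})              ;; => nil
```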

The rules for building up the schema could work in a similar way. A :frappuccino may need a different outgoing data structure than a :cappucino, perhaps to do with some inherent complexity of iced drinks.

What’s covered here is fairly trivial stuff but it’s a starting point for more. You could extend this DSL to allow multiple value-rules to be defined at once, to add predicates, and to make the values Clojure functions that work off the input data. You do of course need some core boilerplate code for mashing the JSON schema and value rules together, but this should be simple to write.
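The "mashing together" step could look roughly like this. It is only a sketch: the schema placeholder is assumed to be the symbol _, the rules are a plain vector, and a rule value may be either a constant or a function of the input data.

```clojure
;; Skeleton with values omitted; the quoted symbol _ stands in for "to be filled".
(def schema {:beverage {:type '_ :temperature '_}})

;; Hypothetical rules as plain data. A value can be a constant or a function.
(def rules
  [{:path :type        :tags #{:cappucino} :value "Coffee"}
   {:path :temperature :tags #{:cappucino}
    :value (fn [input] (if (:iced? input) "Cold" "Hot"))}])

(defn resolve-value
  "Finds the first rule matching `path` and `tags`; function values are
  called with the input data, constants are returned as they are."
  [path tags input]
  (let [{:keys [value]} (first (filter #(and (= path (:path %))
                                             (every? tags (:tags %)))
                                       rules))]
    (if (fn? value) (value input) value)))

(defn fill
  "Walks the schema, recursing into nested maps and resolving each leaf
  by its key."
  [schema tags input]
  (into {}
        (for [[k v] schema]
          [k (if (map? v)
               (fill v tags input)
               (resolve-value k tags input))])))

(fill schema #{:cappucino} {:iced? false})
;; => {:beverage {:type "Coffee", :temperature "Hot"}}
```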

Like CSS there are some pros and cons. The pros are that adding new rules becomes trivial and that the rules themselves are inherently reusable. The con is having to manage a large set of flattened rules.

Rules as Data

On our project we’ve got a fight on our hands in that the number of business rules is large and they vary widely. Instead of spreading the rules out, we now have a few thousand rules (and growing) living in a concentrated handful of Clojure namespaces. At first glance this looks like poor design. This is fugly and bad, right? What about storing the rules in a DB or reapplying a fair bit of OO-styled modeling?

First, because all Clojure code is data, we absolutely do have a rule DB, and not just a pile of namespace code editable in Emacs. By thinking of the rules entirely as data we’ve opened options for ourselves.

For example we’ve built a UI that allows non-technical users to directly browse the rules and to walk the hierarchies between them. Users can play around with input data and tags to see what rules are used for a given context. They can view information about why a certain rule is selected above another, the CSS-like specificity behaviour. We also show the source code, namespace and line number behind each rule. A team member recently added the ability for devs to click on an HTML link that opens up Emacs at exactly the point where the rule-code is.
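To illustrate the specificity idea, here is a small sketch that assumes specificity is simply the number of tags a rule demands. The post does not spell out the real ordering, so treat this as one plausible reading, with invented rule ids.

```clojure
;; Hypothetical rules; a rule with more required tags is treated as more specific.
(def rules
  [{:id :default-temp :path :temperature :tags #{}               :value "Warm"}
   {:id :coffee-temp  :path :temperature :tags #{:coffee}        :value "Hot"}
   {:id :iced-coffee  :path :temperature :tags #{:coffee :iced}  :value "Cold"}])

(defn candidates
  "All rules for `path` that match `tags`, most specific first."
  [path tags]
  (->> rules
       (filter #(and (= path (:path %)) (every? tags (:tags %))))
       (sort-by #(count (:tags %)) >)))

(defn explain
  "Reports which rule wins and which matching rules it overrides."
  [path tags]
  (let [[winner & losers] (candidates path tags)]
    {:selected  (:id winner)
     :overrides (mapv :id losers)}))

(explain :temperature #{:coffee :iced :large})
;; => {:selected :iced-coffee, :overrides [:coffee-temp :default-temp]}
```

Because the rules are plain data, an explanation like this can be handed straight to a UI.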

The rules form an audit trail of how payloads were crafted. Since we also use them for data reconciliation purposes, users can now write comments on data mismatches directly in the context of a rule.

REPL, Builds and Tests

The tags and rules-as-data approach has allowed us to build up a set of tools for experimenting in the REPL. For example it’s common to query an ElasticSearch instance from the REPL to bring back a certain population of test-data and then to throw it at the rule engine to see what happens. We’ve got build-server agents doing a similar thing to alert us when changes in the rules bring about unexpected consequences.

As a more immediate line of defense we’ve got unit tests utilising core.logic to make sure that the rules are sane and that repetition and redundancy are kept to a minimum if not prohibited. We’ve added to the UI the ability to highlight sets of rules that are too close to each other, where a revision of tags might be applied to clean things up.
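The sanity checks themselves can be quite small once rules are data. The sketch below uses plain Clojure sequence functions rather than core.logic, and an invented rule shape, so it only illustrates the kind of check meant: exact duplicates, and pairs of rules whose tag sets differ by a single tag.

```clojure
(require '[clojure.set :as set])

;; Hypothetical rules, reduced to the fields the checks need.
(def rules
  [{:id 1 :path :type :tags #{:coffee :hot}}
   {:id 2 :path :type :tags #{:coffee :hot}}
   {:id 3 :path :type :tags #{:coffee :hot :large}}
   {:id 4 :path :size :tags #{:large}}])

(defn duplicates
  "Groups of rule ids that share exactly the same path and tag set."
  [rules]
  (->> (group-by (juxt :path :tags) rules)
       vals
       (filter #(> (count %) 1))
       (map #(mapv :id %))))

(defn too-close
  "Pairs of rule ids on the same path whose tag sets differ by exactly one tag."
  [rules]
  (for [a rules, b rules
        :let [ta (:tags a), tb (:tags b)]
        :when (and (< (:id a) (:id b))
                   (= (:path a) (:path b))
                   (= 1 (count (set/difference (set/union ta tb)
                                               (set/intersection ta tb)))))]
    [(:id a) (:id b)]))

(duplicates rules)  ;; => ([1 2])
(too-close rules)   ;; => ([1 3] [2 3])
```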

And if one day we decided that our rules needed to be stored in a completely different way, then we could always write some more Clojure code to read them back in and to spit them out again as something different. The door to moving them into a graph DB, RDF, or into a fact based DB like Datomic is not closed.

Wrap up

We’ve learnt that when you’re working with rules as data, many more possibilities open up as to what you can do with them. This is quite the opposite of what you typically see at large institutions, where the rules are tucked away like diamonds in the earth inside type-heavy, OO-modeled systems. For us Clojure has been a great emancipator of business rules.

Clojure at a Bank – Clojure Code Immaturity

November 4, 2012

I’ve posted recently about a team at an investment bank wanting to make the transition from Java to Clojure. Here I want to write about some of the issues around our Clojure code being ‘immature’. Before I do, though, it’s only fair I state up front that not all of our early code was terrible; Clojure is indeed a pragmatic language where you can write decent and understandable code relatively easily. Still...

Comments and LOC

Most of the devs on our team have a TDD/DDD/BDD background with half having once plied a trade as XP consultants. Our approach to writing beautiful Java code was to make it flow and to tell a story. Expressive names for classes, methods and variables, each chosen to convey clarity and meaning to the fortunate reader.

Therefore when we jumped with both feet into Clojure we unconsciously brought with us the belief that comments just weren’t needed. Add to the mix that we gave our args the shortest possible names (most of the time just single characters) and one could argue that we purposefully went about trying to obfuscate what we were writing.

Then to make our code more fugly, we committed the common newbie sin of not really knowing what’s in clojure.core but churning out bucket-loads of FP code anyway. For example we made a brave early attempt to get around assoc-in not being able to work with nested vectors as well as maps, when it actually could: (assoc-in m [:key1 0 :bar] v). This led to some funky code existing in functions with interesting-sounding names like ‘weave-in-vectors’, the choice of naming being a sure-fire smell that there must be a more idiomatic way of doing things. Then there’s the little stuff: ‘if’ and ‘let’ vs ‘if-let’, ‘let’ as its own form rather than embedded into a ‘for’ as ‘:let’. Then there’s zipmap, juxt, mapcat, group-by… a considerable list of helper functions that save us from having to write our own cruft.

I also have to own up to having a personal fetish for a low LOC, wanting it to compare ever so favourably to the old Java stack that preceded it. The cost was that some people wondered wtf some of my code did, but at least there were few lines of it. There’s got to be some prize for that, right?
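For reference, here are the clojure.core forms mentioned above in action. The data is made up purely for illustration.

```clojure
;; assoc-in walks vectors by index as well as maps by key.
(assoc-in {:key1 [{:bar 1}]} [:key1 0 :bar] 2)
;; => {:key1 [{:bar 2}]}

;; if-let instead of a let wrapped around an if.
(defn greeting [m]
  (if-let [n (:name m)]
    (str "Hello " n)
    "Hello stranger"))

(greeting {:name "Ada"})  ;; => "Hello Ada"
(greeting {})             ;; => "Hello stranger"

;; :let (and :when) embedded directly in a for.
(for [x [1 2 3]
      :let [sq (* x x)]
      :when (odd? sq)]
  sq)
;; => (1 9)

(zipmap [:a :b] [1 2])
;; => {:a 1, :b 2}

((juxt :type :size) {:type :latte :size :large})
;; => [:latte :large]

(mapcat :tags [{:tags [:hot]} {:tags [:milky :large]}])
;; => (:hot :milky :large)

(group-by :size [{:id 1 :size :large} {:id 2 :size :small} {:id 3 :size :large}])
;; => {:large [{:id 1, :size :large} {:id 3, :size :large}], :small [{:id 2, :size :small}]}
```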

We matured past these issues by communicating amongst ourselves as we found better ways of doing things, and thankfully we had a team where criticism was generally well received. Clojure itself is an opinionated language and when you’re coding in a more idiomatic way the pieces of the language tend to fit more easily together. Idiomatic = more graceful/simple. Don’t say to a colleague: “Your code is shit”, do say: “There’s a more idiomatic way of doing this”. Stack Overflow and blog posts are full of examples of how to write more idiomatic code for particular use-cases, and the Clojure Google Group is good also.

Namespaces

In Java/.NET development we now have extremely powerful tools that help you to navigate your way around a large code-base, e.g. Eclipse/IntelliJ for Java. As the number of class files inexorably grows it never really seems to matter and you just get used to it. (Here’s a controversial Recursivity blog post entitled “IDEs Are a Language Smell”).

In FP a single namespace will nearly always contain much more logic than the average OO class. Since you’ll be spending more focused time in fewer files, this creates a need for namespaces to be presentable. Comments at the top can be helpful, they should have fewer public functions, and they should be split up if they get too large; we’ve occasionally split out the cruft from a namespace into an accompanying ns-utils.clj to make the main one clean. We’ve also reapplied various bits of DDD and OO to model namespaces around business domain concepts and to keep them well encapsulated.

Then there’s (:require) vs (:use). (:require) is much better, as each usage of a dependency is marked with a prefix so you can see exactly where dependencies are used. This is kind of obvious, but in the early days we used (:use) in most places (without :only) and now we’re having to go back over and correct. Note that we did play around with using lein-slamhound for optimising our namespace declarations, but then we found that the kind of namespaces you typically want to use this on need restructuring anyway.
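As a small illustration, here is a namespace declared with aliased requires. The namespace name coffee.orders and its functions are invented for the example.

```clojure
;; With (:use clojure.set clojure.string) every public var from both
;; namespaces would be pulled in unprefixed, hiding where each one comes from.
(ns coffee.orders
  (:require [clojure.set :as set]
            [clojure.string :as str]))

(defn describe
  "Joins tag names into a readable, sorted, comma-separated string."
  [tags]
  (str/join ", " (map name (sort tags))))

(defn large-hot?
  "True when the tags include both :large and :hot."
  [tags]
  (set/superset? tags #{:large :hot}))

(describe #{:hot :large})           ;; => "hot, large"
(large-hot? #{:hot :large :milky})  ;; => true
```

Every call site now says which namespace it depends on: str/join and set/superset? are unambiguous at a glance.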

Macros, Protocols, Defrecords

Having fun with macros is a rite of passage. Some people passionately detest them whilst others enjoy using complicated solutions to solve complicated problems. I’ve learned that if I’m going to build a macro then it helps to keep it minimal and to delegate out the logic into a separate function relatively quickly. We once had some special code eval’ing deftest forms to generate tests based on some data that we had saved up. The idea was that the auto-generated tests would then play nicely with lein-test and consequently our build server. The trouble is that arguments to macros have to be serialized and you’re limited by this, not to mention that the code can become that much harder to follow. By looking under the hood at what the deftest macro actually did (basically registering test functions) we replaced this little mini-framework of macros and evals without much fuss and it gave better performance in return (we stopped reloading the same data twice, once to register the tests and once to run them). Macros are helpful and powerful but they come with a cost.
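The "keep the macro minimal and delegate to a function" advice can be sketched like this. The timed and timed* names are hypothetical and have nothing to do with the test-generation code described above.

```clojure
(defn timed*
  "Plain function doing the real work: calls the thunk `f` and returns its
  result together with the elapsed time in milliseconds."
  [label f]
  (let [start  (System/nanoTime)
        result (f)]
    {:label      label
     :result     result
     :elapsed-ms (/ (- (System/nanoTime) start) 1e6)}))

(defmacro timed
  "Thin syntactic wrapper: it only wraps the body in a thunk and hands it
  to timed*, so all the logic stays testable as an ordinary function."
  [label & body]
  `(timed* ~label (fn [] ~@body)))

(:result (timed "sum" (reduce + (range 10))))  ;; => 45
```

Since timed* is an ordinary function, it can be called, composed and tested directly without going through the macro at all.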

Protocols are like having an extremely awkward member of the team around; someone who can get stuff done with an air of awesomeness, yet at the same time you’re wondering if there just isn’t a simpler way. They have some quirks setting them up and it’s added complexity, but on the whole I have to say that Protocols have been good for us. We had some hairy areas of the codebase where we were doing concurrent operations passing around a lot of functions, partially built up or otherwise. Introducing Protocols allowed us to pass a family of related functions around together as one thing with some immutable state. In a different area of the code-base they also laid down some hard interface definitions. We didn’t strictly need them for this purpose but enforcing a little compiler-checked OO felt like a good thing.
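A sketch of what "a family of related functions passed around as one thing with some immutable state" can look like. The Pricer protocol and FlatPricer record are invented for illustration.

```clojure
;; Hypothetical protocol: two related operations that travel together.
(defprotocol Pricer
  (price [this order])
  (summary [this order]))

;; The record's fields are the immutable state shared by both functions.
(defrecord FlatPricer [base surcharge]
  Pricer
  (price [_ order]
    (+ base (if (= :large (:size order)) surcharge 0)))
  (summary [this order]
    (str (name (:type order)) " @ " (price this order))))

(defn price-all
  "Callers receive a single pricer value instead of several loose functions."
  [pricer orders]
  (mapv #(price pricer %) orders))

(def pricer (->FlatPricer 3 1))

(price-all pricer [{:type :latte :size :large}
                   {:type :latte :size :small}])
;; => [4 3]
(summary pricer {:type :latte :size :large})
;; => "latte @ 4"
```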

Defrecords were primarily introduced for performance reasons. Where we were creating lots of little map instances we switched to using simple Defrecords instead. I would also argue that they forced us to think a bit harder about our modeling of data and use of simple data structures, leading to cleaner code in some areas.
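For readers who haven’t used them, a record still behaves like a map for lookups and updates while giving the data a named shape. The Trade record below is a made-up example.

```clojure
;; Hypothetical record with a fixed set of named fields.
(defrecord Trade [id product quantity])

(def t (->Trade 1 :swap 100))

(:quantity t)
;; => 100

;; assoc returns a new Trade; the original t is unchanged.
(:quantity (assoc t :quantity 250))
;; => 250

;; Records can be built from maps, and compare equal by type and fields.
(= t (map->Trade {:id 1 :product :swap :quantity 100}))
;; => true

;; A record is not equal to a plain map with the same entries.
(= t {:id 1 :product :swap :quantity 100})
;; => false
```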

Wrap Up

In my opinion figuring out how to write Clojure code that is more idiomatic and simple is what makes Clojure fun, along with the fact that Clojure is blisteringly pragmatic. The learning never stops.

Where Am I?

You are currently viewing the archives for November, 2012 at Pithering About.