Introduction

This post outlines our journey into the world of Elixir over the last 12 months for the Limpid Markets platform.

Background

Limpid Markets is a real-time web application used to trade different precious metal products. It currently supports three products: two are derivatives (swaps and EFP) and the third is for physical precious metal. The core of the application is an orderbook where traders leave their prices and view other traders' prices. Each product has its own orderbook and its own organisations (banks) and users (traders).

Here are some key points about the orderbook:

  • Each orderbook represents a different product with different rules
  • Each user of the application currently has permission to access between 1 and 4 products
  • Users belong to organisations (banks, trading houses etc) and can only enter into negotiation with traders from other organisations
  • Traders must not be able to identify who left prices on the orderbook, beyond their own organisation's prices

Which means that:

  • Each browser client has a different version of the data
  • The data for each product must be processed so that orders belonging to other organisations are anonymised (see the sketch after this list)
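
To illustrate the second point, here is a minimal sketch of what that per-organisation anonymisation might look like, written in Elixir (the language this post eventually lands on). The order shape and field names are assumptions for the example, not our actual schema.

     # A minimal per-organisation anonymisation sketch. The order shape
     # and field names are illustrative assumptions, not our real schema.
     defmodule Orderbook.Anonymiser do
       @doc """
       Builds the orderbook view for one organisation: orders from other
       organisations keep their price data but lose their identity.
       """
       def view_for(orders, org_id) do
         Enum.map(orders, fn
           # The viewer's own organisation's orders pass through untouched.
           %{organisation: ^org_id} = order -> order
           # Everyone else's orders are stripped of identifying fields.
           order -> Map.merge(order, %{organisation: :anonymous, trader: :anonymous})
         end)
       end
     end

Every organisation needs its own call to view_for/2, and that per-organisation fan-out is what the rest of this post is about.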

Original solution

The method for generating the product data was to iterate over each organisation in the system and process three sets of data, which together formed a new version of the orderbook for that organisation. This was done each time the state of the orderbook changed, for instance, when any trader added or amended a price. With only a few organisations this is not an especially intensive operation, and when we set out to build the initial version in 2015 the number of organisations was fairly small (circa 40).

Each time a new version of the data was created, it would be associated with an incrementing sequence number and saved in redis. Updates would then be emitted to clients connected to a socket.io server on a namespace reserved for their organisation.

Growing pains

The original solution was a great success; we built the platform from the ground up and scaled it from one to four products, capturing 75% of all interbank market participants, including some of the biggest banks in the world. But four years down the line the application has become very different to what it was when we first set out. We had navigated those changes without any significant change to the architecture or development stack, but eventually we started noticing the following growing pains:

  • Because we were processing sequentially (on a single core), a delay between the first and last emit started to occur
  • Demand was growing for features that needed user-level customisation, but we were limited because the data snapshots sent to the client were always at the organisation level
  • Even basic user-level customisation meant forcing the client to do unnecessary processing on the data during rendering, which made server-side rendering the application very difficult
  • Our application runs on multiple nodes, so we had to deal with redis deadlocking and potential race conditions
  • We had to use redis as a pub-sub for socket.io because we needed to emit updates from every node. This added cost and created another dependency on a third-party service we weren't able to control
  • For each update we made extra HTTP requests to a separate backend service to fetch all of the organisations in our system before looping through them, which created a performance bottleneck

After sitting down with the project stakeholders in the summer of 2017 to review the future vision for Limpid Markets, it was time to make some important decisions about how the software could meet the business's needs moving forward.

Time for change

With more products being planned (and the additional organisations and traders that come with them), we decided it was time to revisit the orderbook, as we knew it would become a bottleneck for the future needs of the business.

We had a few options at this stage. We knew that some of the performance problems we were hitting could be solved by utilising Node.js clustering, and with the benefit of hindsight we were sure we would be able to design a much better overall solution.

However, having recently completed a successful hack week playing around with Elixir and Phoenix we decided to consider it as an alternative solution. We wrote a small prototype in Elixir to get a feeling for how the application could work and found that:

  • The built-in concurrency and distribution features of the language felt much more natural than trying to leverage a node cluster
  • The fault-tolerant nature of OTP would prevent us from taking down our entire system in the event of a bug
  • The immutable nature of the language lent itself well to the problem space of our sequenced data structure
  • Phoenix channels could greatly simplify our websocket strategy and remove the need for redis as a pub-sub (a minimal sketch follows this list)
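
As a rough illustration of that last point, the sketch below shows the shape a Phoenix channel might take. The module names and the topic layout are our own inventions for the example, not the production code.

     # A minimal Phoenix channel sketch. The "orderbook:" <> org_id topic
     # layout and the module names are assumptions for illustration.
     defmodule LimpidWeb.OrderbookChannel do
       use Phoenix.Channel

       # Each organisation joins its own topic, mirroring the old
       # socket.io namespace-per-organisation setup. A real implementation
       # would authorise the join against the connecting user's organisation.
       def join("orderbook:" <> org_id, _params, socket) do
         {:ok, assign(socket, :org_id, org_id)}
       end
     end

     # Any node in the cluster can then push an update without redis:
     #
     #     LimpidWeb.Endpoint.broadcast("orderbook:bank_a", "update", payload)
     #
     # Phoenix PubSub takes care of distributing the message across nodes.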

A new direction

The initial prototyping gave us some confidence that Elixir was not only a good fit for the future business needs but also that there were some substantial gains to be made over the industry alternatives such as Node.js. So we pursued the Elixir route further, determining three key aspects of the new direction which would require further exploratory work.

OTP

OTP is a collection of libraries and tools that ships with Erlang (and therefore Elixir), and is perhaps the biggest departure from our old architecture.

We designed what is known as a Supervision Tree, a tree of worker and supervisor processes, which allows us to create a concurrent fault-tolerant application through message passing.

With Limpid Markets we have ended up modelling each product as a supervisor, and each organisation processor as a worker. When an update to a product happens, messages with the raw data are passed to each worker, which can handle all of the processing. These workers can be distributed seamlessly across many physical nodes in the cloud, with very little overhead.
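
To make that concrete, here is a stripped-down sketch of that shape. The module names, message format and the way the organisation list arrives are assumptions for the example rather than our production code; the worker reuses the anonymisation sketch from earlier.

     # A stripped-down sketch of the supervision tree described above.
     # Module names and the message format are illustrative assumptions.
     defmodule Orderbook.ProductSupervisor do
       use Supervisor

       # Started with the product name and the organisations trading it,
       # e.g. Orderbook.ProductSupervisor.start_link({:swaps, org_ids}).
       def start_link({product, _org_ids} = arg) do
         Supervisor.start_link(__MODULE__, arg, name: :"#{product}_supervisor")
       end

       @impl true
       def init({product, org_ids}) do
         # One worker per organisation; :one_for_one means a crash in one
         # organisation's processor restarts only that processor.
         children =
           for org_id <- org_ids do
             Supervisor.child_spec({Orderbook.OrgProcessor, {product, org_id}},
               id: {:org_processor, org_id}
             )
           end

         Supervisor.init(children, strategy: :one_for_one)
       end
     end

     defmodule Orderbook.OrgProcessor do
       use GenServer

       def start_link({_product, _org_id} = arg) do
         GenServer.start_link(__MODULE__, arg)
       end

       @impl true
       def init(state), do: {:ok, state}

       # Raw orderbook updates arrive as messages; each organisation's
       # view is computed concurrently in its own process.
       @impl true
       def handle_cast({:orderbook_update, raw_orders}, {_product, org_id} = state) do
         _view = Orderbook.Anonymiser.view_for(raw_orders, org_id)
         {:noreply, state}
       end
     end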

This is far better than our previous solution because we’ve:

  • Built a fault-tolerant solution. If we have an error processing data for one organisation, the rest of the organisations continue running as expected and the process which crashed is automatically restarted
  • Built a concurrent solution: data is processed simultaneously for each organisation

[Figure: the individual processes that form the supervision tree of the orderbook application]

mnesia

In our first solution we used redis as a store for organisation data structures: each time we processed the data, we stored the resulting raw JSON for each organisation. This had various drawbacks:

  • The data had to be serialized and deserialized regularly, which has a performance penalty and also created additional complexity in the application
  • We had to rely on an additional external service (redis) for every update to our views, which carried its own performance penalty and impacted our ability to deliver the service to a strict service level agreement
  • Getting locking to work as expected in order to prevent race conditions involved a lot of work and complexity

By using Elixir we get to leverage the powerful Erlang ecosystem. mnesia is a distributed DBMS baked right into the language: a layer built on top of ETS and DETS (the term storage built into Erlang) that adds powerful features such as replication and synchronisation.

mnesia is a massive improvement upon redis because:

  • We can run synchronised transactions across multiple nodes, complete with locking
  • The data is stored in RAM on the nodes running the application, making it incredibly fast and removing the need for an external service
  • We can store native Elixir/Erlang data structures instead of just serialised JSON
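
To give a flavour of what this looks like from Elixir, here is a minimal sketch of a RAM-replicated table with a synchronised write. The table name, attributes and record shape are assumptions for the example, not our production schema.

     # Illustrative mnesia usage from Elixir via the :mnesia Erlang module.
     # The :org_view table and its record shape are assumptions, not our schema.
     defmodule Orderbook.ViewStore do
       # One-off setup: a RAM-only table replicated on every connected node.
       def setup do
         :mnesia.start()

         :mnesia.create_table(:org_view,
           attributes: [:org_id, :sequence, :view],
           ram_copies: [node() | Node.list()]
         )
       end

       # A synchronised transaction: locks are taken across the cluster, so
       # two nodes cannot race on the same organisation's sequence number.
       def put(org_id, view) do
         :mnesia.sync_transaction(fn ->
           next_seq =
             case :mnesia.read(:org_view, org_id, :write) do
               [{:org_view, ^org_id, seq, _old_view}] -> seq + 1
               [] -> 1
             end

           :mnesia.write({:org_view, org_id, next_seq, view})
           next_seq
         end)
       end
     end

Note that :mnesia.sync_transaction/1 returns {:atomic, result} on success, so callers can pattern match on the outcome.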

JSON Patch (RFC 6902)

Our application uses an operation format called JSON Patch to keep payloads between server and client lightweight and versatile. The server sends a list of patch operations for the client to apply in order to update the grid. Here's an example:

     { "op": "remove", "path": "/a/b/c" },
     { "op": "add", "path": "/a/b/c", "value": [ "foo", "bar" ] },
     { "op": "replace", "path": "/a/b/c", "value": 42 }

Our application therefore must be able to generate these patches. Unfortunately we could not find an off-the-shelf solution to do this for us, so we created our own and released it as a package on Hex, the Elixir package manager.
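
To give a flavour of the problem, here is a deliberately minimal diff for flat maps. A real implementation, including the package we published, must also handle nested objects, arrays and JSON Pointer escaping; this sketch (module name included) is illustrative only.

     # A deliberately minimal JSON Patch diff for flat maps.
     # Real implementations must also handle nesting, arrays and
     # JSON Pointer escaping; this sketch ignores all of that.
     defmodule FlatPatch do
       def diff(old, new) do
         # Keys present in the old map but missing from the new one.
         removes =
           for {key, _value} <- old, not Map.has_key?(new, key) do
             %{"op" => "remove", "path" => "/#{key}"}
           end

         # Keys that are new, or whose value changed.
         adds_and_replaces =
           for {key, value} <- new, Map.fetch(old, key) != {:ok, value} do
             op = if Map.has_key?(old, key), do: "replace", else: "add"
             %{"op" => op, "path" => "/#{key}", "value" => value}
           end

         removes ++ adds_and_replaces
       end
     end

For example, FlatPatch.diff(%{"a" => 1}, %{"a" => 1, "b" => 2}) yields a single add operation for /b.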

Elixir ships with a build tool called Mix and a unit testing tool called ExUnit, which makes creating new packages like this a breeze. Because the tooling is opinionated, the only thing you really need to concentrate on is the business logic.
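
For instance, a couple of tests for the sketch above need nothing beyond what mix new generates:

     # ExUnit ships with the language; `mix test` picks this up with no
     # extra dependencies or configuration.
     defmodule FlatPatchTest do
       use ExUnit.Case

       test "a changed value becomes a replace operation" do
         assert FlatPatch.diff(%{"a" => 1}, %{"a" => 2}) ==
                  [%{"op" => "replace", "path" => "/a", "value" => 2}]
       end

       test "a missing key becomes a remove operation" do
         assert FlatPatch.diff(%{"a" => 1}, %{}) ==
                  [%{"op" => "remove", "path" => "/a"}]
       end
     end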

Conclusion

We're at the start of our Elixir and OTP journey but have already seen a lot of benefits. Using OTP to supervise our application has made it easier to reason about and easier for us to add new services. Phoenix is predictable and easy to work with, and we've found it a perfect fit for real-time applications. mnesia has simplified our application because it has removed the need for us to use redis at all.

You get an awful lot out of the box with Elixir, with mnesia (DBMS), ExUnit (testing framework), Mix (build tool) and Hex (package manager) all coming wrapped up with the language, which allows you to hit the ground running.

Functional languages can be a bit of a daunting prospect if you're used to languages like PHP or Go, but Elixir does a great job of easing you in; you can go a long way without needing to know some of the more complex elements of the language and ecosystem, such as metaprogramming.

We're very pleased with the way our new solution is shaping up. We've solved all of the problems we had and have ended up with a much simpler codebase.

Want a job? Get in touch.

We're currently on the lookout for a software developer. If this article sounds right up your street, drop us a line to arrange a chat.
