Christopher Johnson, Oct 2019
When it came to building Stratiam as a robust and scalable system, we realised that some decisions would be best made early on. One of those was how to architect the backbone of a system that needs to import and process billions of lines of data without adversely affecting the experience of users who want to access and view reports based on that data.
In the past it was very common for both functions of a piece of software to be bundled together, even to run on the same server simultaneously. This is what developers call a monolithic application architecture. It has its advantages: it is quicker and easier to build up front with a small team, and it is easy to see everything that is going on at once. However, as it scales up, with more users, more data to process and a bigger team of developers, it becomes slower to operate, slower to develop and ultimately more expensive to keep running.
We needed a better solution from the get go for Stratiam. Whilst it's true that we could have built a monolithic application in a shorter amount of time, we knew immediately that we would quickly run into issues. We knew ahead of time that some of our data providers would be sending us millions, perhaps even billions, of lines of data, and that the resources required to process this could quickly become detrimental to the end user experience. Not only that, but we needed a database solution that could handle huge imports with ease whilst still responding to requests to retrieve that data. We needed separation of concerns, so a slowdown or problem in one part of the application didn't affect others. We needed scalability, so it was possible to increase and decrease our capacity within different parts of the system independently. We needed microservices.
Microservice architecture is based on the concept of loosely coupled, broadly self-contained services. In our case, each of these services is responsible for managing a particular type of data: obtaining it from a data source (for example a network router or a printer), then processing it and inserting it into a database. The system we use is managed in such a way that we can scale capacity up or down on an individual service basis. That means if we had 100 routers that we needed to collect data from, but only one printer, we could automatically increase the number of services collecting router data, whilst running only a single service to collect the printer data. Indeed, if we were only collecting data infrequently, the services could shut down completely in between collection cycles.
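To make the router and printer example concrete, here is a minimal sketch in Python of how a per-service instance count could be worked out. It is an illustration only, not a description of Stratiam's code: the service names and the figure of 10 data sources per instance are assumptions chosen for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class ServiceLoad:
    name: str                  # hypothetical service name
    sources: int               # number of devices this service collects from
    sources_per_instance: int  # assumed capacity of one running instance


def instances_needed(load: ServiceLoad) -> int:
    """Return how many instances of one service are needed for its own load."""
    if load.sources <= 0:
        return 0  # nothing to collect, so the service can shut down completely
    return math.ceil(load.sources / load.sources_per_instance)


loads = [
    ServiceLoad("router-collector", sources=100, sources_per_instance=10),
    ServiceLoad("printer-collector", sources=1, sources_per_instance=10),
]

# Each service is sized independently of the others.
plan = {load.name: instances_needed(load) for load in loads}
print(plan)  # {'router-collector': 10, 'printer-collector': 1}
```

The point of the sketch is that each service is sized from its own workload, so a busy router collector never forces extra printer collectors to run.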
Historically, we would have operated one or more physical servers to collect and process the data. This was expensive to maintain: there were times when the servers would be virtually idle, yet the hardware still needed to be able to handle peaks in demand. Inevitably, this meant an expensive solution that was built and maintained to handle the highest periods of demand, and that required expansion with each new customer or increase in service.
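The cost of sizing hardware for the peak can be shown with some simple arithmetic. The following Python sketch compares the server-hours paid for when capacity is fixed at the peak with the server-hours the workload actually needs hour by hour. The demand figures and the capacity of 10 units of work per server are made-up numbers for illustration, not measurements from our systems.

```python
import math


def fixed_capacity_hours(demand, capacity_per_server):
    """Server-hours paid for when enough servers for the peak run all the time."""
    servers = math.ceil(max(demand) / capacity_per_server)
    return servers * len(demand)


def needed_hours(demand, capacity_per_server):
    """Server-hours the workload actually requires, hour by hour."""
    return sum(math.ceil(d / capacity_per_server) for d in demand)


# Hypothetical units of work arriving in each of eight consecutive hours.
hourly_demand = [0, 0, 5, 40, 100, 20, 0, 0]

paid = fixed_capacity_hours(hourly_demand, capacity_per_server=10)
used = needed_hours(hourly_demand, capacity_per_server=10)

print(paid)         # 80 server-hours paid for
print(used)         # 17 server-hours actually needed
print(paid - used)  # 63 server-hours of idle capacity
```

With these assumed numbers, most of the paid-for capacity sits idle, which is the pattern described above.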
Fortunately, times moved on, and with cloud services we were able to set up an equivalent on virtual machines (VMs), where we could maintain equivalent hardware processing capabilities. Because the new VMs we accessed were not 'real' hardware (they run within a bank of physical machines inside a data centre), we had the ability to increase or decrease the number of servers and the hardware specification with a few mouse clicks. This was good news: it reduced the amount of idle hardware and redundant capacity, but it still did not eliminate them completely. We still faced peak demand moments that defined the minimum hardware specification required for our machines, and it was frustrating to pay for high specification servers during the time that they were idle. We needed a new approach.
Enter containerisation. In short, it is similar to using VMs, but rather than virtualising the hardware, it bundles just the operating environment and the application together, with the goal of being able to run on any host machine without special configuration. A key difference is how a container engages with the underlying hardware: whereas a VM needs to contain an entire operating system (OS) and virtualise access to the physical hardware (through a hypervisor), containers typically share the host's OS kernel and run as isolated processes on top of it, so there is no hypervisor layer between them and the hardware. This isolation means they carry their own libraries and dependencies rather than relying on those installed on the host, but also have no access to the host's critical processes.
How does this help? Because a container does not have to boot a full operating system, a new one can be started in seconds or even milliseconds, rather than the minutes a VM would typically take. This makes containers much quicker and easier to start and stop. Coupled with a management system that can scale up and down based on demand, this begins to form the basis of a solution for our expensive hardware redundancy problem.
Rather than running a few high-end servers (physical or virtual) that have the capacity to deal with periods of peak demand, we have moved towards an auto-scaling system of individual microservice containers. Essentially, every single function we require is packaged into a standalone container, and when demand is high (such as gathering data from 100 routers in our earlier example) the system will automatically launch additional containers to meet it. Crucially, it will also monitor demand and shut down excess containers when it is less busy. Depending on the configuration, it can even scale down to zero containers. This is good news, because we aren't running expensive redundant VMs, and most cloud providers only charge for the time a container is active. It gives us an almost infinitely scalable solution without the waste of idle servers waiting just in case it gets busy.
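The scale-up, scale-down and scale-to-zero behaviour described above can be sketched as a simple loop. This Python example is a simplified model under stated assumptions, not the logic of any particular container platform: the function names, the limit of 20 containers and the figure of 10 jobs per container are all hypothetical.

```python
import math


def desired_containers(pending_jobs, jobs_per_container, max_containers):
    """Containers wanted for the current demand, capped at a configured maximum."""
    if pending_jobs <= 0:
        return 0  # scale to zero when there is nothing to do
    wanted = math.ceil(pending_jobs / jobs_per_container)
    return min(wanted, max_containers)


def simulate(demand_per_tick, jobs_per_container=10, max_containers=20):
    """Walk through successive demand readings and record each scaling decision."""
    running = 0
    history = []
    for pending in demand_per_tick:
        target = desired_containers(pending, jobs_per_container, max_containers)
        if target > running:
            action = "scale up"
        elif target < running:
            action = "scale down"
        else:
            action = "hold"
        running = target
        history.append((pending, running, action))
    return history


# Hypothetical demand readings: quiet, busy, very busy, tailing off, quiet.
for pending, running, action in simulate([0, 100, 250, 30, 0]):
    print(f"pending={pending:3d} containers={running:2d} ({action})")
```

In this model the container count follows demand in both directions and returns to zero once the work is done, so nothing is left running just in case.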
There are some downsides to this approach, although we are happy that they are manageable. For example, containerisation requires compatibility between the container format and the container management platform. On the whole, this is currently limited to Linux-based hosts and applications. Given that we are writing the code running inside the containers, and that the software and libraries we need are widely supported and available on Linux, we don't foresee this as an issue.
The adoption of new technologies also holds some risk, such as whether there are undiscovered vulnerabilities. We are reassured to observe many large players in the market (Google, Microsoft, Amazon et al.) adopting containerisation and microservice-based approaches. Undoubtedly they will be investing huge amounts in its continued security and in patching any vulnerabilities discovered. By utilising cloud-based services we can benefit directly from their support and security, which offers increased protection over managing our own hardware.
A new approach to software design and development also presents new obstacles for developers and designers who may be used to older monolithic application architecture. Not least amongst these is the new challenge of deploying hundreds of tiny standalone applications instead of one large package. However, we don't shy away from these hurdles; we run towards them and practise jumping! As a company we invest heavily in training staff to keep on the cutting edge of new technology. We're always looking for the best talent, and we're never afraid of a hearty team discussion on the best approach to solving a problem.
So yes, there are some potential downsides to microservices and containerised applications, but we're confident that the many upsides and their ability to serve our foreseeable future needs far outweigh these.
We believe that right now microservices are the way forward in highly available, fast scaling and robust application and hardware architecture, and that's why we're building them right into the heart of Stratiam.