Two and a half years ago, in February 2014, I found myself profiling a Spring web application a couple of weeks before the release date. The app exposed REST services and handled every request in a thread of its own, taken from a thread pool, as many servlet containers do. When the QA engineer sent several requests to a specific endpoint, the app ‘choked’ and didn’t respond for several minutes. The endpoint’s code had a synchronization bug that eventually caused all the threads in the pool to block while a single thread performed a heavy computation. Of course, I found the problem and fixed it; however, the scaling issue was still there. It made me wonder whether there was a better alternative to the thread-per-request paradigm.
The answer I found came in the form of the actor model (and its implementation by the Akka framework). In the actor model (as conceived in 1973 by Carl Hewitt et al.), actors are “fundamental units of computation that embody processing, storage and communication”. Every actor has a unique identifier (address) and communicates with other actors via asynchronous messaging.
When an actor receives a message, it may create new actors, send messages to other actors or modify its behavior (or state). How is it different from the regular object-oriented approach? The key is the asynchronous messaging. It allows decoupling between the sender and the receiver.
Let’s take the example above: Actor 1 sends a persist message to Actor 2. If we wrote them as objects, actor1 would call actor2.persist and pass the message to it. That’s fine until Actors 3, 4 and 5 also want to call actor2.persist simultaneously from different threads. Now we need a synchronization mechanism to manage the calls to actor2.persist, and this is exactly the point where things start to get messy. In the actor world, Actors 3, 4 and 5 simply send persist messages to Actor 2. The synchronization problem is solved by another property of the actor model: every actor handles a single message at a time, so Actor 2 never runs persist concurrently. And because the communication is asynchronous, the senders are not blocked waiting for a response and can go on to process other messages in the meantime.
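To make this concrete, here is a minimal sketch in plain Java (no Akka involved) of what "one message at a time" buys us. The class names PersistActor and PersistDemo, the tell method and the message strings are all made up for illustration. The actor's state is an ordinary, non-thread-safe ArrayList, yet three sender threads can post to it concurrently without any locks in our code, because only the actor's own thread ever touches the list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for "Actor 2": it owns its state and is the only code that touches it.
final class PersistActor {
    private final BlockingQueue<String> mailbox = new LinkedBlockingQueue<>();
    private final List<String> persisted = new ArrayList<>(); // deliberately NOT thread-safe
    private final CountDownLatch done;

    PersistActor(int expectedMessages) {
        done = new CountDownLatch(expectedMessages);
        Thread worker = new Thread(this::processMailbox);
        worker.setDaemon(true); // do not keep the JVM alive for the demo
        worker.start();
    }

    // Asynchronous send: enqueue the message and return immediately.
    void tell(String message) {
        mailbox.add(message);
    }

    // Runs on the actor's single thread: messages are handled strictly one at a time.
    private void processMailbox() {
        try {
            while (true) {
                String message = mailbox.take();
                persisted.add(message); // safe without locks: only this thread mutates the list
                done.countDown();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Demo helper: waits until all expected messages were handled, then reports how many.
    int awaitPersistedCount() {
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return persisted.size();
    }
}

final class PersistDemo {
    // Three senders (standing in for Actors 3, 4 and 5) post concurrently to the same actor.
    static int run() {
        int senders = 3;
        int perSender = 1000;
        PersistActor actor2 = new PersistActor(senders * perSender);
        for (int s = 3; s <= 5; s++) {
            final int id = s;
            new Thread(() -> {
                for (int i = 0; i < perSender; i++) {
                    actor2.tell("actor" + id + "-message-" + i);
                }
            }).start();
        }
        return actor2.awaitPersistedCount();
    }
}
```

Calling PersistDemo.run() should report that all 3000 messages were persisted. The only synchronization lives inside the mailbox queue; the persist logic itself stays single-threaded. A dedicated thread per actor is the simplest way to show the idea, not how a real actor system would do it.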
I know what you are thinking: How does this magic happen? Actors don’t exist by themselves; they are created in actor systems. An actor system is responsible for managing the actors so that every actor handles a single message at a time. It is also responsible for translating an actor’s address to its actual instance, which makes communication transparent even for actors that run on different machines in the same actor system.
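As a rough illustration of the address-to-instance translation, here is a toy, single-machine sketch in plain Java. TinyActorSystem, actorOf, tell and the "/user/persister" address are names I invented for this example; they are not Akka's API, and nothing here covers the multi-machine case. Senders only ever hold an address string, and the system looks up the actor behind it and queues the message for that actor.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical toy actor system: maps addresses to actors and delivers messages to them.
final class TinyActorSystem {
    // One cell per actor: its behavior plus a single-threaded executor whose
    // internal queue plays the role of the mailbox (one message at a time, in order).
    private static final class Cell {
        final ExecutorService mailbox = Executors.newSingleThreadExecutor();
        final Consumer<String> behavior;

        Cell(Consumer<String> behavior) {
            this.behavior = behavior;
        }
    }

    private final Map<String, Cell> cells = new ConcurrentHashMap<>();

    // Creates an actor under a unique address.
    void actorOf(String address, Consumer<String> behavior) {
        Cell cell = new Cell(behavior);
        if (cells.putIfAbsent(address, cell) != null) {
            cell.mailbox.shutdown();
            throw new IllegalArgumentException("address already in use: " + address);
        }
    }

    // Resolves the address and enqueues the message; returns false if nobody lives there.
    boolean tell(String address, String message) {
        Cell cell = cells.get(address);
        if (cell == null) {
            return false;
        }
        cell.mailbox.execute(() -> cell.behavior.accept(message));
        return true;
    }

    // Lets every actor finish its queued messages, then stops its thread.
    void shutdown() {
        for (Cell cell : cells.values()) {
            cell.mailbox.shutdown();
            try {
                cell.mailbox.awaitTermination(5, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }
}

final class AddressDemo {
    static List<String> run() {
        TinyActorSystem system = new TinyActorSystem();
        List<String> log = new CopyOnWriteArrayList<>();
        system.actorOf("/user/persister", message -> log.add("persisted " + message));
        system.tell("/user/persister", "order-1");
        system.tell("/user/persister", "order-2");
        system.tell("/user/unknown", "order-3"); // no such address: tell returns false
        system.shutdown();
        return log;
    }
}
```

The point of the indirection is that the sender never needs a reference to the actor object itself. In this sketch the lookup is just a map; an actor system that spans machines could, in principle, resolve the same kind of address to a remote destination without the sender's code changing.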
And one message at a time? How does it work with scale and performance?
The implementation of the actor system uses a mailbox, a queue for every actor, to store the received messages. It then uses a scheduler to decide when the actor will dequeue a message and process it. Since there is no direct connection between actors, several different actors can process their messages at the same time, which lets the actor system optimize scheduling for maximum CPU utilization. The implementation differs from one language to another: some use threads for concurrency (Akka, for example), while others use lightweight processes (Erlang). But the principle is the same.
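The mailbox-plus-scheduler idea can be sketched in plain Java as follows. This is a simplified illustration under my own assumptions, not how Akka is actually implemented; PooledActor, SchedulerDemo and tell are hypothetical names. Each actor has its own queue, and a small shared thread pool plays the scheduler. An atomic flag ensures a given actor is never run by two threads at once, while different actors are free to run in parallel.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

// Hypothetical actor that borrows a pool thread only while it has mail to process.
final class PooledActor<M> implements Runnable {
    private final Queue<M> mailbox = new ConcurrentLinkedQueue<>();
    private final AtomicBoolean scheduled = new AtomicBoolean(false);
    private final ExecutorService scheduler;
    private final Consumer<M> behavior;

    PooledActor(ExecutorService scheduler, Consumer<M> behavior) {
        this.scheduler = scheduler;
        this.behavior = behavior;
    }

    // Asynchronous send: enqueue, then make sure the actor is scheduled.
    void tell(M message) {
        mailbox.add(message);
        trySchedule();
    }

    // Submits the actor to the pool unless it is already submitted or running.
    private void trySchedule() {
        if (!mailbox.isEmpty() && scheduled.compareAndSet(false, true)) {
            scheduler.execute(this);
        }
    }

    // Handles exactly one message, then gives the thread back to the pool.
    @Override
    public void run() {
        try {
            M message = mailbox.poll();
            if (message != null) {
                behavior.accept(message);
            }
        } finally {
            scheduled.set(false);
            trySchedule(); // more mail may have arrived while we were busy
        }
    }
}

final class SchedulerDemo {
    // Four actors share two pool threads; each actor counts the messages it receives.
    static int[] run() {
        int actorCount = 4;
        int messagesPerActor = 500;
        ExecutorService pool = Executors.newFixedThreadPool(2);
        int[] counts = new int[actorCount]; // slot i is only ever written by actor i
        CountDownLatch done = new CountDownLatch(actorCount * messagesPerActor);
        List<PooledActor<Integer>> actors = new ArrayList<>();
        for (int a = 0; a < actorCount; a++) {
            final int index = a;
            PooledActor<Integer> actor = new PooledActor<>(pool, delta -> {
                counts[index] += delta; // no lock needed: one message at a time per actor
                done.countDown();
            });
            actors.add(actor);
        }
        for (int i = 0; i < messagesPerActor; i++) {
            for (PooledActor<Integer> actor : actors) {
                actor.tell(1);
            }
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        pool.shutdown();
        return counts;
    }
}
```

After SchedulerDemo.run() returns, every actor should have counted exactly 500 messages, even though only two threads served four actors. Handling one message per scheduling turn keeps the sketch fair and short; a real scheduler would likely process a batch per turn to reduce overhead.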
• In the actor model, actors exist in actor systems
• Every actor has a unique address
• Actors communicate via asynchronous messaging
• Actors handle a single message at a time
In the next parts of this article, we will introduce you to the Akka toolkit, a great implementation of the actor model, written in Scala.