Techniques: dealing with blocking operations

Game logic often gets split into “time-sensitive” and “blocking” parts. When dealing with I/O-bound components, we often want to let them run separately from the main game loop. This way they won’t block the whole game as they’re waiting for their data.

The typical example is reading from disk or accessing a database. Disk operations are orders of magnitude slower than operations on memory. If the data we need is already loaded and in memory, that’s great, we can use it immediately – but if it isn’t, we’ll have to load it first. And we wouldn’t want to put the whole game on hold while we wait for the drive to seek to the appropriate sector on the disk.

The standard solution is to split the work into multiple threads: the fast main thread, which handles time-sensitive processing, and a worker thread (or a pool of threads), which will handle anything that could potentially block. But how do we implement this?

In an ideal world, we’d just hand over the current continuation to a new lightweight process, operating on immutable data, and that would take care of everything. But ours is not an ideal world; we have to deal with messy details. :)

In practice, there are significant differences of opinion on how to go about farming out slow tasks to asynchronous workers. This will depend greatly on what the language allows, and what kind of bookkeeping we’re willing to live with.

Pool of work unit processors

Some higher-level languages have very nice abstractions for dealing with units of work. At the heart of this abstraction is a dispatcher: we give it a work unit, and it will take care of executing it independently of the main thread. Dispatchers typically manage their own pool of worker threads – each incoming work unit gets assigned to the next available thread, or queued up for future execution if all workers are busy.

An example of a dispatcher in Java is the Executor interface. One might say something like:

  final ProxyObject proxy = ProxyFactory.get(creatureType);
  executor.execute(new Runnable() {
    @Override
    public void run() {
      // this might take a while:
      ObjectData data = db.load(proxy.id);
      proxy.update(data);
    }
  });

The code snippet creates a new anonymous Runnable object with a specific payload (loading data from the database, updating the proxy object with the results). The call to execute() returns immediately – and the work unit will be executed sometime in the future on a separate thread. (Anonymous class instances are a particular feature of Java, of course, but it’s possible to achieve the same effect by passing in callback functions or their equivalents.)

This level of abstraction is very convenient. Creating customized work units wherever we need them lets us keep all game logic together: the work unit code sits right next to the rest of the game logic, even though it will be executed on a different thread, at some other point in time. This makes the code easy to follow.

It’s also easy to chain work units together. For example, suppose we have a multi-step process: read from the database, do some processing on the results, then use that to read something else from the database. It’s very clear how to write a work unit that does the processing, and in turn creates another work unit.
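As a rough illustration of such chaining, here is a minimal sketch using lambdas (Java 8 or later) instead of anonymous classes. The class name, the slowLoad() helper, and the CountDownLatch are all assumptions made for the demo; slowLoad() merely simulates a blocking database read.

```java
import java.util.Locale;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ChainedWorkUnits {
    // Hypothetical stand-in for a slow database read.
    static String slowLoad(String key) {
        try {
            Thread.sleep(20); // simulate blocking I/O
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "data(" + key + ")";
    }

    // Runs the read -> process -> read chain and returns the final result,
    // or null if the chain did not finish within the timeout.
    static String runChain(String firstKey) throws InterruptedException {
        ExecutorService executor = Executors.newFixedThreadPool(2);
        CountDownLatch done = new CountDownLatch(1); // demo only: lets the caller wait
        String[] result = new String[1];

        executor.execute(() -> {
            String first = slowLoad(firstKey);                // step 1: read
            String nextKey = first.toUpperCase(Locale.ROOT);  // step 2: process
            executor.execute(() -> {                          // step 3: a new work unit
                result[0] = slowLoad(nextKey);
                done.countDown();
            });
        });

        boolean finished = done.await(5, TimeUnit.SECONDS);
        executor.shutdown();
        return finished ? result[0] : null;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runChain("goblin"));
    }
}
```

A real game would not block on the latch; it is there only so the demo can print the result. The point is that the second work unit is created from inside the first one, right where the intermediate result becomes available.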

But that’s also a problem: any shared data will now require explicit synchronization, since it will be accessed from other threads at unpredictable points in the future. In the example above, the proxy object will have to be made safe for updates from multiple threads. Ad hoc work units mean we have to pay the cost of making shared data thread-safe.
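One possible way to pay that cost is sketched below. The shape of ProxyObject is not shown in the example above, so the fields and method signatures here are hypothetical; the only point is that writes from the worker and reads from the game thread go through the same lock.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class SafeProxyDemo {
    // Hypothetical proxy; fields are assumptions for illustration.
    static final class ProxyObject {
        private final Object lock = new Object();
        private String name = "";
        private int hitPoints = 0;

        // Called from a worker thread: both fields change under one lock,
        // so a reader never observes a half-updated proxy.
        void update(String newName, int newHitPoints) {
            synchronized (lock) {
                name = newName;
                hitPoints = newHitPoints;
            }
        }

        // Called from the game thread: returns a consistent snapshot.
        String describe() {
            synchronized (lock) {
                return name + ":" + hitPoints;
            }
        }
    }

    // Updates the proxy on a worker thread, waits for it, then reads it back.
    static String loadAndDescribe() throws InterruptedException {
        ProxyObject proxy = new ProxyObject();
        ExecutorService executor = Executors.newSingleThreadExecutor();
        executor.execute(() -> proxy.update("goblin", 12));
        executor.shutdown();
        executor.awaitTermination(5, TimeUnit.SECONDS);
        return proxy.describe();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(loadAndDescribe());
    }
}
```

A single coarse lock is the simplest option; whether it is acceptable depends on how often the game thread reads the proxy.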

Standalone processor

Sometimes sharing work units across threads is unavailable (e.g. because of platform limitations) or undesirable. We can then use a more traditional solution: a dedicated processor that runs on its own thread, accepts work requests, and takes care of everything.

For example, a game thread might ask a resource loader for some resource by its id, and get a proxy back:

  ProxyObject proxy = app.resourceLoader.startLoading(creatureType);

…and that’s it. An empty proxy object is now ready to use, albeit lacking any useful data. At some point in the future, the processor will do all the blocking operations, and process their results: read from the database, lock the proxy, and clobber its contents with updated data. But this will be invisible to the caller. (Async proxies are another interesting abstraction, one deserving its own topic.)
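To make this more concrete, here is a minimal sketch of what such a loader could look like, assuming a single worker thread fed by a queue. None of these classes come from a real library: ResourceLoader, ProxyObject, and pollUntilLoaded() are hypothetical, and the database read is replaced by a string concatenation.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class ResourceLoaderDemo {
    // Hypothetical proxy: empty at first, filled in later by the loader.
    static final class ProxyObject {
        final String id;
        private String data; // null until loaded

        ProxyObject(String id) { this.id = id; }

        synchronized void fill(String newData) { data = newData; }
        synchronized String dataOrNull() { return data; }
    }

    // Hypothetical dedicated processor running on its own thread.
    static final class ResourceLoader {
        private final BlockingQueue<ProxyObject> requests = new LinkedBlockingQueue<>();

        ResourceLoader() {
            Thread worker = new Thread(this::processRequests, "resource-loader");
            worker.setDaemon(true); // do not keep the JVM alive for the demo
            worker.start();
        }

        // Returns immediately with an empty proxy; loading happens later.
        ProxyObject startLoading(String id) {
            ProxyObject proxy = new ProxyObject(id);
            requests.add(proxy);
            return proxy;
        }

        private void processRequests() {
            try {
                while (true) {
                    ProxyObject proxy = requests.take();  // blocks until work arrives
                    String data = "data for " + proxy.id; // stand-in for a database read
                    proxy.fill(data);                     // lock the proxy, overwrite its contents
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    // Demo helper: polls the proxy the way a game loop might check it each frame.
    static String pollUntilLoaded(ProxyObject proxy, long timeoutMillis)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (proxy.dataOrNull() == null && System.currentTimeMillis() < deadline) {
            Thread.sleep(1); // a real game would render a frame here
        }
        return proxy.dataOrNull();
    }

    public static void main(String[] args) throws InterruptedException {
        ResourceLoader loader = new ResourceLoader();
        ProxyObject proxy = loader.startLoading("goblin");
        System.out.println(pollUntilLoaded(proxy, 5000));
    }
}
```

Note that the loader only ever touches the proxies it was handed, which is the access-control property discussed next.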

Performance and access control are the main benefits of this approach. If the processor runs only its own processing code, instead of arbitrary work units, it’s much easier to inspect what it does and tune its performance. As for access control, the loader does not exist in the same “code space” as the game logic, and has no access to local game data. We need to explicitly give it access to anything that it might need. This way it’s easier to keep track of which threads are accessing what kind of data, and see where we need to provide locks or other synchronization mechanisms.

But the processor’s generality is also a problem. Ad hoc work units are very easy to customize – we can do custom processing depending on the surrounding game logic (e.g. a work unit for an agent might do one thing, a work unit for a scenery object might do something else, and so on). A generic processor, however, will not have this kind of game-specific logic.

Since it will be used by all parts of the game (and possibly more than one game), developers will be tempted to treat it as a library, and keep it general, abstract, and reusable. While that’s an excellent approach for a library, it limits what it can do for a specific game.

If customization is desirable, it will be up to the game system to find a way to manipulate the results of asynchronous processing. As a result, client objects (e.g. proxies) may end up inheriting a lot of responsibility for post-processing the results.
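One way this responsibility can end up on the proxy is sketched below, assuming the generic loader delivers raw strings and each proxy carries its own game-specific conversion. The Proxy class and its deliver() method are hypothetical names for illustration only.

```java
import java.util.Locale;
import java.util.function.Function;

public class PostProcessingProxyDemo {
    // Hypothetical proxy that owns the game-specific post-processing step.
    static final class Proxy<T> {
        private final Function<String, T> postProcess;
        private T value; // null until the loader delivers raw data

        Proxy(Function<String, T> postProcess) {
            this.postProcess = postProcess;
        }

        // Called by the generic loader, which knows nothing about T.
        // In this sketch the conversion runs on whichever thread calls deliver().
        synchronized void deliver(String rawData) {
            value = postProcess.apply(rawData);
        }

        synchronized T valueOrNull() {
            return value;
        }
    }

    public static void main(String[] args) {
        // An agent proxy parses a number; a scenery proxy normalizes a name.
        Proxy<Integer> hitPoints = new Proxy<>(raw -> Integer.parseInt(raw.trim()));
        Proxy<String> sceneryName = new Proxy<>(raw -> raw.toUpperCase(Locale.ROOT));

        hitPoints.deliver(" 12 ");
        sceneryName.deliver("oak tree");

        System.out.println(hitPoints.valueOrNull());
        System.out.println(sceneryName.valueOrNull());
    }
}
```

The loader stays generic, but every kind of client object now has to know how to interpret the raw results.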

Separation

These two cases illustrate two popular approaches to dealing with slow, I/O-bound operations:

  1. Having a separate pool of worker threads and ad hoc asynchronous work units
  2. Having a dedicated subsystem that is completely responsible for a limited set of desirable asynchronous operations

The dedicated-system approach is more popular with performance-minded applications, such as console games, because the system’s performance can be more precisely managed. Tight control also allows for certain optimizations, such as pre-fetching data before it’s actually needed.

Ad hoc execution of asynchronous work units affords greater flexibility. Work units are easier to customize than a generic library, and there are no concerns about how exactly to split the work between the library and game code. At the same time, this solution requires a sacrifice of control. It’s more common in applications where it’s acceptable to trade some performance in exchange for readability and customizability.