Converting a RESTful Webservice to OpenDDS

Middleware News Brief (MNB) features news and technical information about Open Source middleware technologies.

Introduction

There is no question that the dominant distributed application paradigm is RESTful webservices. However, RESTful webservices have their limitations, and interactions between webservices can sometimes become awkward when services call each other or when one service needs a consistent copy of the data in another service. The Data-Centric Publish Subscribe (DCPS) model of the Data Distribution Service (DDS) may reduce this awkwardness.

In this article, I demonstrate how to translate a RESTful webservice to a DCPS/DDS application. Issues like API evolvability and security will be covered in future articles.

A Multi-player Online Game

           +---------+   +---------+     +---------+     +-----------------------+
           | PlayerA |   |         |---->|         |     |                       |
           | PlayerB |---| Server1 |     |         |****>| Leaderboard/Map/etc.  |
           | PlayerC |   |         |<----|         |     |                       |
           +---------+   +---------+     |         |     +-----------------------+
                                         | Control |
           +---------+   +---------+     |         |     +-------+
           | PlayerD |   |         |---->|         |     |       |
           | PlayerE |---| Server2 |     |         |<----| Admin |
           | PlayerF |   |         |<----|         |     |       |
           +---------+   +---------+     +---------+     +-------+
                                              |
                                              |
                                              |
                                         +---------+
                                         | PlayerA |
                                         | PlayerB |
                                         | PlayerC |
                                         |    .    |
                                         |    .    | Database
                                         |    .    |
                                         | PlayerD |
                                         | PlayerE |
                                         | PlayerF |
                                         +---------+

Imagine an interactive multi-player online game or simulation. Each player establishes a network connection to one of the game servers. In the example above, PlayerB is connected to Server1. The server hosting a player maintains the dynamic state of the player, which it must share with other players.

The server hosting a player is responsible for communicating with the player. To send an update from PlayerB to PlayerE, the message arrives at Server1, Server1 forwards it to Server2, and Server2 forwards it to PlayerE. Other game services may also send messages to a player. Thus, understanding which server is hosting a player is critical to game play.

The game system has a persistent store (database) of player information. The database contains information such as the player's handle, experience, items, etc. The Control Service marries the persistent player data with the dynamic player data and exposes this for other services to consume. Some services, like a real-time leaderboard, map, AI, or summarizer, may consume this data as a stream of updates. Other services, like an administrative console, may consume and manipulate this information in an ad hoc way, e.g., disconnecting an abusive player and adding that player to a deny list.

To frame the discussion about REST and DCPS, we will consider the subset of functionality that is related to player connectivity. That is, consider an extremely simplified game where the only things a player can do are connect, disconnect, and be disconnected by an administrator. A leaderboard shows who is connected, how long they have been connected, and their average connection duration.

RESTful Webservices

           +--------+       +--------+       +----------+
           | Client |------>| Server |------>| Database |
           +--------+       +--------+       +----------+

REST stands for REpresentational State Transfer. As an approach to designing an API and building distributed applications, REST dictates that an application consists of a set of HTTP resources that are manipulated with HTTP methods like GET, PUT, POST, PATCH, and DELETE. A client makes a request to the server indicating the method, resource, and perhaps a new representation of that resource. The server processes the request and indicates a result with a status and possibly a new representation of the resource. The set of methods is small and closed while the set of resources is open. The methods allow the client to create, read, update, and delete resources.

The resources served by a RESTful webservice are often stored in a logically and possibly physically separate system like a database. In this arrangement, the database is responsible for persisting the resources while the server is responsible for the business logic, security, logging, etc. Pushing the management of state into the database allows the server to be stateless, which facilitates horizontal scaling of the server. However, the scalability of systems that incorporate a database is still limited by the database.

In our example, the resource is a connected player. A Server must PUT a resource for the connected player in the Control Service when the player connects and DELETE the same resource when the player disconnects. An administrator can issue a GET to the Control Service to retrieve the connected player resource, augmented with persistent data. An administrator can also DELETE the augmented connected player resource. This DELETE must propagate to the Server hosting the player, which then disconnects the player. Notice that the Server effectively exposes a web hook that the Control Service invokes to perform the DELETE, because the DELETE must be targeted at a specific Server.
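As a rough illustration of these calls, the following sketch models the Control Service and a Server as in-memory Python objects rather than real HTTP endpoints. The class names, method names, and the contents of the persistent-data dictionary are assumptions made for this example; only the status codes follow ordinary HTTP conventions.

```python
class GameServer:
    """Hypothetical game server; on_delete plays the role of the web hook."""

    def __init__(self, server_id):
        self.server_id = server_id
        self.connected = set()

    def on_delete(self, player_id):
        # Invoked by the Control Service when an administrator deletes a player.
        self.connected.discard(player_id)


class ControlService:
    """Hypothetical Control Service holding connected-player resources."""

    def __init__(self, database):
        self.database = database   # persistent player data, keyed by player id
        self.resources = {}        # player id -> connection record
        self.servers = {}          # server id -> GameServer (web hook targets)

    def put(self, player_id, record):
        # PUT is idempotent: repeating it leaves the same resource in place.
        created = player_id not in self.resources
        self.resources[player_id] = dict(record)
        return 201 if created else 200

    def get(self, player_id):
        if player_id not in self.resources:
            return 404, None
        # Augment the dynamic record with the persistent data.
        return 200, {**self.resources[player_id], **self.database.get(player_id, {})}

    def delete(self, player_id, from_admin=False):
        record = self.resources.pop(player_id, None)
        if record is None:
            return 404
        if from_admin:
            # The DELETE must reach the specific Server hosting the player.
            self.servers[record["server_id"]].on_delete(player_id)
        return 204


server1 = GameServer("Server1")
control = ControlService({"PlayerB": {"handle": "bee", "experience": 42}})
control.servers["Server1"] = server1

server1.connected.add("PlayerB")
status_put = control.put("PlayerB", {"server_id": "Server1"})
status_get, augmented = control.get("PlayerB")
status_delete = control.delete("PlayerB", from_admin=True)
```

The Control Service has to keep a reference to every Server in order to route the administrator's DELETE, which is the coupling that the web hook introduces.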

Implementing a separate Leaderboard Service requires implementing publish/subscribe with REST. (Implementing the Leaderboard in the Control Service leans toward a monolithic design, which is often not a good idea.) The Leaderboard could:

  • Poll the Control Service, which must provide the stream of events received from the Server
  • Use a web socket so the Control Service can actually stream the events to the Leaderboard
  • Install a web hook that gets invoked when a player connects or disconnects

Since this interaction lends itself to publish-subscribe, most designs will decouple the Control Service from the Leaderboard using a generic Publish-Subscribe service.

Awkward RESTful Interactions

There are three interactions to consider: Control-Admin, Server-Control, and Control-Leaderboard. From a RESTful perspective, the Control-Admin interaction is straightforward and not problematic.

The bi-directional calls between the Servers and the Control Service are problematic. Essentially, the Servers are attempting to replicate their state to the Control Service, and the Control Service is attempting to replicate its state to the Servers. The main issue is consistency. The Server needs a mechanism to ensure that, despite transient failures, the Control Service has all of the player connection records that the Server maintains. This is done by either 1) having the Server periodically re-PUT all of its player connection records or 2) having the Control Service periodically GET all of the player connection records for a Server. Both designs are inefficient. The same problem has to be solved in the opposite direction; i.e., the Control Service has to synchronize the Servers.
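A minimal sketch of the first option, in which a Server periodically re-PUTs all of its records, might look like the following. The function name, the dictionary layout, and the return value are assumptions made for illustration; a real service would do this over HTTP.

```python
def resync(control_view, server_id, server_records):
    """Make the Control Service's records for one Server match the Server's own.

    control_view:   dict mapping player id -> record (each record has "server_id")
    server_records: dict mapping player id -> record, as held by the Server
    Returns the number of records transferred, which is the cost of one resync.
    """
    # Drop records held for this Server that the Server no longer has.
    stale = [pid for pid, rec in control_view.items()
             if rec["server_id"] == server_id and pid not in server_records]
    for pid in stale:
        del control_view[pid]
    # Re-PUT everything, whether or not it changed.
    for pid, rec in server_records.items():
        control_view[pid] = dict(rec)
    return len(server_records)


# The Control Service missed PlayerC's connect and PlayerA's disconnect.
control_view = {
    "PlayerA": {"server_id": "Server1"},
    "PlayerB": {"server_id": "Server1"},
    "PlayerD": {"server_id": "Server2"},
}
server1_records = {
    "PlayerB": {"server_id": "Server1"},
    "PlayerC": {"server_id": "Server1"},
}
transferred = resync(control_view, "Server1", server1_records)
```

The views converge, but PlayerB's unchanged record is sent again on every cycle, and the cost grows with the number of connected players rather than with the number of changes.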

The streaming nature of the Control-Leaderboard interaction may be problematic depending on the semantics of the Leaderboard. Presumably, the Leaderboard should eventually be consistent with the Servers, which means there must be some kind of synchronization protocol to overcome transient failures. This synchronization protocol will also be used in the event that the Control Service restarts or the Leaderboard restarts. If the Control Service restarts, the old player connection records should eventually expire and new player connection records should be streamed as they are received from the Servers. If the Leaderboard restarts, the Leaderboard must synchronize with the Control Service and then process new player connection records. It may be necessary to republish all of the player connection records when restarting the Leaderboard.

If one had complete control over the system, one might try to redesign the system to avoid these difficulties. For example, one could combine the Server and Control Service or the Control Service and the Leaderboard. However, such a redesign may be impractical or impossible for a variety of reasons, like security, administration, microservices vs. monolith, and performance.

The core problem in both the Server-Control interaction and the Control-Leaderboard interaction is that of reliable data replication. That is, one service wants a complete and up-to-date copy of all or a subset of the resources in another service. Designs limited to RESTful webservices will require additional logic to handle failures, retries, expiration, reconciliation, etc. This adds complexity, which increases cost and makes testing difficult. From a development perspective, a service should describe the resources that exist while a generic protocol does the heavy lifting of transferring the resources and ensuring consistency.

Translating RESTful Webservices to DCPS

The DCPS capabilities of DDS allow developers to write distributed applications using a shared cache model. An entry in the cache is called a sample. A group of logically related samples is called an instance, which is identified by a unique key. Each sample represents a version of that instance. A topic is a set of instances that all have the same type. A DataWriter creates an instance by registering the instance or writing a sample of the instance. A DataReader reads or takes samples from a topic. DataReaders can also receive events when a new sample is available. DataWriters unregister or dispose the instance to remove it. A Publisher is a set of DataWriters, and a Subscriber is a set of DataReaders. A Participant is a set of Publishers and Subscribers.
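To make the vocabulary concrete, here is a toy in-memory model of a topic, its instances, and their samples. It is not the OpenDDS API: it collapses the DataWriter, the DataReader, and the transport into a single hypothetical ToyTopic class, and the method names merely echo the DCPS operations.

```python
class ToyTopic:
    """Toy model of a topic: a set of instances, each a list of samples."""

    def __init__(self, key_field):
        self.key_field = key_field
        self.instances = {}        # key value -> samples, oldest first

    def write(self, sample):
        # Writing a sample creates the instance if it does not exist yet.
        key = sample[self.key_field]
        self.instances.setdefault(key, []).append(dict(sample))

    def read(self, key):
        # Non-destructive: the samples stay in the cache.
        return list(self.instances.get(key, []))

    def take(self, key):
        # Destructive: the samples leave the cache, the instance remains.
        samples = self.instances.get(key, [])
        if key in self.instances:
            self.instances[key] = []
        return samples

    def dispose(self, key):
        # Remove the instance itself.
        self.instances.pop(key, None)


topic = ToyTopic(key_field="guid")
topic.write({"guid": "c-1", "server_id": "Server1"})
topic.write({"guid": "c-1", "server_id": "Server2"})   # a newer version of c-1
topic.write({"guid": "c-2", "server_id": "Server1"})

versions_read = topic.read("c-1")
versions_taken = topic.take("c-1")
after_take = topic.read("c-1")
topic.dispose("c-2")
```

Both writes with the key c-1 land in the same instance as successive versions, while c-2 is a separate instance that disappears when it is disposed.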

To translate a RESTful webservice to DCPS, we will use an analogy that is common to both REST and DCPS: the database. To start the analogy, a row in a database corresponds to a resource in REST, which corresponds to an instance in DCPS. The primary key of the row corresponds to the identity of the REST resource, which corresponds to the key for the instance. The data in the row is the representation of the resource (HTTP) or the sample (DCPS). The basic row operations in a database are create, read, update, and delete (CRUD). The following table shows how CRUD operations translate to HTTP methods for REST and operations in DCPS:

| CRUD         | HTTP           | DCPS               | Description         |
| ------------ | -------------- | ------------------ | ------------------- |
| Create       | PUT/POST       | register/write     | Create a resource   |
| Read         | GET            | read/take          | Retrieve a resource |
| Update       | POST/PUT/PATCH | write              | Update a resource   |
| Delete       | DELETE         | unregister/dispose | Delete a resource   |
 

To translate a RESTful webservice into a DCPS application:

  1. Define a topic for each resource kind.
  2. Define an instance for each resource by registering/writing the instance.
  3. Subscribe to and read/take samples where appropriate.
  4. Unregister/dispose the instance where appropriate.
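The mapping in the table can be sketched as a small dispatcher that applies REST-style requests to a toy cache. Everything here is hypothetical: ToyCache stands in for a DataWriter/DataReader pair on one topic, handle is an illustrative helper, and the status codes are ordinary HTTP conventions rather than anything defined by DDS.

```python
class ToyCache:
    """Toy stand-in for a DataWriter/DataReader pair (not the OpenDDS API)."""

    def __init__(self):
        self.latest = {}           # instance key -> most recent sample

    def write(self, key, sample):
        self.latest[key] = sample

    def read(self, key):
        return self.latest.get(key)

    def dispose(self, key):
        return self.latest.pop(key, None) is not None


def handle(cache, method, key, body=None):
    """Apply one REST-style request to the toy cache; returns (status, body)."""
    if method in ("PUT", "POST"):
        cache.write(key, dict(body))               # create or update -> write
        return 200, None
    if method == "PATCH":
        current = cache.read(key)
        if current is None:
            return 404, None
        cache.write(key, {**current, **body})      # partial update -> write
        return 200, None
    if method == "GET":
        sample = cache.read(key)                   # read
        return (200, sample) if sample is not None else (404, None)
    if method == "DELETE":
        return (204, None) if cache.dispose(key) else (404, None)
    return 405, None


player_connection = ToyCache()
r_put = handle(player_connection, "PUT", "c-1",
               {"player_id": "PlayerB", "server_id": "Server1"})
r_patch = handle(player_connection, "PATCH", "c-1", {"server_id": "Server2"})
r_get = handle(player_connection, "GET", "c-1")
r_delete = handle(player_connection, "DELETE", "c-1")
r_missing = handle(player_connection, "GET", "c-1")
```

Note that PATCH has no direct DCPS counterpart in this sketch: the partial update is merged locally and then written as a complete new sample.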

DDS systems allow DataWriters and DataReaders to use different Quality of Service (QoS) options to achieve different objectives. Three QoS policies that are relevant to the system described above are reliability, durability, and partition.

The reliability QoS can be either best effort or reliable. For the simple game described above, reliable is the correct choice because samples that are lost in transit should be resent rather than silently dropped.

The durability QoS has a variety of options, but the most appropriate one is transient local durability. Transient local durability means that a DataWriter will retain a configurable number of samples per instance so that late-joining DataReaders can receive a copy. This means that the Server will resend its player connection samples if the Control Service restarts. Similarly, the Leaderboard can synchronize with the Control Service. Reliability with durability facilitates eventual consistency in the event of failures, scaling out, and scaling in.
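The effect of transient local durability on a late-joining reader can be modeled in a few lines. This is a toy, not the OpenDDS API: the ToyDataWriter and ToyDataReader classes and the attach method are hypothetical, and the depth parameter stands in for the configurable number of samples retained per instance.

```python
from collections import deque


class ToyDataReader:
    def __init__(self):
        self.received = []         # (instance key, sample) in arrival order


class ToyDataWriter:
    """Toy writer that retains the last `depth` samples per instance."""

    def __init__(self, depth=1):
        self.depth = depth
        self.history = {}          # instance key -> deque of retained samples
        self.readers = []

    def write(self, key, sample):
        # deque(maxlen=depth) silently drops the oldest retained sample.
        self.history.setdefault(key, deque(maxlen=self.depth)).append(sample)
        for reader in self.readers:
            reader.received.append((key, sample))

    def attach(self, reader):
        # A late-joining reader first receives the retained samples.
        for key, samples in self.history.items():
            for sample in samples:
                reader.received.append((key, sample))
        self.readers.append(reader)


server_writer = ToyDataWriter(depth=1)
server_writer.write("c-1", "v1")
server_writer.write("c-2", "v1")
server_writer.write("c-2", "v2")   # replaces the retained sample for c-2

# The Control Service starts (or restarts) after the samples were written.
control_reader = ToyDataReader()
server_writer.attach(control_reader)
server_writer.write("c-3", "v1")
```

With a depth of 1, the restarted reader sees the latest version of each existing instance and then every new sample, which is the state a restarted Control Service needs in order to catch up.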

The partition QoS allows a Topic to be divided into logically separate groups. In the game example, the partition QoS can be used to restrict interactions between a Server and the Control Service to player connection records maintained by the Server.

DCPS Design Sketch

There are two topics in the simplified game: the Player Connection topic and the Augmented Player Connection topic.

The Player Connection topic needs at least the following:

    @topic
    struct PlayerConnection {
        @key string guid;
        string player_id;
        DDS::Timestamp_t connected_since;
        string server_id;
    };

The Player Connection topic is partitioned on the id of the server (server_id). The Server process writes samples on the partition corresponding to its id. The Control Service subscribes to all partitions of this topic by using a wildcard *.
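Partition matching with a wildcard can be approximated with shell-style pattern matching. The sketch below is a simplification that assumes writers use literal partition names and only readers use patterns; partitions_match is a hypothetical helper, and Python's fnmatch module is used as a stand-in for the pattern rules that DDS implementations apply.

```python
from fnmatch import fnmatchcase


def partitions_match(writer_partitions, reader_partitions):
    """Return True if any literal writer partition matches any reader pattern."""
    return any(
        fnmatchcase(name, pattern)
        for name in writer_partitions
        for pattern in reader_partitions
    )


control_sees_server1 = partitions_match(["Server1"], ["*"])
control_sees_server2 = partitions_match(["Server2"], ["*"])
same_partition = partitions_match(["Server1"], ["Server1"])
other_partition = partitions_match(["Server2"], ["Server1"])
```

A reader on the wildcard partition matches every Server's writer, while a reader on a literal partition matches only the writer that uses the same name.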

Upon receiving a sample from the Player Connection topic, the Control Service writes a sample on the Augmented Player Connection topic. The Augmented Player Connection topic includes all members of the Player Connection topic and whatever members are added by the Control Service. Minimally,

    @topic
    struct AugmentedPlayerConnection {
        @key string guid;
        string player_id;
        DDS::Timestamp_t connected_since;
        string server_id;
        boolean force_disconnect;
        float average_connection;
    };

The force_disconnect member is set at the request of the Admin service to indicate that the player should be disconnected. This topic is partitioned on the server id. The Leaderboard subscribes to all partitions of this topic by using a wildcard *. Each Server process subscribes to its subset of this topic by using a partition equal to its id. The Server disconnects players for whom force_disconnect is true.
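The augmentation step performed by the Control Service might be sketched as follows. The Python dataclasses mirror the two IDL structs; representing connected_since as seconds since the epoch, measuring average_connection in seconds, and the past_durations argument are all assumptions made for this example, since the article does not say how the average is derived.

```python
from dataclasses import dataclass, asdict


@dataclass
class PlayerConnection:
    guid: str
    player_id: str
    connected_since: float             # seconds since the epoch (assumed)
    server_id: str


@dataclass
class AugmentedPlayerConnection(PlayerConnection):
    force_disconnect: bool = False
    average_connection: float = 0.0    # seconds (assumed unit)


def augment(sample, past_durations):
    """Build the augmented sample from a Player Connection sample.

    past_durations is a hypothetical list of earlier connection lengths,
    in seconds, that the Control Service looked up in the persistent store.
    """
    average = sum(past_durations) / len(past_durations) if past_durations else 0.0
    return AugmentedPlayerConnection(**asdict(sample), average_connection=average)


sample = PlayerConnection("c-1", "PlayerB", 1000.0, "Server1")
augmented = augment(sample, [600.0, 1200.0])
```

Because the augmented sample keeps the same guid and server_id, it maps to the same key and the same partition as the sample it was derived from.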

When a player connects:

  1. The Server writes a sample on the Player Connection topic.
  2. The Control Service reads/takes the sample.
  3. The Control Service writes a sample on the Augmented Player Connection topic.
  4. The Server and the Leaderboard read/take the sample.

When a player disconnects:

  1. The Server unregisters/disposes the instance on the Player Connection topic.
  2. The Control Service reads/takes the resulting sample, which reports that the instance is no longer alive.
  3. The Control Service unregisters/disposes the corresponding instance on the Augmented Player Connection topic.
  4. The Server and the Leaderboard read/take the resulting sample.

When an administrator disconnects a player:

  1. The Control Service writes a sample on the Augmented Player Connection topic with force_disconnect set to true.
  2. The Server reads/takes the sample.
  3. The Server disconnects the player.
  4. The Server unregisters/disposes the instance on the Player Connection topic.
  5. The Control Service reads/takes the sample.
  6. The Control Service unregisters/disposes the corresponding instance on the Augmented Player Connection topic.
  7. The Server and the Leaderboard read/take the sample.
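The three flows above can be exercised end to end with a toy simulation. The Bus class below is a hypothetical stand-in for DDS that delivers writes and disposes to subscribers in FIFO order; it ignores partitions, QoS, and networking, and none of the class or method names come from OpenDDS.

```python
from collections import deque


class Bus:
    """Toy stand-in for DDS: delivers events to topic subscribers in order."""

    def __init__(self):
        self.subscribers = {}      # topic name -> list of callbacks
        self.queue = deque()
        self.delivering = False

    def subscribe(self, topic, callback):
        self.subscribers.setdefault(topic, []).append(callback)

    def publish(self, topic, kind, sample):
        # Queue the event so that nested publishes are delivered afterwards.
        self.queue.append((topic, kind, sample))
        if self.delivering:
            return
        self.delivering = True
        while self.queue:
            queued_topic, queued_kind, queued_sample = self.queue.popleft()
            for callback in self.subscribers.get(queued_topic, []):
                callback(queued_kind, queued_sample)
        self.delivering = False


class Server:
    def __init__(self, bus):
        self.bus = bus
        self.players = {}          # guid -> player id
        bus.subscribe("AugmentedPlayerConnection", self.on_augmented)

    def connect(self, guid, player_id):
        self.players[guid] = player_id
        self.bus.publish("PlayerConnection", "write",
                         {"guid": guid, "player_id": player_id})

    def disconnect(self, guid):
        if self.players.pop(guid, None) is not None:
            self.bus.publish("PlayerConnection", "dispose", {"guid": guid})

    def on_augmented(self, kind, sample):
        if kind == "write" and sample["force_disconnect"]:
            self.disconnect(sample["guid"])


class ControlService:
    def __init__(self, bus):
        self.bus = bus
        self.records = {}          # guid -> augmented record
        bus.subscribe("PlayerConnection", self.on_connection)

    def on_connection(self, kind, sample):
        guid = sample["guid"]
        if kind == "write":
            self.records[guid] = {**sample, "force_disconnect": False}
            self.bus.publish("AugmentedPlayerConnection", "write", self.records[guid])
        elif self.records.pop(guid, None) is not None:
            self.bus.publish("AugmentedPlayerConnection", "dispose", {"guid": guid})

    def admin_disconnect(self, guid):
        self.records[guid] = {**self.records[guid], "force_disconnect": True}
        self.bus.publish("AugmentedPlayerConnection", "write", self.records[guid])


class Leaderboard:
    def __init__(self, bus):
        self.connected = {}        # guid -> player id
        bus.subscribe("AugmentedPlayerConnection", self.on_augmented)

    def on_augmented(self, kind, sample):
        if kind == "write":
            self.connected[sample["guid"]] = sample["player_id"]
        else:
            self.connected.pop(sample["guid"], None)


bus = Bus()
server = Server(bus)
control = ControlService(bus)
leaderboard = Leaderboard(bus)

server.connect("c-1", "PlayerB")
server.connect("c-2", "PlayerC")
after_connect = dict(leaderboard.connected)
server.disconnect("c-2")             # the player leaves
control.admin_disconnect("c-1")      # an administrator removes the player
```

No component calls another directly: the Server, the Control Service, and the Leaderboard only write, dispose, and react to samples, which is the decoupling the DCPS design is after.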

Conclusion

RESTful webservices translate to DCPS in a straightforward way because both resemble a database. HTTP resources correspond to instances and samples, and HTTP methods correspond to DDS primitives such as register/write, read/take, and unregister/dispose.

Adopting the DCPS model of DDS may be advantageous when one service needs a copy of the resources managed by another service. This may improve latency because the data is already available at the point where it is required.