This article discusses decentralized data management by exploring some advanced architectural patterns that facilitate decentralized applications. By default, the microservice philosophy favors decentralization in several aspects of software design. The focus here is on the kind of decentralization that guides not how the organization’s business logic is structured, but how data is persisted.
In the traditional, monolithic approach to database management, software designs usually rely on a monolithic data store, such as a SQL server containing a single database with many tables. In this architecture, a centralized database acts as the engine for data persistence. Parts of the application logic may even be offloaded to the SQL server as complex joins, queries, or stored procedures.
Microservice architecture mostly favors decentralized data management.
About REST (Representational State Transfer)
To organize enterprise data in a decentralized structure, it is crucial to understand how data modeling is done using REST, or Representational State Transfer. REST now guides the design of many stateless systems. The fundamental principle of a REST architecture is to expose the resources that are part of the application and to use the standard HTTP verbs to interact with those resources.
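As a minimal sketch of this principle, the snippet below models a single user resource in memory and maps the standard HTTP verbs onto it. The `UserResource` class, its method names, the URL shapes in the docstrings, and the status codes returned are illustrative assumptions, not part of any particular framework.

```python
from typing import Dict, Optional, Tuple


class UserResource:
    """Hypothetical in-memory user resource; a real service would sit behind an HTTP server."""

    def __init__(self) -> None:
        self._users: Dict[int, dict] = {}
        self._next_id = 1

    def post(self, body: dict) -> Tuple[int, Optional[dict]]:
        """POST /users: create a user and return 201 with the stored object."""
        user = {**body, "id": self._next_id}  # the server-assigned id always wins
        self._users[user["id"]] = user
        self._next_id += 1
        return 201, user

    def get(self, user_id: int) -> Tuple[int, Optional[dict]]:
        """GET /users/{id}: return 200 with the user, or 404 if it does not exist."""
        user = self._users.get(user_id)
        return (200, user) if user is not None else (404, None)

    def put(self, user_id: int, body: dict) -> Tuple[int, Optional[dict]]:
        """PUT /users/{id}: replace an existing user, or return 404."""
        if user_id not in self._users:
            return 404, None
        self._users[user_id] = {**body, "id": user_id}
        return 200, self._users[user_id]

    def delete(self, user_id: int) -> Tuple[int, Optional[dict]]:
        """DELETE /users/{id}: remove the user and return 204, or 404 if absent."""
        if self._users.pop(user_id, None) is None:
            return 404, None
        return 204, None
```

Each method keeps no per-client state between calls, so every request carries everything the resource needs to answer it.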
A standard microservices deployment may take a decentralized data management approach in which a different service serves each resource: one service for the user resources, another for the text messaging resources, another for relationships, and so on. Each service may have its own backing database. However, this does not mean that each of these databases needs its own database server. The databases that belong to an application can be logically distinct while all being hosted on a single SQL server.
However, creating such a logical distinction lays the groundwork for quick and easy physical scaling over time. If the platform gains massive adoption, admins can split the logical databases apart and host them on multiple physical servers.
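One way to picture this is a per-service mapping from each logical database to the server that hosts it. The sketch below is an assumption-laden illustration: the host names, database names, and the `move_to_host` helper are all hypothetical. It shows that moving one logical database to its own server changes only that entry.

```python
from dataclasses import dataclass, replace
from typing import Dict


@dataclass(frozen=True)
class DatabaseLocation:
    host: str      # physical server (hypothetical host name)
    database: str  # logical database owned by exactly one service


# Initially every logical database lives on the same physical server.
LOCATIONS: Dict[str, DatabaseLocation] = {
    "users": DatabaseLocation("sql-1.internal", "users_db"),
    "messages": DatabaseLocation("sql-1.internal", "messages_db"),
    "relationships": DatabaseLocation("sql-1.internal", "relationships_db"),
}


def move_to_host(
    locations: Dict[str, DatabaseLocation], service: str, new_host: str
) -> Dict[str, DatabaseLocation]:
    """Return a new mapping in which one service's database lives on another host."""
    updated = dict(locations)
    updated[service] = replace(locations[service], host=new_host)
    return updated
```

For example, `move_to_host(LOCATIONS, "messages", "sql-2.internal")` relocates only the messaging database; the other services keep pointing at the original server.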
Avoiding SQL JOIN – Decentralized Data Management
A significant characteristic of an ideal decentralized data management approach is that it drops the need for SQL JOINs. The need for JOINs often comes from making the API easier for clients to use. Say, for example, that a messaging app has a timeline view. This timeline needs to show the latest message from each user, with the user’s name and image beside the message.
With the basic REST API defined so far, the client has to make several API calls to build this view: a request to get the list of users, two or more requests to get the name and image for each user, and requests to get the messages from each user. This may be unacceptable from a performance viewpoint, as there is too much round-trip latency between client and server before the view is displayed.
An ideal solution is to add a higher-level route to the API. With it, the client can fetch a single timeline resource that contains all the data to be rendered in the timeline view. How that route is implemented is the significant difference between the centralized and decentralized data management approaches.
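To make the cost concrete, here is a rough back-of-the-envelope sketch. It assumes one profile request and one messages request per user, issued one after another; both function names and that request breakdown are assumptions made for illustration.

```python
def naive_request_count(num_users: int) -> int:
    """One request for the user list, then (by assumption) one profile
    request and one messages request for every user."""
    return 1 + 2 * num_users


def sequential_latency_ms(num_requests: int, round_trip_ms: float) -> float:
    """Total waiting time if the requests are issued one after another."""
    return num_requests * round_trip_ms
```

Under these assumptions, a timeline of 20 users needs 41 requests, and at an illustrative 100 ms per round trip the client waits about 4.1 seconds before it can render the view.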
In a monolithic database, this route may be coded as a SQL JOIN offloaded to the database server, which accesses all the tables and generates the result. For SQL database management solutions, you can seek assistance from the relational database services offered by RemoteDBA.com.
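The JOIN-based version of the timeline route might look like the following sketch, which uses Python's built-in `sqlite3` module as a stand-in for the SQL server. The table names, column names, and sample rows are assumptions made for illustration.

```python
import sqlite3

# In-memory database standing in for the monolithic SQL server.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, image_url TEXT);
    CREATE TABLE messages (
        id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES users(id),
        body TEXT,
        sent_at INTEGER
    );
    INSERT INTO users VALUES (1, 'Ada', '/img/ada.png'), (2, 'Grace', '/img/grace.png');
    INSERT INTO messages VALUES
        (1, 1, 'hello', 10), (2, 1, 'hello again', 20), (3, 2, 'hi', 15);
    """
)

# One query joins both tables and keeps only each user's latest message.
TIMELINE_SQL = """
SELECT u.name, u.image_url, m.body
FROM users AS u
JOIN messages AS m ON m.user_id = u.id
WHERE m.sent_at = (SELECT MAX(sent_at) FROM messages WHERE user_id = u.id)
ORDER BY m.sent_at DESC
"""

timeline = conn.execute(TIMELINE_SQL).fetchall()
```

The query works only because both tables live in the same database, which is exactly the coupling that a decentralized design gives up.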
In decentralized data management, a SQL JOIN is not only inadvisable, but it may also be impossible if the data is properly separated by physical boundaries. Instead, the timeline service can request data from the backing microservices in just milliseconds, since the timeline service and the other microservices may be hosted in the same data center or even in containers on the same machine.
To reduce round-trip latency further, the timeline service may also leverage “bulk fetch” endpoints. The user microservice may have a dedicated endpoint that accepts a list of user IDs and returns the matching user objects. With this, the timeline service needs to make only one request to each service to build its output.
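The bulk-fetch idea can be sketched with in-memory stand-ins for the backing services. Everything named here (`UserService`, `MessageService`, `bulk_fetch`, `bulk_latest`, `build_timeline`) is hypothetical; in a real deployment each method call would be a single HTTP request to the corresponding microservice.

```python
from typing import Dict, List


class UserService:
    """Hypothetical stand-in for the user microservice."""

    def __init__(self, users: Dict[int, dict]) -> None:
        self._users = users
        self.calls = 0  # counts simulated network requests

    def bulk_fetch(self, user_ids: List[int]) -> Dict[int, dict]:
        """One request returns every matching user object."""
        self.calls += 1
        return {uid: self._users[uid] for uid in user_ids if uid in self._users}


class MessageService:
    """Hypothetical stand-in for the messaging microservice."""

    def __init__(self, latest: Dict[int, str]) -> None:
        self._latest = latest
        self.calls = 0

    def bulk_latest(self, user_ids: List[int]) -> Dict[int, str]:
        """One request returns the latest message for every listed user."""
        self.calls += 1
        return {uid: self._latest[uid] for uid in user_ids if uid in self._latest}


def build_timeline(
    user_ids: List[int], users: UserService, messages: MessageService
) -> List[dict]:
    """Combine the two bulk responses in application code instead of a SQL JOIN."""
    profiles = users.bulk_fetch(user_ids)
    latest = messages.bulk_latest(user_ids)
    return [
        {
            "name": profiles[uid]["name"],
            "image": profiles[uid]["image"],
            "message": latest[uid],
        }
        for uid in user_ids
        if uid in profiles and uid in latest
    ]
```

However many users appear in the timeline, `build_timeline` issues exactly one request to each backing service.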
In decentralized data management, the timeline service functions as the central point that defines the logic for what a timeline is. If business requirements change, for example in how clients have to display the latest messages from each user, the change can be made quickly in the timeline service without any modifications to the other microservices that host the primary resources.
The way each backing service stores and manipulates its data can also be refactored freely, as long as the service continues to serve the resources expected by the timeline service. The maintainers of each service may also rewrite how relationships are stored without breaking the timeline service. With JOIN queries, however, achieving the same would require reviewing every join against the affected table and updating it whenever the table structure changes.
Side Effects of Decentralized Data Management
The most significant side effect of decentralized data management is the additional need to deal with eventual consistency. In a centralized data store, developers can use transactional capabilities to ensure that data remains consistent across tables. This may not be possible, however, when data is kept separately in different physical or logical databases.
Say, for example, that one user tries to fetch the timeline at exactly the moment another user deletes their account. The timeline service may first fetch a list of user IDs that includes the second user from the relevant service. The second user then deletes the account, and the user object is deleted from the service that stores it.
When the timeline service then tries to turn the ID into user details by requesting them from the relevant service, it receives a 404 Not Found error response. Decentralized data modeling may therefore require conditional handling to detect conditions in which the data changes between two consecutive requests.
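A minimal sketch of such conditional handling follows. The `NotFound` exception stands in for an HTTP 404 response, and `fetch_user` and `resolve_users` are hypothetical helpers; skipping the deleted user is only one possible policy, chosen here for illustration.

```python
from typing import Dict, List


class NotFound(Exception):
    """Stands in for an HTTP 404 Not Found response."""


def fetch_user(store: Dict[int, dict], user_id: int) -> dict:
    """Simulate GET /users/{id} against the user service's current data."""
    try:
        return store[user_id]
    except KeyError:
        raise NotFound(user_id) from None


def resolve_users(store: Dict[int, dict], user_ids: List[int]) -> List[dict]:
    """Turn previously fetched IDs into user objects, tolerating deletions."""
    resolved = []
    for uid in user_ids:
        try:
            resolved.append(fetch_user(store, uid))
        except NotFound:
            # The account was deleted after the ID list was fetched,
            # so leave it out of the timeline instead of failing the request.
            continue
    return resolved
```

With this in place, a deletion that lands between the two requests shortens the timeline by one entry rather than turning the whole response into an error.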
Decentralized data management has its own advantages and disadvantages, and adopting it is best approached by starting with the REST basics. Doing so requires careful handling of the challenges of eventual consistency. You may also leverage polyglot persistence to store each type of data in the storage technology that handles it best.