Using Persistent Storage in Distributed Application Code Should Be Simple

By libor

Distributed systems are by nature very hard to manage and even harder to build correctly. One of the key components that must be done right is the persistent storage subsystem. The main reason it deserves such care is that storage is usually one of the subsystems that limit the entire application's reliability, availability, execution speed and scalability.

As the persistence subsystem has moved to center stage over time, it has prompted several very interesting papers describing various approaches to this area. Among the best known are Google’s BigTable paper and the more recent paper on Amazon’s Dynamo.

If you want a quick summary of the possible approaches to saving data in a distributed system, see Dare Obasanjo’s page, which was written partly to explain how Dynamo works.

While Dare’s article does a very nice job of explaining the pros and cons of each approach, including the one used in Dynamo, it does not really consider additional constraints that the business may impose. The most important ones are the SaaS business model and the need for flexibility in service SLA definitions (i.e. some services in the system might require higher or lower quality characteristics such as reliability, durability, etc.).

What do I actually mean by constraints specific to the SaaS business model? In short, the running software must be able to provide each client with a customized level of reliability, availability and execution speed according to that client’s SLA, even at the persistent storage level. For example, one client company might opt for a simple reliability model where all of its data is stored in a single datacenter, while another client wants its data replicated across, and accessible from, two or more geographical locations. If this is your case, it is impractical to make the application source code explicitly aware of where the stored data lives geographically, or to prepare in code every combination of storage options that might occur. The right solution is therefore to keep the application code as simple as possible and to build the complexity of resolving version conflicts and replication into the DAL or the persistent storage layer only.
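To make the per-client SLA idea a bit more concrete, here is a minimal sketch of how a DAL might look up a storage policy for each client so that application code never has to. The names StoragePolicy, PolicyRegistry, minSyncReplicas and the datacenter labels are hypothetical and chosen only for illustration; nothing here describes a specific product.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical per-client storage policy derived from that client's SLA. */
final class StoragePolicy {
    final List<String> datacenters;   // where this client's data must be stored
    final int minSyncReplicas;        // copies that must be written before a save succeeds

    StoragePolicy(List<String> datacenters, int minSyncReplicas) {
        if (datacenters.isEmpty() || minSyncReplicas < 1
                || minSyncReplicas > datacenters.size()) {
            throw new IllegalArgumentException("policy needs 1..n synchronous replicas");
        }
        this.datacenters = Collections.unmodifiableList(new ArrayList<>(datacenters));
        this.minSyncReplicas = minSyncReplicas;
    }
}

/** Hypothetical registry consulted inside the DAL only. */
final class PolicyRegistry {
    private final Map<String, StoragePolicy> byClient = new HashMap<>();
    private final StoragePolicy defaultPolicy;

    PolicyRegistry(StoragePolicy defaultPolicy) {
        this.defaultPolicy = defaultPolicy;
    }

    void register(String clientId, StoragePolicy policy) {
        byClient.put(clientId, policy);
    }

    /** Returns the client's own policy, or the default when none is registered. */
    StoragePolicy policyFor(String clientId) {
        return byClient.getOrDefault(clientId, defaultPolicy);
    }
}
```

With something like this in place, a client that elected the simple model falls through to a single-datacenter default, while a client with a stricter SLA is registered with two or more locations. The application code calling the DAL stays the same in both cases.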

Let’s return to Dare’s examples and show how one can improve the client and server code to make such distributed storage easy to use. The examples consider two users updating their friend lists by adding each other. Moreover, the two users access their data from very different geographical locations. For scalability reasons the system stores data in different persistent storage locations, usually close to each user (i.e. it uses shards). The main message is that the data for both users will eventually become consistent.

As I have mentioned before, leaving the application developer to deal with all these troubles and compensations when a data transfer fails on save is not a good way forward. It makes the application code complex, repeats the same pattern all over the place, and produces code that is very sensitive to the correctness of its implementation logic. Errors can’t be spotted without a good volume of manual testing or a thorough code inspection.

The storage framework should therefore make it very easy for developers to work with persistent storage without worrying about any manual or in-code data synchronization. Developers should not even need to know where the data is physically saved or what level of resilience it has (both may be governed by the SLA, which typically differs from client to client).

Therefore the application code we aim for in our solution ideally looks like the following snippet (fatal errors in the called methods are handled by throwing an exception to the higher level):

// Saving data no matter what location or resilience level is used.
// User, status and DAL are assumed to be defined elsewhere.
public void confirm_friend_request_D(User user1, User user2) {
    DAL.update_friend_list(user1, user2, status.confirmed);
    DAL.update_friend_list(user2, user1, status.confirmed);
}

// Retrieving data no matter from what location.
// The persistence layer may load-balance reads where possible.
public List<User> get_friends_D(User user1) {
    return DAL.get_friend_list(user1);
}

The persistent storage infrastructure should then take on much more responsibility and work harder to resolve conflicts between versions automatically and to propagate changes from one geographical location to another once the persistent storage becomes available again. What I mean is that the persistent storage subsystem should be aware of the relationships between physical data locations, even between virtual storage nodes, and repair missing or outdated versioned data.

Ideally, the storage subsystem identifies data dependencies, including those on geographical location (as in Dare’s case, where it would identify that the data for each user lives in a different datacenter), and runs a consistency check, repair and merge process.
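As an illustration of the merge step, here is a minimal sketch that reconciles two copies of the same friend list. It assumes each entry carries a monotonically increasing version number and that the higher version wins. That rule is my simplification for the example, not a description of any particular storage product, and VersionedEntry and ReplicaMerge are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

/** One friend-list entry with a version counter (hypothetical format). */
final class VersionedEntry {
    final String status;
    final long version;

    VersionedEntry(String status, long version) {
        this.status = status;
        this.version = version;
    }
}

final class ReplicaMerge {
    private ReplicaMerge() {}

    /**
     * Returns a new map holding, for every key, the entry with the highest
     * version. On a version tie the local entry is kept. Inputs are not modified.
     */
    static Map<String, VersionedEntry> merge(Map<String, VersionedEntry> local,
                                             Map<String, VersionedEntry> remote) {
        Map<String, VersionedEntry> merged = new HashMap<>(local);
        for (Map.Entry<String, VersionedEntry> e : remote.entrySet()) {
            VersionedEntry mine = merged.get(e.getKey());
            if (mine == null || e.getValue().version > mine.version) {
                merged.put(e.getKey(), e.getValue());
            }
        }
        return merged;
    }
}
```

A real system has to handle concurrent updates that a single counter cannot order, which is where techniques such as the vector clocks described in the Dynamo paper come in. The point of the sketch is only that this logic lives in the storage layer and not in application code.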

Storage-driven reconciliation raises the question of which storage component should actively drive it: the one that failed and is back up and running, or some “independent” component that actively polls failed services to detect when they are back in service. While a component with active polling is a possible approach, our experience shows that it is better to leave this activity to the recovered service.

The recovered service can better estimate which data is missing, where to pull or push the missing data from, and finally when recovery is fully done so that it can become fully operational for all business read/write operations. In the typical case in our execution environment (i.e. resilience maintained on a sharded virtual storage node spanning two datacenters), recovery from a short outage (say, a few minutes due to a network failure) completes on a single member node within several minutes at most under full system load. Recovery should be transparent to all client applications and should not have any significant impact on other business transactions that save data. In any case, you can still optionally run reconciliation during off-peak hours or during a maintenance window if the amount of data is huge (e.g. when you join a new node and replicate the entire data set from an active node to the newly joined one).
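The recovery flow driven by the recovered node itself can be sketched roughly as follows. This is a deliberately simplified model and not our production code: it assumes every write carries a globally ordered sequence number, that the peer holds everything the recovering node missed, and that the node only needs to pull. Write, StorageNode and recoverFrom are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** A single stored write, ordered by sequence number (assumed format). */
final class Write {
    final long seq;
    final String key;
    final String value;

    Write(long seq, String key, String value) {
        this.seq = seq;
        this.key = key;
        this.value = value;
    }
}

final class StorageNode {
    enum State { ACTIVE, RECOVERING }

    private final List<Write> log = new ArrayList<>();
    private State state = State.ACTIVE;

    State state() { return state; }

    /** Sequence number of the last write this node holds, or 0 if it holds none. */
    long lastSeq() {
        return log.isEmpty() ? 0 : log.get(log.size() - 1).seq;
    }

    /** Accepts a business write; refused while the node is still catching up. */
    void append(Write w) {
        if (state != State.ACTIVE) {
            throw new IllegalStateException("node is recovering");
        }
        log.add(w);
    }

    /** Returns a copy of all writes with a sequence number greater than seq. */
    List<Write> writesAfter(long seq) {
        List<Write> result = new ArrayList<>();
        for (Write w : log) {
            if (w.seq > seq) {
                result.add(w);
            }
        }
        return result;
    }

    /** Pulls what this node missed from the peer, then reports itself ACTIVE again. */
    int recoverFrom(StorageNode peer) {
        state = State.RECOVERING;
        List<Write> missing = peer.writesAfter(lastSeq());
        log.addAll(missing);
        state = State.ACTIVE;
        return missing.size();
    }
}
```

The recovered node knows its own last sequence number, so it can decide what to ask for and when it is done, which is the advantage over an external poller argued for above. A node joining from scratch simply starts from sequence 0 and pulls everything, which is the case you may prefer to schedule for off-peak hours.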

Selecting the proper storage strategy for a distributed system is a really tricky part. It depends heavily on business requirements and target functionality. I think that as more distributed applications with an SOA backbone appear, we will see more such implementations and definitely a stronger “drive” towards high scalability, reliability and faster execution speed. A few recommended patterns or frameworks that solve today’s issues in a uniform way might even emerge from this environment.
