Ninety-Six Megabytes, Ten Minutes, discussed on Software Engineering Daily


If you only had one replica of RocksDB, that would be enough to serve your queries. But of course, if that database got blown away in a hurricane or was otherwise ruined, you would lose all of your data. So you need to replicate your RocksDB instances, and you need these replicas to maintain consistency among each other. So you have your TiKV data broken up into regions, where a region is a set of contiguous key-value pairs, and each of these regions is replicated some number of times. Describe the replication strategy for these regions of data in a TiKV instance.

Sure. So regions, or you could call them chunks or partitions, are, like you described, how we automatically break down the key-value pairs into smaller chunks. The way we copy them, the way we replicate them, and how we determine the different configurations for copying these pieces of data, to ensure that when disasters happen you still have your data, is very much dictated by our implementation of the Raft consensus protocol. So just a quick primer on the consensus protocol itself. Raft is what is known as a quorum-based consensus protocol, and what that means is that you literally have a group, called a Raft group, and this group consists of the same copy of data living on different machines, and they have to vote for a leader. So one of the copies becomes the so-called leader that actually serves the traffic and interacts with the application layer, in our case the TiDB server and the application layer on top of it. The other copies serve as what are called followers, there to essentially be ready to become the leader if the leader machine somehow goes down, and then there will be a re-election process to elect a new leader on a different machine to do the same thing, so that your service is always up. And because it's an election-based system, the copies of these regions that we make will always be in odd numbers: either three copies, or five copies, or seven copies.
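The quorum arithmetic behind those odd replica counts can be sketched in a few lines. This is generic Raft math, not TiKV's code: a leader needs votes from a strict majority of the group, so an even-sized group pays for an extra copy without tolerating any additional failures.

```python
# Minimal sketch of quorum-based voting arithmetic (generic Raft math,
# not TiKV's implementation).

def majority(replicas: int) -> int:
    """Votes required for a group of `replicas` members to elect a leader."""
    return replicas // 2 + 1

def failures_tolerated(replicas: int) -> int:
    """Members that can be lost while a majority can still be formed."""
    return replicas - majority(replicas)

# 3 replicas -> majority 2, tolerates 1 failure
# 4 replicas -> majority 3, still tolerates only 1 failure
# 5 replicas -> majority 3, tolerates 2 failures
# 7 replicas -> majority 4, tolerates 3 failures
for n in (3, 4, 5, 7):
    print(n, majority(n), failures_tolerated(n))
```

Note that four replicas tolerate no more failures than three while costing an extra copy of the data, which is why quorum-based systems stick to odd group sizes.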
And that's something that you as a user can configure, depending on how available, essentially, you want your data to be. Obviously, more copies give you more availability, more failure recovery, kind of more backups. But that does give you a larger data footprint, so like I mentioned before, everything is kind of a trade-off. If you want more nines, you essentially have to have the capacity to store this data, and some people are okay with it, some people aren't. The default size for these regions currently is 96 megabytes, and that is more of a trial-and-error process: we figured out through all our production usage and testing that this is a more or less optimal size to start with, to balance, again, between the copies, the network traffic needed to make Raft consensus work, and also the hotspots that could form with each of the copies. A hotspot means that one particular region or one particular machine is getting all the traffic for some reason, while the other machines, these other TiKV machines, aren't doing nearly as much work, and then you will suffer performance degradation because those hotspots are forming. And we do have solutions within the system to actually detect these hotspots as they form and then automatically redistribute the copies of the data into different parts of the cluster, so that these hotspots get removed automatically within the system, which is a really nice feature that a lot of users with large-scale deployments really like. We could go very deep into the replication layer, but obviously we only have like ten minutes left or so, right?
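The hotspot-redistribution idea described above can be sketched as a simple scheduling loop. This is a toy illustration, not TiKV's or its placement driver's actual scheduler; the store names, region names, and the `threshold` parameter are all hypothetical. The idea is: aggregate per-region traffic by store, flag stores far above the average load, and move each flagged store's hottest region to the coolest store.

```python
# Toy sketch (not TiKV/PD's actual scheduler) of hotspot detection and
# rebalancing: find stores receiving a disproportionate share of traffic
# and move their hottest region to the least-loaded store.

from collections import defaultdict

def rebalance(region_traffic, region_store, threshold=2.0):
    """region_traffic: region -> requests/sec; region_store: region -> store.
    Mutates region_store and returns a list of (region, from_store, to_store) moves."""
    # Aggregate traffic per store.
    load = defaultdict(float)
    for region, qps in region_traffic.items():
        load[region_store[region]] += qps
    avg = sum(load.values()) / len(load)

    moves = []
    # Visit stores from most to least loaded (snapshot of loads at start).
    for store, total in sorted(load.items(), key=lambda kv: -kv[1]):
        if total > threshold * avg:  # this store is a hotspot
            # Move its hottest region to the currently coolest store.
            hottest = max((r for r in region_traffic if region_store[r] == store),
                          key=lambda r: region_traffic[r])
            coolest = min(load, key=load.get)
            moves.append((hottest, store, coolest))
            load[store] -= region_traffic[hottest]
            load[coolest] += region_traffic[hottest]
            region_store[hottest] = coolest
    return moves
```

For example, if store "a" holds a region taking 100 req/s while every other region sees around 10 req/s, the scheduler moves that hot region off "a" to the least-loaded store. A real system would move replicas or leaders incrementally and damp oscillation, but the detect-then-redistribute loop is the same shape.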