How a Read Is Done in a Paxos RSM
For the 87th DistSys paper, we looked at "Log-structured Protocols in Delos" by Mahesh Balakrishnan, Chen Shen, Ahmed Jafri, Suyog Mapara, David Geraghty, Jason Flinn, Vidhya Venkat, Ivailo Nedelchev, Santosh Ghosh, Mihir Dharamshi, Jingming Liu, Filip Gruszczynski, Jun Li, Rounak Tibrewal, Ali Zaveri, Rajeev Nagar, Ahmed Yossef, Francois Richard, and Yee Jiun Song. The paper appeared at SOSP'21. This is the second Delos paper; the first one was "Virtual Consensus in Delos," and we covered it in the 40th DistSys meeting.
The first Delos paper looked at building the replication/consensus layer to support strongly consistent use-cases at Facebook. The system created a virtual log made from many log pieces, or Loglets. Each Loglet is independent of other Loglets and may take a different configuration or even a different implementation. Needless to say, all these differences between Loglets are transparent to the applications relying on the virtual log.
With the replication/consensus covered by the virtual log and Loglets, this new paper focuses on creating a modular architecture on top of the virtual log to support different replicated applications with different user requirements. In my mind, the idea of the log-structured protocol expressed in the paper is all about using this universal log in the most modular and clean way possible. One can build a large application interacting with the log (reading and writing) without too many thoughts about design and code reusability. After all, we have been building ad-hoc log-based systems all the time! At the Facebook scale, things are different — it is not ideal for every team/product to reinvent the wheel. Instead, a smarter reusable architecture can go a long way in saving time and money to build better systems.
Anyway, imagine a system that largely communicates through the shared log. Such a system can create new log items or read the existing ones. In a sense, each log item is like a message delivered from one place to one or several other locations. With message transmission handled by the virtual log, the Delos application only needs to handle encoding and decoding these "log-item-messages."
Fortunately, we already have a good idea about encoding and decoding messages while providing services along the encoding/decoding path. Of course, I am thinking of common network stacks, such as TCP, or even more broadly, the OSI model. Delos operates in a very similar fashion, just also with a great focus on the reusability and composability of layers. When a client needs to write some data to the log, it can form its client-specific message and pass it down the stack. Lower layers can perform some additional services, such as ensuring session guarantees or batching the messages. Each layer of the stack wraps the message it received with its own headers and information needed for decoding on the consumer side. Delos calls each layer in the stack an engine. The propagation down through layers continues until the message hits the lowest layer in the stack — the base engine. The task of the base engine is to interact with the log system.
Similarly, when a system reads a message from the log, the log item travels up the stack through all of the same layers/engines, with each engine decoding it and ensuring the properties specific to that engine before passing it up. An imperfect real-world analogy for this process is sending paper mail. First, you write the letter; this is the highest layer/engine, closest to the application. Then you put it in the envelope — now the letter is protected from others seeing it. Then goes the postage stamp — it is ready to be sent, then the mailbox — client batching, then the post office — server-side batching, then transmission, and then the process starts to get undone from the bottom up.
Of course, I oversimplified things a bit here, but such message encapsulation is a pretty straightforward abstraction to use. Delos uses it to implement custom Replicated State Machines (RSMs) with different functionality. Building these RSMs requires a bit more functionality than just pushing messages up and down the engines. Luckily, Delos provides a more extensive API with the required features. For example, all engines have access to shared local storage. Also, moving items up or down the stack is not done in a fire-and-forget manner, as responses can flow between engines to know when the call/request gets completed. Furthermore, it is possible to have more than one engine sitting at the same layer. For example, one engine can be responsible for serializing the data and pushing it down, while another engine can receive the item from an engine below, deserialize it, and apply it to the RSM. The figure illustrates these capabilities.
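To make the encapsulation idea concrete, here is a minimal sketch of engine stacking. This is not the actual Delos API; the `Engine` class, its `propose` method, and the header format are all my own invented stand-ins, just to show one header being added per layer on the way down and peeled off on the way up.

```python
# Hypothetical engine stack: each engine wraps an outgoing payload with
# its own header on the way down; decoding peels headers off, base first.

class Engine:
    def __init__(self, name, lower=None):
        self.name = name
        self.lower = lower  # next engine down the stack (None = base engine)

    def propose(self, payload):
        # Wrap the payload with this engine's header, then push it down.
        wrapped = {"hdr": self.name, "body": payload}
        if self.lower is None:
            return wrapped  # base engine: hand the item off to the shared log
        return self.lower.propose(wrapped)

def decode(item, top):
    # Walk the stack bottom-up, peeling one header per engine.
    engines = []
    e = top
    while e is not None:
        engines.append(e)
        e = e.lower
    for eng in reversed(engines):  # base engine first, app engine last
        assert item["hdr"] == eng.name, "header/layer mismatch"
        item = item["body"]
    return item

# Build a tiny stack: app -> batching -> base.
base = Engine("base")
batching = Engine("batching", lower=base)
app = Engine("app", lower=batching)

log_item = app.propose("write x=1")   # wrapped once per layer
assert log_item["hdr"] == "base"      # outermost header is the base engine's
assert decode(log_item, app) == "write x=1"
```

The round trip through `decode` is the paper-mail analogy in code: the consumer side undoes the layers in the opposite order they were applied.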
This tiered modular approach allows Delos to reuse layers across applications or even compose some layers in a different order. So when one application or use-case needs batching, that engineering team does not need to reinvent batching from scratch. Instead, they can take an existing batching layer and add it to their stack in the appropriate location. The flexibility and reusability allowed Facebook engineers to implement two different control-plane databases with Delos. One datastore, called DelosTable, uses a table API, while another system, called Zelos, implements a ZooKeeper-compatible service.
I think I will end my summary here. The paper goes into more detail about the history of the project and the rationale for making certain decisions. It also describes many practical aspects of using Delos. The main lesson I learned from this paper is about the overall modular layered design of large RSM-based systems. I think we all know the benefits but may often step away from following a clean, modular design as the projects get bigger and pressure builds up to deliver faster. But then, what do I know about production-grade systems in academia? Nevertheless, I'd love to see a follow-up paper when more systems are built using Delos.
As usual, we had our presentation. This time Micah Lerner delivered a concise but very nice explanation of the paper:
Discussion.
1) Architecture. This paper presents a clean and modular architecture. I do not think there is anything majorly new & novel here, so I view this paper more like an experience study on the benefits of good design at a large company. I think there is quite a bit of educational value in this paper.
In the group, we also discussed the possibility of applying similar modular approaches to more traditional systems. For instance, we looked at MongoDB Raft in the group before. Nothing should preclude a similar design based on a Raft-provided log in a distributed database. In fact, similar benefits can be achieved — multiple client APIs, optional and configurable batching functionality, etc. That being said, for a system designed with one purpose, it is easy to start designing layers/modules that are more coupled/interconnected and dependent on each other.
We had another similar observation in the group: someone mentioned a somewhat similarly designed internal application a while back, but again with a less clear separation between modules/layers.
2) Performance. A performance impact is a natural question to wonder about in such a layered/modular system. The paper spends a bit of time in the evaluation explaining and showing how layers and propagation between them add very little overhead. What is not clear is whether a less generic, more purpose-built solution could have been faster. This is a tough question to answer, as comparing different architectures is not trivial — sometimes it can be hard to tell whether the difference comes from design choices or implementation differences and nuances.
3) Read cost & Virtual Log. This part of the discussion goes back quite a bit to the first Delos paper. With a native Loglet, Delos assumes quorum-based operation for Loglets, which may have less than ideal read performance. This is because the NativeLoglet uses a sequencer to write, but relies on a quorum read and waits for reads with the checkTail operation. So a client will read from the quorum, and assuming the Loglet is not sealed (i.e., closed for writes), the client must wait for its knowledge of the globalTail (i.e., the globally committed slot) to catch up with the highest locally committed slot it observed earlier. This process is similar to a PQR read! Naturally, it may have higher read latency, which will largely depend on how fast the client's knowledge of the globally committed slot catches up. In the PQR paper, we also describe a few optimizations to cut down on latency, and I wonder if they can apply here.
Moreover, a client does not need to perform an expensive operation for all reads — if a client is reading something in the past known to exist, it can use the cheaper readNext API, practically allowing a local read from its collocated LogServer.
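The checkTail waiting rule described above can be sketched as follows. The function name and response shape are my own (this is not the Loglet API): the client takes the highest locally committed slot seen in a quorum of tail responses, and the read can only complete once the known globalTail has caught up to that slot.

```python
# Sketch of the quorum-read idea behind checkTail: read tail positions
# from a quorum and wait for the globalTail to reach the highest local
# slot observed, before serving the read.

def check_tail(replica_tails, quorum_size):
    """replica_tails: list of (local_tail, known_global_tail) per replica."""
    responses = replica_tails[:quorum_size]          # any quorum suffices
    target = max(local for local, _ in responses)    # highest local slot seen
    global_tail = max(g for _, g in responses)       # freshest commit knowledge
    # The client must wait until global_tail >= target to read slot `target`.
    return target, global_tail >= target

# Five replicas reporting (local_tail, known_global_tail):
tails = [(7, 7), (9, 7), (8, 7), (9, 8), (7, 7)]
target, ready = check_tail(tails, quorum_size=3)
assert (target, ready) == (9, False)  # must wait for globalTail to reach 9
```

The wait ends as soon as any replica reports a `known_global_tail` at or past the target, which is exactly the latency cost the discussion point is about.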
4) Engineering costs. This discussion stemmed from the performance discussion. While big companies care a lot about performance and efficiency (even a fraction of a percent of CPU usage means a lot of money!), the engineering costs matter a lot too. Not having to redo the same things for different products in different teams can translate into a lot of engineering savings! Not to mention this can allow engineers to focus on new features instead of re-writing the same things all over again. Another point is maintenance — cleaner and better-designed systems will likely be cheaper to maintain as well.
5) Non-control-plane applications? The paper talks about using Delos in two control-plane applications. These often have some more specific and unique requirements, such as zero external dependencies and stronger consistency. The paper also mentions other possible control-plane use cases, so it does not appear that Delos is done here.
At the same time, we were wondering if/how/when Delos can be used outside of the control plane. For Facebook, there may not be too much need for strongly consistent replication for many of their user-facing apps. In fact, it seems like read-your-writes consistency is enough for Facebook, so deploying Delos may not be needed. At the same time, user-facing apps can take on more dependencies and external dependencies, achieving some code reuse this way.
Another point made during the discussion is about making a more general and flexible replication framework that can support strongly consistent cases as well as higher-throughput, weaker-consistency applications. We would not be surprised if Delos or its successors one day support at least some stronger-consistency user-facing applications.
Reading Group
Our reading group takes place over Zoom every Wednesday at 2:00 pm EST. We have a Slack group where we post papers, hold discussions, and most importantly manage Zoom invites to the papers. Please join the Slack group to get involved!
We covered yet another state machine replication (SMR) paper in our reading group: "Rabia: Simplifying State-Machine Replication Through Randomization" by Haochen Pan, Jesse Tuglu, Neo Zhou, Tianshu Wang, Yicheng Shen, Xiong Zheng, Joseph Tassarotti, Lewis Tseng, and Roberto Palmieri. This paper appeared at SOSP'21.
A traditional SMR approach, based on Raft or Multi-Paxos protocols, involves a stable leader to order operations and drive the replication to the remaining nodes in the cluster. Rabia is very different, as it uses a clever combination of determinism to independently order requests at all nodes and a binary consensus protocol to check whether the replicas agree on the next request to commit in the system.
Rabia assumes a standard crash fault tolerance (CFT) model, with up to f node failures in a 2f+1 cluster. Each node maintains a log of requests, and the requests execute in the log's order. The log may contain a NO-OP instead of a request.
When a client sends a request to some node, the node will first retransmit this request to the other nodes in the cluster. Upon receiving the request, a node puts it in a min priority queue of pending requests. Rabia uses this priority queue (PQ) to independently and deterministically order pending requests at each node, such that the oldest request is at the head of the queue. The idea is that if all nodes have received the same set of requests, they will have identical PQs.
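The deterministic ordering can be sketched with a min-heap keyed by (timestamp, client id); the exact key Rabia uses is a detail I am glossing over, but the point is that identical request sets yield identical queue heads on every node:

```python
# Sketch of Rabia-style deterministic request ordering with a min-heap.
import heapq

def make_pq(requests):
    # requests: iterable of (timestamp, client_id, operation)
    pq = []
    for ts, client, op in requests:
        heapq.heappush(pq, ((ts, client), op))
    return pq

# The same set of requests arriving in different network orders...
a = make_pq([(3, "c1", "put x"), (1, "c2", "put y"), (2, "c1", "del z")])
b = make_pq([(2, "c1", "del z"), (3, "c1", "put x"), (1, "c2", "put y")])

# ...produces the same head-of-queue request on every node.
assert a[0] == b[0] == ((1, "c2"), "put y")
```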
At some later point in time, the second phase of Rabia begins — the authors call it Weak-MVC (Weak Multi-Valued Consensus). Weak-MVC itself is broken down into two stages: a propose phase and a randomized consensus phase. In the propose phase, nodes exchange the request at the head of the PQ along with the log's next sequence number seq. This phase allows the nodes to see the state of the cluster and prep for the binary consensus. If a node sees a majority of the cluster proposing the same request for the same sequence number, then the node sets its state to 1. Otherwise, the node assumes a state of 0.
At this point, the binary consensus begins to essentially certify one of two options. The first option is that a majority of nodes want to put the same request in the same sequence number (i.e., the state of 1). The second option is to certify that there is no such common request in the majority (state of 0). For binary consensus, Rabia uses a modified Ben-Or algorithm. Ben-Or consists of two rounds that may repeat multiple times.
In round-1, nodes exchange their state and compute a vote to be either 0 or 1. The vote corresponds to the majority of state values received, so if a node received enough messages to indicate that a majority of nodes are in state 1, then the node will take on the vote value of 1. Similarly, if a majority has state 0, then the node will vote 0. If no majority is reached for either state, the vote is nil.
Round-2 of Ben-Or exchanges votes between nodes. If a majority of nodes agree on the same non-nil vote, the protocol can terminate. Termination means that the system has agreed to certify the request from the proposal if the consensus value is 1 or to create a NO-OP if the value is 0.
In an ideal situation, all participating nodes will have the same request at the head of their PQs when the propose phase starts. This means that the nodes will have the same state at the end of the propose phase, allowing the binary consensus to certify the proposed request at its sequence number in just one round trip (round-1 + round-2 of Ben-Or). So the request distribution + proposal + Ben-Or consensus under such an ideal case only takes four message exchanges, or two RTTs. It is way longer than Multi-Paxos' ideal case of one RTT between the leader and the followers, but Rabia avoids having a single leader.
A less ideal situation arises when no majority quorum has the same request at the head of the PQ when the proposal starts. Such a case, for example, may happen when the proposal starts before the request has had a chance to replicate from the receiving node to the rest of the cluster. In this case, binary consensus may reach agreement on state 0 to not certify any operation in that particular sequence number, essentially producing a NO-OP. The authors argue that this NO-OP is a good thing, as it gives enough time for the in-flight requests to reach all the nodes in the cluster and get ordered in their respective PQs. As a result, the system will propose the same request in the next sequence number after the NO-OP. Both of these situations constitute a fast path for Ben-Or, as it terminates in just one iteration (of course, the latter situation does not commit a request, at least not until the retry at a higher sequence number).
Now, it is worth pointing out that the fast path of one RTT for binary consensus is not always possible, especially in the light of failures. If too many nodes have a nil vote, the protocol will not have enough votes agreeing for either state (1 – certify the request, 0 – create a NO-OP), and the Ben-Or process must repeat. In fact, the Ben-Or protocol can repeat voting many times with some random coin flips in between the iterations to "jolt" the cluster into a decision. For more information on Ben-Or, Murat's blog provides ample details. This jolt is the randomized consensus part. The authors, however, replaced the random coin flip at each node with a random, but deterministic coin flip so that each node has the same coin flip value for each iteration. Moreover, the coin flip is only needed at a node if there is no vote received from other nodes in round-1 of Ben-Or; otherwise, the node can assume the state of the vote it received. The whole process can repeat multiple times, so it may not be very fast to terminate.
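Here is a toy, single-process sketch of one Weak-MVC iteration as described above. This is a heavy simplification (function names are mine, and a real run has asymmetric message delivery and failures), but it shows the two rounds and the shared deterministic coin:

```python
# Toy model of one Ben-Or iteration with Rabia's deterministic coin.
import random

def majority_value(values, n):
    # Return the non-nil value seen in a strict majority of n, else None.
    for v in set(values):
        if v is not None and values.count(v) > n // 2:
            return v
    return None

def round1(states_seen, n):
    # Each node votes for the majority state it observed, or nil (None).
    return majority_value(states_seen, n)

def round2(votes, n, seed):
    # Decide on a majority non-nil vote; otherwise fall back to the coin.
    decision = majority_value(votes, n)
    if decision is not None:
        return decision, True  # terminated: 1 = certify request, 0 = NO-OP
    # Deterministic coin: seeding with the iteration number means every
    # node computes the same "random" flip, unlike classic Ben-Or.
    return random.Random(seed).randint(0, 1), False

n = 5
# Fast path: every node saw all five nodes in state 1, so all vote 1.
votes = [round1([1, 1, 1, 1, 1], n)] * n
assert round2(votes, n, seed=0) == (1, True)

# Slow path: nil votes force another iteration, via identical coin flips.
flip_a, done_a = round2([None] * n, n, seed=7)
flip_b, done_b = round2([None] * n, n, seed=7)
assert (flip_a, done_a) == (flip_b, done_b) and done_a is False
```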
The paper provides more details on the protocol. Additionally, the authors have proved the safety and liveness of the protocol using Coq and Ivy.
The big question is whether we need this more complicated protocol if solutions like Multi-Paxos or Raft work well. The paper argues that Raft and Paxos get more complicated when the leader fails, requiring a leader election, which does not happen in Rabia. Moreover, Paxos/Raft-family solutions also require too many additional things to be bolted on, such as log pruning/compaction, reconfigurations, etc. The argument for Rabia is that all these extra components are easier to implement in Rabia.
Quite frankly, I have problems with these claims, and the paper does not really go deep enough into these topics to convince. For example, one argument is that leader election is complicated and takes time. Surely, not having a leader avoids the leader election and the performance penalty associated with it. But this does not come for free in Rabia. The entire protocol has more messages and rounds in the common case than Multi-Paxos or Raft, so it kind of shifts the complexity and cost of a once-in-a-blue-moon leader election to having more code and more communication in the common case. I am not sure it is a good tradeoff. The performance claim of slow leader elections in Paxos/Raft is also shaky — failures in Rabia can derail the protocol off the fast path. I am not sure whether the impact of operating under failures is comparable with leader-election overheads upon leader failures, and I hope Rabia may have a point here, but the paper provides no evaluation for any kind of failure cases.
And speaking of evaluations, this was the biggest disappointment for me. The authors claim Rabia compares in performance to Multi-Paxos and EPaxos in three- and five-node clusters, with three nodes in the same availability zone allowing Rabia to outperform EPaxos. In fact, the figure below shows that Rabia beats Multi-Paxos all the time.
But there are a ton of assumptions and tweaking going on to get these results. For example, Rabia needs enough time to replicate client requests before starting the propose phase to have a good chance of completing on the fast path. So the testing is done with two types of batching to create the needed delay. The figure mentions the client batching. However, there is also a much more extensive server-side batching, which is mentioned only in the text. Of course, there is nothing wrong with batching, and it is widely used in systems. In all fairness, the paper provides a table with no-batching results, where Multi-Paxos outperforms Rabia fivefold.
The biggest issue is the lack of testing under less-favorable conditions. No evaluation/testing under failures. No testing when the network is degraded and does not operate under the timing conditions expected by the protocol. These issues impact real performance and may create reliability problems. For example, a network degradation may cause Rabia to miss the fast path and consume more resources, reducing its maximum processing capacity. Such a scenario can act as a powerful trigger for a metastable failure.
As usual, we had a nice presentation of the paper in the reading group. Karolis Petrauskas described the paper in great detail:
Discussion.
1) Evaluation. I have already talked about the evaluation concerns, and this was one of the discussion topics I brought up during the meeting.
2) Use of Ben-Or. Ben-Or is an elegant protocol to reach binary consensus, which is not usually useful for solving state machine replication. Traditionally, Multi-Paxos or Raft agree on a value/command and its sequence number, so they need a bit more than just a yes/no agreement. However, Rabia transforms the problem into a series of such yes/no agreements by removing replication and ordering from consensus and doing them a priori. With deterministic timestamp ordering of requests, Rabia just needs to wait for the operation to be on all/most nodes and agree to commit it at the next sequence number. So the consensus is no longer reached on a value and an order, but on whether to commit some command at some sequence number.
3) Practicality. The evaluation suggests that the approach can outperform Multi-Paxos and EPaxos, but whether it is practical remains to be seen. For one, it is important to see how the solution behaves under less ideal conditions. Second, it is also important to see how efficient it is in terms of resource consumption. EPaxos is not efficient despite being fast. The additional message exchanges over Multi-Paxos, Raft, and even EPaxos may cost Rabia on the efficiency side.
4) Algorithms. The paper provides some nice algorithms that illustrate how the protocol works. However, some of the conditions are unnecessarily confusing. In the same algorithm, the authors use f+1, n-f, and floor(n/2)+1 to designate the majority in an n=2f+1 cluster. Please proofread your algorithms — a bit of consistency can improve readability a lot!
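For what it's worth, the three majority expressions are interchangeable in an n = 2f + 1 cluster, which is exactly why mixing them adds noise without adding meaning:

```python
# All three spellings of "majority" coincide when n = 2f + 1:
# f + 1, n - f, and floor(n/2) + 1.
for f in range(1, 1000):
    n = 2 * f + 1
    assert f + 1 == n - f == n // 2 + 1
```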
Our last reading group meeting was about storage faults in state machine replication. We looked at the "Protocol-Aware Recovery for Consensus-Based Storage" paper from FAST'18.
The paper explores an interesting omission in most state machine replication (SMR) protocols. These protocols, such as (multi-)Paxos and Raft, are specified with the assumption of having a crash-resistant disk to write the operation log and voting metadata. This disk data allows crashed nodes to restart safely. However, real life gets in the way a bit, as infallible storage is as real as unicorns.
Storage may fail in peculiar ways, where some data gets corrupted while most other data is correct and the server itself continues working. The problem here is handling such failures. The simplest way is to treat the server as crashed. However, the server must then remain crashed, as restarting may lead to even more severe state corruption when the server replays the operations from a faulty log. The paper talks about a variety of other approaches taken to deal with these data issues. The authors state that all the mechanisms they have explored were faulty and led to liveness or safety issues. I personally do not buy such a blanket statement, but a few of the examples in the paper were really interesting.
The paper then suggests a solution — Protocol-Aware Recovery (PAR). The main point here is to avoid ad-hoc solutions because they are either slow, unsafe, complicated, or all of the above. This makes sense, since such a big omission (the potential for data-corrupting disk failures) in protocols should be addressed at the protocol level. The paper draws heavily on the Raft state machine protocol and develops the recovery procedure for it.
The log recovery is leader-based and can be broken down into two sub-protocols: follower recovery and leader recovery. The followers are recovered by restoring the data from the leader, who always knows all of the committed history. Leader recovery is a bit more tricky and occurs as part of a leader election. Of course, if a non-faulty node can be elected leader, then recovering faulty nodes is easy with the follower recovery. However, the leader election requires a node to have the most up-to-date log to become a leader, limiting the choice of nodes for the job. That being said, a node can be elected with a corrupted log, but it needs to recover the corrupted entries from the followers. If an entry is not available on any of the followers, the state machine becomes stuck (as it should). The protocol only recovers committed log entries and follows Raft logic to discard a non-committed log suffix if it has corrupted entries.
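The follower side of this can be sketched as checksum-based corruption detection plus re-fetching committed entries from the leader. This is a simplification of PAR's actual entry format and RPCs; the CRC-per-entry layout and function names here are my own:

```python
# Sketch of follower recovery: detect corrupted log entries via a stored
# checksum and repair them from the leader's copy (safe for committed
# entries, since the leader has the full committed history).
import zlib

def entry(data):
    return {"data": data, "crc": zlib.crc32(data)}

def recover_follower(follower_log, leader_log):
    for i, e in enumerate(follower_log):
        if e is None or zlib.crc32(e["data"]) != e["crc"]:
            follower_log[i] = leader_log[i]  # repair from the leader
    return follower_log

leader = [entry(b"op1"), entry(b"op2"), entry(b"op3")]
# Slot 1 on the follower is corrupted: data no longer matches its checksum.
follower = [entry(b"op1"), {"data": b"opX", "crc": 123}, entry(b"op3")]

recovered = recover_follower(follower, leader)
assert [e["data"] for e in recovered] == [b"op1", b"op2", b"op3"]
```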
In addition to log recovery, the paper also talks about snapshot recovery. The idea behind snapshot recovery is to make sure all nodes have the same snapshots at the same index in the log, break them into "chunks," and recover chunks as needed from other nodes.
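A sketch of the chunk idea, assuming simple fixed-size chunks and hash comparison (the paper's actual chunking and transfer protocol are more involved):

```python
# Sketch of chunk-based snapshot recovery: split identical-index
# snapshots into fixed-size chunks, compare chunk hashes, and fetch only
# the mismatched chunks from a peer.
import hashlib

CHUNK = 4

def chunks(snapshot):
    return [snapshot[i:i + CHUNK] for i in range(0, len(snapshot), CHUNK)]

def recover_snapshot(local, peer):
    repaired, fetched = [], 0
    for mine, theirs in zip(chunks(local), chunks(peer)):
        if hashlib.sha256(mine).digest() != hashlib.sha256(theirs).digest():
            repaired.append(theirs)  # fetch the good chunk from the peer
            fetched += 1
        else:
            repaired.append(mine)    # local chunk is fine, no transfer
    return b"".join(repaired), fetched

good = b"AAAABBBBCCCCDDDD"
bad  = b"AAAABXBBCCCCDDDD"  # one corrupted byte in the second chunk

repaired, fetched = recover_snapshot(bad, good)
assert repaired == good and fetched == 1  # only one chunk crossed the wire
```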
Here is the presentation by Rohan Puri:
Discussion
1) The need for logs? The paper assumes that a state machine takes periodic snapshots to a disk/drive, and such a snapshot in combination with a log can be used for node recovery later. This implies that the actual current state of the state machine can be lost due to a server restart. However, some state machines are directly backed by the disk, in essence representing a rolling snapshot that gets updated every time an operation from the log applies. Recovery of such a disk-backed state machine can be quicker and requires only the log entries from after the crash/restart. Of course, this does not mean that the disk-backed state machine itself cannot be corrupted. In any case, the log entries are required for recovery and can be garbage collected once all nodes have persisted the state machine to disk (either as part of normal operation or a snapshot), making the time frame for the log entries to remain useful relatively small.
A more interesting problem may arise in trying to recover the corrupted state machine. If we rely on this "rolling-snapshot" disk-backed state machine, the mechanism the paper uses for snapshot recovery won't work, since different copies of the state machine may be misaligned ever-so-slightly. Of course, one can always do the costly node restore procedure — restore to some prior snapshot and replay the log, but this is wasteful and requires keeping an extra snapshot and the log from the snapshot onwards. In the spirit of the paper, we should rely on distributed copies instead and be able to repair the corruption without relying on storing redundant copies on the same server.
2) Persistent memory vs RAM and recovery for in-memory SMR. If we build state machine replication (SMR) to work purely off RAM, then we do not have the luxury of retaining any state after a restart. As such, in-memory state machines must have different mechanisms to ensure safety. For example, in traditional Multi-Paxos with a disk, a node always remembers the current term/ballot and the past votes it has participated in. Without durable memory, a node restart erases the previous voting state, allowing the node to vote on something it has already voted on before, but with a lower term/ballot. This is not safe and may lead to a double-commit on the same log entry when a node promises to some new leader, and then after a restart makes a second promise in the same log index to some older leader.
Allowing for corruption in persistent memory is somewhat similar to not having persistent memory at all, at least when dealing with crashes/restarts. The very piece of data/metadata we need to ensure safety and avoid double voting, as in the example above, may be corrupted and cannot be used after a restart. Nevertheless, the same precautions used for in-memory replicated state machines will work with corrupted storage as well and allow for safe recovery. For example, to prevent the double-voting case, a recovering node needs to run a "mock" leader election (or a leader election with a term guaranteed not to succeed). Such a leader election will ensure the node gets a proper view of the current ballot/term in the cluster, to make sure it no longer accepts votes from prior leaders. After such a mock election, the node can start accepting/voting for log entries while recovering any prior log and/or state machine from any of the replicas. Of course, the full recovery completes when enough data is shipped from other nodes (i.e., snapshots + missing log entries).
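The mock-election idea can be sketched as follows; the names and the shape of the quorum probe are my own, not from the paper:

```python
# Sketch of a "mock election": a recovering node probes a quorum for the
# terms currently in use (without trying to win), then refuses any vote
# at or below the learned term, preventing the double-promise scenario.

def learn_term(quorum_terms):
    # The max term observed in any quorum is a safe lower bound: any
    # later, successful election must have reached at least one of these
    # nodes with a higher term.
    return max(quorum_terms)

class RecoveringNode:
    def __init__(self, quorum_terms):
        self.term = learn_term(quorum_terms)

    def grant_vote(self, candidate_term):
        # Refuse anything at or below the learned term, so a stale leader
        # cannot extract a second, conflicting promise after the restart.
        return candidate_term > self.term

node = RecoveringNode(quorum_terms=[4, 7, 5])  # responses from a quorum of 3
assert node.term == 7
assert node.grant_vote(6) is False   # an older leader is rejected
assert node.grant_vote(8) is True    # a genuinely newer leader is fine
```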
There are a few differences between RAM and persistent storage when it comes to recovery. First of all, while it seems like both can lose data (one due to a reboot, the other due to some random corruption), persistent storage still leaves a hint that data is missing. This is like not remembering what the node has voted for or who was the leader, but still having a sixth sense that something was voted upon. This extra piece of information may be useful in recovery, and indeed the protocol from the paper takes advantage of it to improve fault tolerance and safety. The recovery protocol preserves safety when a majority of nodes fail at the same log index, as the protocol knows something is missing entirely and will halt for safety. In the RAM setting, a mass reboot (i.e., a majority of nodes) leads to a collective loss of memory without any hint that something may have been agreed upon, leading to a rewrite of the log.
The second difference is that persistent memory may not lose all the data, so fewer items must be shipped from the followers.
3) Leader-bound recovery. The paper suggests recovering followers from the leader node. This can put more load on the leader, which is already a bottleneck in the protocol. It seems like it may be possible to recover committed log entries from the followers (the paper already does so for leader recovery) to make the recovery procedure less demanding for the leader.
4) Byzantine. The paper touches a bit on this topic. Data corruption on disk can be viewed through the lens of Byzantine fault tolerance. The corruption causes a node to act outside of the protocol specs, and Byzantine-tolerant protocols are designed to handle such "out-of-spec" behaviors. The paper is a good example of how we can often solve some specific types of Byzantine behaviors without resorting to full-blown PBFT-style solutions. This is very practical, as we want the state machine to handle data corruption, but we do not want to pay the performance penalty associated with BFT protocols.
5) Likelihood of data corruption. Another point of discussion was around the likelihood of such data faults happening. It does not seem like these are too frequent, but they do happen. We touched on a few anecdotal occurrences. For example, some firmware issues caused the disk to not write some large buffers of data.
It is also worth noting error correction. Error correction is standard for server-class memory, and it comes at a relatively small monetary/performance cost. Similar error-correction technologies are used in disks and drives, allowing small errors (i.e., a bit-flip) to be fixed by the drive. In fact, NAND flash SSDs rely on error correction in normal operation.
6) Infallible disk. Protocols assume the disk is always correct. Why? Even on the surface, this does not come across as a super-tight assumption. Especially at the scale of millions of SMR instances deployed across millions of machines.