Bug 5504: Add PreLeader raft state 28/42728/4
author Tom Pantelis <tpanteli@brocade.com>
Wed, 27 Jul 2016 19:52:53 +0000 (15:52 -0400)
committer Tom Pantelis <tpanteli@brocade.com>
Fri, 5 Aug 2016 15:10:14 +0000 (15:10 +0000)
commit d86f990976dcc2879b40dec7df1b3b5fba8cba78
tree 6abeb2df0ead794fc6c59daee68fda3bda814dd3
parent a47dd7a5d21ca68804a6d0e2e3ca765f223c2ef4
Bug 5504: Add PreLeader raft state

The following scenario can result in a "store tree and candidate base
differ" IllegalStateException on commit:

A follower receives a replicate and adds it to the log, say at index 1,
but the leader transfers or dies before committing and applying it to the
state. The follower becomes leader and when the next tx is applied, log
index 2, it has to first apply all log entries from the previous term that
hadn't been committed yet, in this case index 1. Since we got consensus for
index 2 that means index 1 has also been replicated to a majority. Therefore
ApplyState is sent for index 1 and then index 2. However index 1 is applied
as a "foreign" candidate while index 2 is in the pre-commit state. When
index 2 is applied the commit fails.

To prevent this scenario, we introduce a new raft state, PreLeader, which
Candidate transitions to if there are uncommitted entries, i.e. commit
index < last log index. The PreLeader state performs all the duties of
Leader with the added behavior of attempting to commit all uncommitted
entries from the previous leader's term. Raft does not allow a leader to
commit entries from a previous term by simply counting replicas - only
entries from the leader's current term can be committed (§5.4.2). Rather
than waiting for a client interaction to commit a new entry, the PreLeader
state immediately appends a no-op entry (NoopPayload) to the log with the
leader's current term. Once the no-op entry is committed, all prior entries
are committed indirectly. Once all entries are committed, i.e. commitIndex
matches the last log index, it switches to the normal Leader state.

The PreLeader state is considered an inactive leader state and thus
client transactions are delayed until it transitions to Leader.
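
The transitions described above can be illustrated with a minimal,
self-contained sketch. This is not the code in this change: the class
PreLeaderSketch, its fields and its method names are hypothetical, the log
is an in-memory list, and replication is reduced to a single "majority has
replicated up to index N" callback. It only shows the rule that entries are
committed by replica count when they belong to the current term, and that
the no-op entry is what lets earlier entries commit.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the Candidate -> PreLeader -> Leader transition.
class PreLeaderSketch {
    enum State { CANDIDATE, PRE_LEADER, LEADER }

    // A log entry: the term it was appended in and whether it is a no-op.
    static final class Entry {
        final long term;
        final boolean noop;

        Entry(long term, boolean noop) {
            this.term = term;
            this.noop = noop;
        }
    }

    final List<Entry> log = new ArrayList<>();
    long currentTerm;
    long commitIndex = -1; // index of the last committed entry, -1 if none
    State state = State.CANDIDATE;

    long lastIndex() {
        return log.size() - 1;
    }

    // Called when this node wins the election for currentTerm.
    void onElected() {
        if (commitIndex < lastIndex()) {
            // Uncommitted entries remain from a previous term: become
            // PreLeader and append a no-op entry in the current term.
            state = State.PRE_LEADER;
            log.add(new Entry(currentTerm, true));
        } else {
            state = State.LEADER;
        }
    }

    // Called when a majority has replicated the log up to 'index' inclusive.
    void onMajorityReplicated(long index) {
        boolean inRange = index > commitIndex && index <= lastIndex();
        // Only an entry from the current term may be committed by counting
        // replicas; committing it commits all prior entries indirectly.
        if (inRange && log.get((int) index).term == currentTerm) {
            commitIndex = index;
        }
        if (state == State.PRE_LEADER && commitIndex == lastIndex()) {
            state = State.LEADER;
        }
    }

    // PreLeader is an inactive leader state: client transactions wait.
    boolean acceptsClientTransactions() {
        return state == State.LEADER;
    }

    public static void main(String[] args) {
        PreLeaderSketch node = new PreLeaderSketch();
        node.log.add(new Entry(1, false)); // index 0, committed in term 1
        node.log.add(new Entry(1, false)); // index 1, never committed
        node.commitIndex = 0;
        node.currentTerm = 2;

        node.onElected();
        System.out.println("after election: " + node.state);

        node.onMajorityReplicated(1); // term 1 entry: not committed directly
        System.out.println("commitIndex: " + node.commitIndex);

        node.onMajorityReplicated(2); // no-op from term 2: commits 1 and 2
        System.out.println("final state: " + node.state);
    }
}
```

In the sketch, index 1 stays uncommitted until the no-op at index 2 reaches
a majority, so no client transaction can be in pre-commit while an entry
from the previous term is applied.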

Change-Id: I20a541de0eba9b0075b9952dc6d5808943b7bb8f
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
18 files changed:
opendaylight/md-sal/sal-akka-raft/src/main/java/org/opendaylight/controller/cluster/raft/RaftActor.java
opendaylight/md-sal/sal-akka-raft/src/main/java/org/opendaylight/controller/cluster/raft/RaftState.java
opendaylight/md-sal/sal-akka-raft/src/main/java/org/opendaylight/controller/cluster/raft/behaviors/AbstractLeader.java
opendaylight/md-sal/sal-akka-raft/src/main/java/org/opendaylight/controller/cluster/raft/behaviors/AbstractRaftActorBehavior.java
opendaylight/md-sal/sal-akka-raft/src/main/java/org/opendaylight/controller/cluster/raft/behaviors/Candidate.java
opendaylight/md-sal/sal-akka-raft/src/main/java/org/opendaylight/controller/cluster/raft/behaviors/Leader.java
opendaylight/md-sal/sal-akka-raft/src/main/java/org/opendaylight/controller/cluster/raft/behaviors/PreLeader.java [new file with mode: 0644]
opendaylight/md-sal/sal-akka-raft/src/main/java/org/opendaylight/controller/cluster/raft/persisted/NoopPayload.java [new file with mode: 0644]
opendaylight/md-sal/sal-akka-raft/src/test/java/org/opendaylight/controller/cluster/raft/AbstractRaftActorIntegrationTest.java
opendaylight/md-sal/sal-akka-raft/src/test/java/org/opendaylight/controller/cluster/raft/MockRaftActorContext.java
opendaylight/md-sal/sal-akka-raft/src/test/java/org/opendaylight/controller/cluster/raft/PreLeaderScenarioTest.java [new file with mode: 0644]
opendaylight/md-sal/sal-akka-raft/src/test/java/org/opendaylight/controller/cluster/raft/RaftActorServerConfigurationSupportTest.java
opendaylight/md-sal/sal-akka-raft/src/test/java/org/opendaylight/controller/cluster/raft/behaviors/CandidateTest.java
opendaylight/md-sal/sal-akka-raft/src/test/java/org/opendaylight/controller/cluster/raft/behaviors/FollowerTest.java
opendaylight/md-sal/sal-akka-raft/src/test/java/org/opendaylight/controller/cluster/raft/behaviors/LeaderTest.java
opendaylight/md-sal/sal-distributed-datastore/src/main/java/org/opendaylight/controller/cluster/datastore/entityownership/EntityOwnershipShard.java
opendaylight/md-sal/sal-distributed-datastore/src/main/java/org/opendaylight/controller/cluster/datastore/shardmanager/ShardInformation.java
opendaylight/md-sal/sal-distributed-datastore/src/test/java/org/opendaylight/controller/cluster/datastore/DistributedDataStoreRemotingIntegrationTest.java