Bug in AbstractLeader replication consensus 22/32722/4
author Tom Pantelis <tpanteli@brocade.com>
Wed, 13 Jan 2016 21:14:27 +0000 (16:14 -0500)
committer Gerrit Code Review <gerrit@opendaylight.org>
Mon, 18 Jan 2016 18:16:43 +0000 (18:16 +0000)
commit 5a8f765f5a6a24019d3ff6121d40a6594ba08f19
tree 68815acdbcbc91d9ba6b57a17f5cbaedf0902726
parent 1bbdb10607c6c198cb80ec506cd5637098bb917d
Bug in AbstractLeader replication consensus

I ran into an issue where the leader's commit index wasn't advancing
for new log entries even though consensus was reached. This scenario can
occur if the leader previously didn't get consensus and thus didn't commit
and apply a log entry and later regains leadership with a higher term.

The code in handleAppendEntriesReply doesn't update the commit index
if an entry's term doesn't match the current term. This behavior is correct
as per the Raft paper, §5.4.1: "Raft never commits log entries from
previous terms by counting replicas". However, the code also breaks out
of the loop and thus can never make progress on new entries in the current
term that reach consensus. This part is incorrect: as per Raft, "once an
entry from the current term is committed by counting replicas, then all
prior entries are committed indirectly". Therefore we need to continue
processing subsequent log entries in order to eventually make progress.
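
The intended loop behavior can be sketched roughly as follows. This is
a simplified illustration, not the actual AbstractLeader code: the class
name, the method advanceCommitIndex, the array-based log and the 0-based
indexes are all assumptions made for the example.

```java
// Hypothetical sketch: names and data layout are not taken from AbstractLeader.
final class CommitIndexSketch {

    /**
     * Returns the new commit index for a leader.
     *
     * @param logTerms     term of each log entry, indexed by log position (0-based)
     * @param commitIndex  current commit index, or -1 if nothing is committed
     * @param matchIndexes highest replicated log index per follower
     * @param currentTerm  the leader's current term
     */
    static int advanceCommitIndex(long[] logTerms, int commitIndex,
                                  int[] matchIndexes, long currentTerm) {
        int clusterSize = matchIndexes.length + 1; // followers plus the leader
        int majority = clusterSize / 2 + 1;

        for (int n = commitIndex + 1; n < logTerms.length; n++) {
            int replicas = 1; // the leader always has the entry
            for (int matchIndex : matchIndexes) {
                if (matchIndex >= n) {
                    replicas++;
                }
            }

            if (replicas < majority) {
                // Later entries cannot have more replicas than this one.
                break;
            }

            if (logTerms[n] == currentTerm) {
                // Committing a current-term entry indirectly commits all prior ones.
                commitIndex = n;
            }
            // An entry from a previous term is skipped, NOT treated as a stop
            // condition: keep scanning for a current-term entry with consensus.
        }
        return commitIndex;
    }
}
```

With logTerms = {1, 1, 3}, commitIndex = 0, both followers at match index
2 and currentTerm = 3, the sketch skips the term-1 entry at index 1 and
advances the commit index to 2. Breaking out of the loop at index 1
instead would leave the commit index stuck at 0.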

Change-Id: I2d093848c3a846e1f6420ac695b4ff652a65bf6b
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
opendaylight/md-sal/sal-akka-raft/src/main/java/org/opendaylight/controller/cluster/raft/behaviors/AbstractLeader.java
opendaylight/md-sal/sal-akka-raft/src/test/java/org/opendaylight/controller/cluster/raft/behaviors/LeaderTest.java