Fix intermittent testFollowerResyncWith*LeaderRestart failure 17/62417/2
author Tom Pantelis <tompantelis@gmail.com>
Mon, 28 Aug 2017 19:44:31 +0000 (15:44 -0400)
committer Tom Pantelis <tompantelis@gmail.com>
Wed, 27 Sep 2017 03:50:19 +0000 (03:50 +0000)
commit 5fdf80c2b78f3c33500522c0c4d4ad0a37f82c12
tree ed482f86eb67b2959233ee44433087a5074b3060
parent 5ebbb1f1bf1cabf30ae5098fab1ca9c6dc8e921c
Fix intermittent testFollowerResyncWith*LeaderRestart failure

NonVotingFollowerIntegrationTest#testFollowerResyncWithOneMoreLeaderLogEntryAfterNonPersistentLeaderRestart fails intermittently:

NonVotingFollowerIntegrationTest.testFollowerResyncWithOneMoreLeaderLogEntryAfterNonPersistentLeaderRestart:233 Did not receive message of type class org.opendaylight.controller.cluster.raft.base.messages.SnapshotComplete

This seems to be a side-effect of https://git.opendaylight.org/gerrit/#/c/62255/,
which changes the timing a bit such that an install snapshot no longer occurs on the
follower. A snapshot install is needed to completely re-sync the follower with the
leader. Instead, the follower removes the stale out-of-sync entries and appends the
new ones from the leader. That gets the journal up-to-date, but the stale entries had
already been applied to the state, which leaves the state out-of-sync with the journal.
I added an additional check in the follower to force the leader to install a snapshot
if the first out-of-sync log entry index <= the lastAppliedIndex, which means the
entries to be removed have already been applied to the state.
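
The check described above can be sketched roughly as follows. This is an
illustrative stand-alone sketch, not the actual Follower.java change: the class
FollowerResyncSketch, the Entry type, and both method names are hypothetical, and
it assumes an out-of-sync entry is detected by a term mismatch at the same index,
as in standard Raft.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; none of these names come from sal-akka-raft.
public class FollowerResyncSketch {

    /** Minimal stand-in for a replicated log entry (index plus term). */
    public static final class Entry {
        public final long index;
        public final long term;

        public Entry(long index, long term) {
            this.index = index;
            this.term = term;
        }
    }

    /**
     * Returns the index of the first leader entry whose term differs from the
     * follower's entry at the same index, or -1 if no such conflict exists.
     * Assumption: a conflict is a term mismatch at an equal index.
     */
    public static long firstOutOfSyncIndex(List<Entry> followerLog, List<Entry> leaderEntries) {
        for (Entry fromLeader : leaderEntries) {
            for (Entry local : followerLog) {
                if (local.index == fromLeader.index && local.term != fromLeader.term) {
                    return fromLeader.index;
                }
            }
        }
        return -1;
    }

    /**
     * True when the stale entries that would be removed have already been
     * applied to the state, so trimming and re-appending the journal alone
     * would leave the state out-of-sync. In that case the follower should
     * ask the leader to install a snapshot instead.
     */
    public static boolean shouldForceInstallSnapshot(long firstOutOfSyncIndex, long lastAppliedIndex) {
        return firstOutOfSyncIndex >= 0 && firstOutOfSyncIndex <= lastAppliedIndex;
    }

    public static void main(String[] args) {
        // Follower holds entries 0..2 from term 1; the restarted leader sends
        // entries 1..2 from term 2, so index 1 is the first conflict.
        List<Entry> followerLog = Arrays.asList(new Entry(0, 1), new Entry(1, 1), new Entry(2, 1));
        List<Entry> leaderEntries = Arrays.asList(new Entry(1, 2), new Entry(2, 2));

        long conflict = firstOutOfSyncIndex(followerLog, leaderEntries);
        System.out.println("first out-of-sync index: " + conflict);
        System.out.println("force snapshot (lastApplied=2): " + shouldForceInstallSnapshot(conflict, 2));
        System.out.println("force snapshot (lastApplied=0): " + shouldForceInstallSnapshot(conflict, 0));
    }
}
```

With lastApplied at 2, the conflicting entries at indexes 1 and 2 have already
reached the state, so the sketch asks for a snapshot. With lastApplied at 0, they
have not, and removing and re-appending the journal entries is enough.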

Change-Id: Ic3815a694a8531d9f7f42f19ad8978d52fc902b3
Signed-off-by: Tom Pantelis <tompantelis@gmail.com>
(cherry picked from commit 88e2974b8d391d6e91a6338b0a1b8dbf966a8a71)
opendaylight/md-sal/sal-akka-raft/src/main/java/org/opendaylight/controller/cluster/raft/behaviors/Follower.java