Alleviate premature elections in followers 64/42564/7
author Tom Pantelis <tpanteli@brocade.com>
Tue, 26 Jul 2016 04:07:02 +0000 (00:07 -0400)
committer Tom Pantelis <tpanteli@brocade.com>
Tue, 2 Aug 2016 07:55:32 +0000 (03:55 -0400)
commit 364229dd715facec8ef8c73d6c60546c5f38b103
tree 9f456cbcd82f30ff57941d3101768f608477b608
parent b70b396725749d3fd6ca761f02f4b630f6f4f1ce
Alleviate premature elections in followers

If a follower actor is busy, or some non-leader messages take longer to
process, leader messages may get backed up enough to cause the election
timer to expire, resulting in an unwanted election and leader disruption.
To alleviate this, I added a Stopwatch that tracks the time since the last
leader message was received, i.e. each leader message that is received
restarts the Stopwatch. When ElectionTimeout is received, the follower
checks whether the elapsed time of the Stopwatch has exceeded the election
timeout interval. Therefore, if leader messages were sent during the
election timeout interval but were delayed, they will be processed before
the ElectionTimeout message and restart the Stopwatch, so the elapsed time
should be less than the election timeout interval by the time
ElectionTimeout is received (unless the last leader message happened to
take longer than the election timeout interval).
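
The check described above can be sketched roughly as follows. This is an
illustration only, not the code in Follower.java: the class name
ElectionTimeoutGuard and its methods are hypothetical, and an injectable
nanosecond clock stands in for the Stopwatch so the example has no
dependencies. Treating an elapsed time equal to the interval as expired is
also an assumption.

```java
import java.util.function.LongSupplier;

/** Hypothetical stand-in for the follower's check; not the actual Follower class. */
class ElectionTimeoutGuard {
    private final LongSupplier nanoClock;     // e.g. System::nanoTime in production
    private final long electionTimeoutNanos;  // the election timeout interval
    private long lastLeaderMessageNanos;      // plays the role of the Stopwatch start

    ElectionTimeoutGuard(LongSupplier nanoClock, long electionTimeoutNanos) {
        this.nanoClock = nanoClock;
        this.electionTimeoutNanos = electionTimeoutNanos;
        this.lastLeaderMessageNanos = nanoClock.getAsLong();
    }

    /** Called whenever a message from the leader is processed: restarts the timer. */
    void onLeaderMessage() {
        lastLeaderMessageNanos = nanoClock.getAsLong();
    }

    /**
     * Called when ElectionTimeout is processed. Returns true only if no leader
     * message has been processed within the election timeout interval.
     * (Assumption: elapsed == interval counts as expired.)
     */
    boolean shouldStartElection() {
        long elapsedNanos = nanoClock.getAsLong() - lastLeaderMessageNanos;
        return elapsedNanos >= electionTimeoutNanos;
    }
}
```

Because an actor mailbox is processed in order, a delayed leader message
queued ahead of ElectionTimeout calls onLeaderMessage() first, so the later
shouldStartElection() call sees a small elapsed time and returns false.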

There are cases where ElectionTimeout is sent manually to force an
election timeout (e.g. during leadership transfer). In these cases we
don't want to check the Stopwatch, so I added an explicit TimeoutNow
message to distinguish the two cases.
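
A minimal sketch of how the two messages might be told apart, under the
same assumptions as before: TimeoutDispatchSketch and its nested marker
types are hypothetical and only mimic the roles of ElectionTimeout and
TimeoutNow, and a boolean return value stands in for actually starting an
election.

```java
import java.util.function.LongSupplier;

/** Hypothetical sketch; the nested types only mimic ElectionTimeout and TimeoutNow. */
class TimeoutDispatchSketch {
    static final class ElectionTimeout { }  // fired by the election timer
    static final class TimeoutNow { }       // sent explicitly to force an election

    private final LongSupplier nanoClock;
    private final long electionTimeoutNanos;
    private long lastLeaderMessageNanos;

    TimeoutDispatchSketch(LongSupplier nanoClock, long electionTimeoutNanos) {
        this.nanoClock = nanoClock;
        this.electionTimeoutNanos = electionTimeoutNanos;
        this.lastLeaderMessageNanos = nanoClock.getAsLong();
    }

    /** Restarts the elapsed-time tracking when a leader message is processed. */
    void onLeaderMessage() {
        lastLeaderMessageNanos = nanoClock.getAsLong();
    }

    /** Returns true if the given message should trigger an election. */
    boolean handle(Object message) {
        if (message instanceof TimeoutNow) {
            // Forced timeout: skip the elapsed-time check entirely.
            return true;
        }
        if (message instanceof ElectionTimeout) {
            // Timer-driven timeout: only honor it if the leader has been silent.
            long elapsedNanos = nanoClock.getAsLong() - lastLeaderMessageNanos;
            return elapsedNanos >= electionTimeoutNanos;
        }
        return false; // other messages are out of scope for this sketch
    }
}
```

With separate message types, a forced timeout always leads to an election,
while a timer-driven ElectionTimeout is ignored if a leader message was
processed recently.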

Change-Id: I6b745288040da2fdcef1d29cb5ffc482c9e66003
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
12 files changed:
opendaylight/md-sal/sal-akka-raft/src/main/java/org/opendaylight/controller/cluster/raft/RaftActorServerConfigurationSupport.java
opendaylight/md-sal/sal-akka-raft/src/main/java/org/opendaylight/controller/cluster/raft/base/messages/TimeoutNow.java [new file with mode: 0644]
opendaylight/md-sal/sal-akka-raft/src/main/java/org/opendaylight/controller/cluster/raft/behaviors/Follower.java
opendaylight/md-sal/sal-akka-raft/src/main/java/org/opendaylight/controller/cluster/raft/behaviors/Leader.java
opendaylight/md-sal/sal-akka-raft/src/test/java/org/opendaylight/controller/cluster/raft/RaftActorServerConfigurationSupportTest.java
opendaylight/md-sal/sal-akka-raft/src/test/java/org/opendaylight/controller/cluster/raft/behaviors/DelayedMessagesElectionScenarioTest.java
opendaylight/md-sal/sal-akka-raft/src/test/java/org/opendaylight/controller/cluster/raft/behaviors/FollowerTest.java
opendaylight/md-sal/sal-akka-raft/src/test/java/org/opendaylight/controller/cluster/raft/behaviors/LeaderTest.java
opendaylight/md-sal/sal-akka-raft/src/test/java/org/opendaylight/controller/cluster/raft/behaviors/PartitionedCandidateOnStartupElectionScenarioTest.java
opendaylight/md-sal/sal-akka-raft/src/test/java/org/opendaylight/controller/cluster/raft/behaviors/PartitionedLeadersElectionScenarioTest.java
opendaylight/md-sal/sal-distributed-datastore/src/test/java/org/opendaylight/controller/cluster/datastore/ShardTest.java
opendaylight/md-sal/sal-distributed-datastore/src/test/java/org/opendaylight/controller/cluster/datastore/entityownership/EntityOwnershipShardTest.java