Fix intermittent LeaderTest/CandidateTest failures 65/34565/1
author Tom Pantelis <tpanteli@brocade.com>
Thu, 11 Feb 2016 08:24:53 +0000 (03:24 -0500)
committer Tom Pantelis <tpanteli@brocade.com>
Fri, 12 Feb 2016 15:45:48 +0000 (15:45 +0000)
commit fa7505bed86cdf45971f669e87813bb778ad8ac8
tree 737ab81d50a9f365fd11712bdf071a5f24f4e024
parent 5d80babc178948bb31c844979ac7ed382cbf5965
Fix intermittent LeaderTest/CandidateTest failures

The test cases in LeaderTest and CandidateTest have been failing
intermittently. A particular test in CandidateTest has recently started
failing fairly regularly on Jenkins for some reason.

The common denominator is that an initial message to an actor isn't
received and goes to dead letters instead, even though the actor was
just created. This seems related to the use of ActorSelection in the raft
behavior classes. I suspect a timing issue where the underlying actor
isn't actually created/available yet via actorSelection. I had seen this
in the past and attempted to alleviate it by adding a verifyActorReady
method to TestActorFactory that verifies, with retries, that the actor
can be obtained via actorSelection.resolveOne. However, either resolveOne
doesn't work as advertised or a successful call doesn't guarantee that a
subsequently sent message will be delivered.

I changed verifyActorReady to send an Identify message to the
actorSelection and verify that a successful response is received. On my
system LeaderTest would usually fail within 30 test runs. After the
change it ran successfully 400 times.
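
For illustration only, the retry-until-acknowledged idea can be sketched
with the JDK alone. This is not the code in TestActorFactory: the Akka
types are replaced by a hypothetical probe supplier that stands in for
"send Identify to the actorSelection and wait for the reply", and the
names ReadinessCheck and verifyReady, the attempt count, the timeout and
the 10 ms pause are all assumptions made for this sketch.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical, JDK-only model of a readiness check with retries.
final class ReadinessCheck {

    /**
     * Each call to probe.get() starts one round trip to the target (a stand-in
     * for sending Identify). The future must complete with true, within the
     * timeout, for the target to count as ready.
     */
    static boolean verifyReady(Supplier<CompletableFuture<Boolean>> probe,
                               int maxAttempts, long timeoutMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                Boolean answered = probe.get().get(timeoutMillis, TimeUnit.MILLISECONDS);
                if (Boolean.TRUE.equals(answered)) {
                    return true; // the target itself replied, so messages reach it
                }
                Thread.sleep(10); // negative reply: pause briefly, then retry
            } catch (TimeoutException | ExecutionException e) {
                // No reply in time, or the probe failed: not ready yet, retry.
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
        return false; // never got a positive reply
    }

    public static void main(String[] args) {
        // Simulated target that only starts answering on the third probe.
        AtomicInteger probes = new AtomicInteger();
        boolean ready = verifyReady(
                () -> CompletableFuture.completedFuture(probes.incrementAndGet() >= 3),
                5, 100);
        System.out.println("ready=" + ready + " after " + probes.get() + " probes");
    }
}
```

The point of the sketch is that readiness is decided by a reply to a
message that was actually delivered, rather than by a lookup succeeding.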

Change-Id: I2da7d4a4d14c68810e87fc64b711b5c80608f5d7
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
(cherry picked from commit 5e8721fd675825ec5c9f826aed61c97e22188960)
opendaylight/md-sal/sal-akka-raft/src/test/java/org/opendaylight/controller/cluster/raft/TestActorFactory.java