Fix intermittent LeaderTest/CandidateTest failures 44/34544/2
author Tom Pantelis <tpanteli@brocade.com>
Thu, 11 Feb 2016 08:24:53 +0000 (03:24 -0500)
committer Gerrit Code Review <gerrit@opendaylight.org>
Fri, 12 Feb 2016 15:45:41 +0000 (15:45 +0000)
commit 5e8721fd675825ec5c9f826aed61c97e22188960
tree 76c99c34e3d5c71c191f617d5020d6d2b4581729
parent 224aa4f574c63576961dc9dc37e075e2e5096a5a
Fix intermittent LeaderTest/CandidateTest failures

The test cases in LeaderTest and CandidateTest have been failing
intermittently. One test in CandidateTest has recently started failing
fairly regularly on Jenkins, for reasons that are unclear.

The common denominator is that an initial message to an actor isn't
received and goes to dead letters instead, even though the actor was
just created. This seems related to the use of ActorSelection in the raft
behavior classes. I suspect a timing issue where the underlying actor
isn't actually created/available yet via actorSelection. I had seen this
in the past and attempted to alleviate it by adding a verifyActorReady
method to TestActorFactory that verifies, with retries, that the actor can
be obtained via actorSelection.resolveOne. However, it appears that either
resolveOne doesn't work as advertised or a successful call doesn't
guarantee that a subsequent message will be delivered.

I changed verifyActorReady to send an Identify message to the
actorSelection and verify that a successful response comes back. On my
system, LeaderTest would usually fail within 30 test runs. After the
change, it ran successfully 400 times.
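
For illustration only, here is a minimal sketch of the retry-until-verified
pattern described above. It is not the actual TestActorFactory code. The
class name ActorReadyCheck, the method verifyReady, and its parameters are
hypothetical, and the Akka round trip is abstracted behind a probe so the
sketch needs only the JDK. In a real test the probe would presumably send
Identify to the actorSelection and return true only when the reply carries
an actor reference. Whether the changed method still retries is also an
assumption.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper sketching a "verify with retries" readiness check. */
final class ActorReadyCheck {
    private ActorReadyCheck() {
    }

    /**
     * Runs the probe up to maxAttempts times, sleeping between failed attempts.
     *
     * @param probe       assumed to perform one round trip to the actor (for
     *                    example an Identify request) and report success
     * @param maxAttempts maximum number of probe invocations, at least 1
     * @param sleepMillis pause between failed attempts, in milliseconds
     * @return the 1-based attempt number on which the probe succeeded
     * @throws IllegalStateException if no attempt succeeds
     */
    static int verifyReady(BooleanSupplier probe, int maxAttempts, long sleepMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (probe.getAsBoolean()) {
                return attempt;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException e) {
                    // Preserve the interrupt flag and give up early.
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("Interrupted while waiting for actor", e);
                }
            }
        }
        throw new IllegalStateException("Actor not ready after " + maxAttempts + " attempts");
    }
}
```

The point of probing with a real request/response rather than a lookup is
that a reply proves the actor's mailbox is reachable, which is the property
the tests depend on.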

Change-Id: I2da7d4a4d14c68810e87fc64b711b5c80608f5d7
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
opendaylight/md-sal/sal-akka-raft/src/test/java/org/opendaylight/controller/cluster/raft/TestActorFactory.java