Bug 2187: EOS shard recovery after AddShardReplica 04/29904/6
author Tom Pantelis <tpanteli@brocade.com>
Wed, 18 Nov 2015 05:50:09 +0000 (00:50 -0500)
committer Gerrit Code Review <gerrit@opendaylight.org>
Wed, 2 Dec 2015 11:03:03 +0000 (11:03 +0000)
commit e04c7f93b0b614580c45318585f7709192465757
tree 335cc869005866c2f78288fd9886c27f8e3db2e1
parent 733636ec5f1b4caecd130a6a26f6d196af6ff854
Bug 2187: EOS shard recovery after AddShardReplica

On restart after an EOS shard replica is added and persisted, the
ShardManager recovers its snapshot and attempts to add the local member
to the shard replicas in the configuration. However, since there's no
static module configuration for the EOS shard, the ShardManager can't
create the shard when recovery completes. The shard does get created on
the subsequent CreateShard message; however, if there are no local
shards in the static configuration, it is created as inactive, i.e. with
the DisableElectionsRaftPolicy, which we don't want.

To address this, the ShardManager now stores its recovered snapshot
and, on CreateShard, checks whether the shard is in the recovered shard
list. If it is, the shard was pre-existing, so it is not initialized
with the DisableElectionsRaftPolicy.
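
The decision described above can be sketched roughly as follows. This
is a simplified illustration, not the actual ShardManager code: the
class ShardManagerSketch, its methods and the use of plain shard-name
strings are all hypothetical, and the "no local shards in the static
configuration" condition is modelled literally from the description.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified stand-in for the ShardManager logic described above.
final class ShardManagerSketch {
    // Policy name taken from the commit message; the package is deliberately omitted.
    static final String DISABLE_ELECTIONS_POLICY = "DisableElectionsRaftPolicy";

    // Shard names found in the recovered snapshot (empty on a fresh start).
    private Set<String> recoveredShardNames = Collections.emptySet();
    // Local shards listed in the static module configuration (assumed input).
    private final Set<String> staticLocalShards;

    ShardManagerSketch(Set<String> staticLocalShards) {
        this.staticLocalShards = new HashSet<>(staticLocalShards);
    }

    // Remember which shards existed before the restart.
    void onSnapshotRecovered(List<String> shardNames) {
        recoveredShardNames = new HashSet<>(shardNames);
    }

    // Returns the custom raft policy class name to use for a shard created via
    // CreateShard, or null when the default policy should apply.
    String raftPolicyOnCreateShard(String shardName) {
        if (recoveredShardNames.contains(shardName)) {
            // Pre-existing shard: must take part in elections after restart.
            return null;
        }
        // Assumption: a new shard on a node with no static local shards starts inactive.
        return staticLocalShards.isEmpty() ? DISABLE_ELECTIONS_POLICY : null;
    }
}
```

With no recovered snapshot, a CreateShard for the EOS shard on a node
without static local shards yields the DisableElectionsRaftPolicy; once
the shard name appears in the recovered list, the same call returns no
custom policy.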

I extended
DistributedEntityOwnershipIntegrationTest::testEntityOwnershipShardBootstrapping
to restart the newly created replica and verify that it's reinstated
properly. I added the customRaftPolicyClassName to the OnDemandRaftState
so the test can verify this.

Testing revealed some timing issues in the EntityOwnershipShard on
reinstatement where pending modifications weren't sent to the leader.
The EntityOwnershipShard does respond to raft behavior state changes by
sending pending modifications but, on startup, if the shard stays in the
follower state then no behavior change occurs. In that case the leaderId
changes and onLeaderChanged is called, so I changed it to also notify
the commit coordinator to commit the next batched transaction, if any. I
also did the same for onPeerUp since, in some test runs, the MemberUp
event hadn't occurred yet.
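
The timing fix can be illustrated with the sketch below. It is only a
model of the idea, not the EntityOwnershipShard implementation: the
class PendingCommitSketch, the string-based modifications and the rule
that nothing is sent while no leader is known are assumptions made for
the example. Only the callback names onLeaderChanged and onPeerUp come
from the description above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical model of a shard that batches modifications until it can reach a leader.
final class PendingCommitSketch {
    private final Deque<String> pending = new ArrayDeque<>();
    private final List<String> sentToLeader = new ArrayList<>();
    private String leaderId; // null until a leader is known

    // Queue a modification that cannot be sent yet.
    void queue(String modification) {
        pending.add(modification);
    }

    // Previously the only trigger: a raft behavior change (e.g. follower to leader).
    void onStateChanged() {
        commitNextBatch();
    }

    // New trigger: a follower that never changes behavior still learns its leader here.
    void onLeaderChanged(String oldLeader, String newLeader) {
        leaderId = newLeader;
        commitNextBatch();
    }

    // New trigger: covers runs where the peer becomes reachable after the leader is known.
    void onPeerUp(String peerId) {
        commitNextBatch();
    }

    // Send the next batched modification, if any, once a leader is known.
    private void commitNextBatch() {
        if (leaderId == null) {
            return;
        }
        String next = pending.poll();
        if (next != null) {
            sentToLeader.add(next);
        }
    }

    List<String> sent() {
        return sentToLeader;
    }
}
```

In this model a shard that starts as a follower and stays one never
sees onStateChanged, so without the two extra triggers its queued
modification would remain pending indefinitely.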

Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
Change-Id: Id6bf966e0aa9a0f12f30327c617cb84f10e6b10f
opendaylight/md-sal/sal-akka-raft/src/main/java/org/opendaylight/controller/cluster/raft/RaftActor.java
opendaylight/md-sal/sal-akka-raft/src/main/java/org/opendaylight/controller/cluster/raft/client/messages/OnDemandRaftState.java
opendaylight/md-sal/sal-distributed-datastore/src/main/java/org/opendaylight/controller/cluster/datastore/ShardManager.java
opendaylight/md-sal/sal-distributed-datastore/src/main/java/org/opendaylight/controller/cluster/datastore/entityownership/EntityOwnershipShard.java
opendaylight/md-sal/sal-distributed-datastore/src/test/java/org/opendaylight/controller/cluster/datastore/entityownership/DistributedEntityOwnershipIntegrationTest.java