Bug 7065 - sal-cluster-admin does not export java binding interface in MANIFEST.MF Decouple the cluster-admin api and impl, and export the cluster-admin java binding api. Change-Id: Iac19d722bd805310ba8eb1dcd1341b0b1e5741bd Signed-off-by: Geng Xingyuan <geng.xingyuan@zte.com.cn>
Fix CS warnings in sal-cluster-admin and enable enforcement Fixed checkstyle warnings and enabled enforcement. Most of the warnings/changes were for: - white space before if/for/while/catch - line too long - illegal catching of Exception (suppressed) - adding final for locals declared too far from first usage Change-Id: I0b78c01398a1c62220980e0c8ad22db288208d59 Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
Fix intermittent failure in testRemoveShardReplica This test has failed on jenkins a few times: java.lang.AssertionError: Shard cars is present at org.junit.Assert.fail(Assert.java:88) at org.opendaylight.controller.cluster.datastore.MemberNode.verifyNoShardPresent(MemberNode.java:174) at org.opendaylight.controller.cluster.datastore.admin.ClusterAdminRpcServiceTest.testRemoveShardReplica(ClusterAdminRpcServiceTest.java:373) The log output indicates member-2 hadn't re-joined with member-1 yet after it was restarted. So when RemoveServer was sent to member-1 to remove member-2, it tried to send the ServerRemoved message to the member-2 shard but it wasn't delivered and thus the shard wasn't shut down and removed. To alleviate this I added a waitTillReady call on member-2's config data store to ensure it has synced with the shard leader on member-1. Change-Id: I8de9e585998d9f7b2ab8e4fd3f23c1ab222886cc Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
Move ServerConfigurationPayload to cluster.raft.persisted This introduces its mirror copy and modifies the old class so that it readResolve()s to the new class. It also adjusts all users to use the new class. The new class uses the Externalizable proxy pattern to allow the class itself to be evolved without breaking compatibility. NoOpPayload is also retrofitted this way, so that no subclass of Payload has its serialization format tied to Payload itself. Change-Id: I26010a9e1438dbc4cb1822e1c4dbb51e2b6e538e Signed-off-by: Robert Varga <rovarga@cisco.com>
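The Externalizable proxy pattern mentioned above can be sketched roughly as follows. This is a minimal illustration and not the actual ServerConfigurationPayload code: ExamplePayload, its fields, and the roundTrip helper are hypothetical names chosen for the example. The idea is that the payload class never writes its own fields; a nested proxy owns the wire format, so the payload class can change shape as long as the proxy keeps reading and writing the same stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical payload used only for illustration.
final class ExamplePayload implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String serverId;
    private final boolean voting;

    ExamplePayload(String serverId, boolean voting) {
        this.serverId = serverId;
        this.voting = voting;
    }

    String serverId() {
        return serverId;
    }

    boolean isVoting() {
        return voting;
    }

    // Java serialization calls this and writes the proxy instead of this object.
    private Object writeReplace() {
        return new Proxy(this);
    }

    // The wire format lives entirely in the proxy.
    private static final class Proxy implements Externalizable {
        private static final long serialVersionUID = 1L;

        private ExamplePayload payload;

        // Externalizable requires a public no-argument constructor.
        public Proxy() {
        }

        Proxy(ExamplePayload payload) {
            this.payload = payload;
        }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeUTF(payload.serverId);
            out.writeBoolean(payload.voting);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            payload = new ExamplePayload(in.readUTF(), in.readBoolean());
        }

        // After reading, hand back the real payload rather than the proxy.
        private Object readResolve() {
            return payload;
        }
    }

    // Hypothetical helper: serialize and deserialize in memory.
    static ExamplePayload roundTrip(ExamplePayload payload) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(payload);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (ExamplePayload) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Serialization round trip failed", e);
        }
    }
}
```

A class left behind at an old location can, in the same spirit, define readResolve() to return an instance of the relocated class, which is what the commit describes for the old ServerConfigurationPayload.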
Fix intermittent failure in ClusterAdminRpcServiceTest testRemoveShardLeaderReplica(org.opendaylight.controller.cluster.datastore.admin.ClusterAdminRpcServiceTest) Time elapsed: 8.187 sec <<< FAILURE! java.lang.AssertionError: Leader Id Expected: (a string containing "member-2" or a string containing "member-3") but: was "member-1-shard-cars-config_testRemoveShardLeaderReplica" at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) at org.junit.Assert.assertThat(Assert.java:865) at org.opendaylight.controller.cluster.datastore.admin.ClusterAdminRpcServiceTest$2.verify(ClusterAdminRpcServiceTest.java:412) at org.opendaylight.controller.cluster.datastore.MemberNode.verifyRaftState(MemberNode.java:140) at org.opendaylight.controller.cluster.datastore.admin.ClusterAdminRpcServiceTest.testRemoveShardLeaderReplica(ClusterAdminRpcServiceTest.java:409) member3 tried to become leader but hadn't gotten MemberUp for member2 yet so it didn't have its address when it sent out RequestVote. The verification of the new leader timed out before it could try again. The call to waitForMembersUp on line 397 should be for replica3 and not replica2. Change-Id: I3a714c91ba974b16b2c310027b09f9658915a639 Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
Fix intermittent test failures in CDS Seeing intermittent failures on jenkins, e.g. Failed tests: PartitionedLeadersElectionScenarioTest.runTest1:37->setupInitialMemberBehaviors:313->AbstractLeaderElectionScenarioTest.initializeLeaderBehavior:207 Missing messages of type class org.opendaylight.controller.cluster.raft.messages.AppendEntriesReply Sometimes the initial AppendEntries messages go to dead letters, probably b/c the follower actors haven't been fully created/initialized by akka. So added retries as a workaround. Failed tests: ClusterAdminRpcServiceTest.testChangeMemberVotingStatesForShard:555->verifySuccessfulRpcResult:296 Rpc failed with error: RpcError [message=Failed to change member voting states for shard cars: Shard member-3-shard-cars-config_testChangeMemberVotingStatusForShard currently has no leader. Try again later., severity=ERROR, errorType=RPC, tag=operation-failed, applicationTag=null, info=null, cause=null] The test needs to ensure node3's datastore shards are ready with leaders. Change-Id: I5031c2a7b3e6eeddbf80b8eb346492acd11d664c Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
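The retry workaround for messages lost to dead letters might look something like the sketch below. It is an assumption about the general shape only: Retries and retry are hypothetical names, and the real test code sends actor messages rather than calling a Supplier. The attempt is repeated until it yields a reply or the attempt budget is used up.

```java
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical helper, not taken from the actual test code.
final class Retries {
    private Retries() {
    }

    // Runs the attempt up to maxAttempts times and returns the first
    // non-empty result, or an empty Optional if every attempt came back empty.
    static <T> Optional<T> retry(int maxAttempts, Supplier<Optional<T>> attempt) {
        for (int i = 0; i < maxAttempts; i++) {
            Optional<T> result = attempt.get();
            if (result.isPresent()) {
                return result;
            }
        }
        return Optional.empty();
    }
}
```

In a test, the attempt would resend the initial message and briefly wait for the expected reply, so that a first message dropped before the follower actor is initialized does not fail the whole test.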
BUG-5280: introduce DistributedDataStoreClientActor This patch introduces a common ClientActor, which keeps track of frontend generations. Also introduce bind for DistributedDataStore, which uses this common infrastructure. Interface between the DistributedDataStore and the actor world is captured as DistributedDataStoreClient. Change-Id: I42c3281ca790fb5615a593740424ac494469e6a7 Signed-off-by: Robert Varga <rovarga@cisco.com>
Implement cluster admin RPCs to change member voting states Added 3 new RPCs for changing voting states: change-member-voting-states-for-shard change-member-voting-states-for-all-shards flip-member-voting-states-for-all-shards These replace the original ones added in Be that weren't implemented. They were added as placeholders based on how it was thought it would work at that time. New related ShardManager messages were added that are sent by the ClusterAdminRpcService. The flip-member-voting-states-for-all-shards RPC is a shortcut that obtains the current voting states via the GetOnDemandRaftState message to the RaftActor and inverts them. New fields were added to the OnDemandRaftState response to return the voting states. Modified the ShardStats JMX bean to report the new OnDemandRaftState fields. Added a check in RaftActorServerConfigurationSupport to ensure that there's at least 1 voting member, otherwise one can end up with an unusable shard with no ability to elect a leader. Fixed a couple of bugs in Leader and AbstractLeader that were found during testing. AbstractLeader needs to take into account the follower's voting state when determining if the leader is isolated. Change-Id: I58686e3ce94d58de7cf289e55bb717ba46bc1de5 Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
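The flip operation and the "at least 1 voting member" check described above can be illustrated with a small sketch. This is not the RaftActorServerConfigurationSupport or ClusterAdminRpcService code: VotingStates and its methods are hypothetical, and voting states are modeled simply as a map from member name to a boolean.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of per-member voting states.
final class VotingStates {
    private VotingStates() {
    }

    // Returns a new map with every member's voting flag inverted.
    static Map<String, Boolean> flip(Map<String, Boolean> current) {
        Map<String, Boolean> flipped = new LinkedHashMap<>();
        current.forEach((member, voting) -> flipped.put(member, !voting));
        return flipped;
    }

    // Rejects a configuration with no voting member, since such a shard
    // could never elect a leader.
    static void checkAtLeastOneVoting(Map<String, Boolean> states) {
        if (!states.containsValue(Boolean.TRUE)) {
            throw new IllegalArgumentException(
                "Invalid voting states: at least one member must be voting");
        }
    }
}
```

Under this model, flipping a configuration in which every member is voting produces one with no voting members, which is exactly the case the check is meant to reject.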
BUG-5280: use MemberName instead of String Codebase uses Strings to identify various entities throughout the code. Since we have introduced MemberName as an Identifier, use that instead of a plain string to improve type safety and clarity throughout users. Change-Id: Iace25ef2c7cda0ea94449d1543d4ca73b80fb591 Signed-off-by: Robert Varga <rovarga@cisco.com>
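To illustrate the type-safety argument, here is a minimal sketch of a dedicated identifier type. ExampleMemberName is a hypothetical stand-in and does not reproduce the real MemberName class; the point is only that a method taking this type cannot accidentally be passed a shard name or any other arbitrary String.

```java
import java.util.Objects;

// Hypothetical identifier type wrapping a member name.
final class ExampleMemberName {
    private final String name;

    private ExampleMemberName(String name) {
        this.name = name;
    }

    // Factory method; rejects null names up front.
    static ExampleMemberName forName(String name) {
        return new ExampleMemberName(Objects.requireNonNull(name, "name"));
    }

    String getName() {
        return name;
    }

    @Override
    public boolean equals(Object obj) {
        return obj instanceof ExampleMemberName
            && name.equals(((ExampleMemberName) obj).name);
    }

    @Override
    public int hashCode() {
        return name.hashCode();
    }

    @Override
    public String toString() {
        return "ExampleMemberName{name=" + name + "}";
    }
}
```

Because equals and hashCode are value-based, instances can be used as map keys in the same places a plain String was used before.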
Move ClusterAdminRpcService to its own bundle The ClusterAdminRpcService can't be instantiated with the clustered datastore blueprint xml b/c it needs the binding RPC registry service so I moved it to its own bundle. I made the ClusterAdminProviderModule a no-op since the ClusterAdminRpcService is now created via blueprint. I also had to export some packages from the sal-distributed-datastore bundle. Change-Id: Icaf025517ed9b08a82a81310f1e5dd2ac0647559 Signed-off-by: Tom Pantelis <tpanteli@brocade.com>