... Requirements: ClusterManagement.ClusterManagement_Setup must be called before Shard_Stability_Init
...
... It is possible to use it for stateless comparison.
+... Variable @{DEFAULT_SHARD_LIST} contains default module shards.
Library Collections
Library String
Resource ${CURDIR}/ClusterManagement.robot
*** Variables ***
&{stored_details}
+@{DEFAULT_SHARD_LIST} default:config default:operational topology:config topology:operational inventory:config inventory:operational
*** Keywords ***
Shards_Stability_Init_Details
Library Collections
Resource ${CURDIR}/../ClusterManagement.robot
Resource ${CURDIR}/../MdsalLowlevel.robot
+Resource ${CURDIR}/../ShardStability.robot
Resource ${CURDIR}/../WaitForFailure.robot
*** Variables ***
Verify_Singleton_Constant_During_Isolation
[Documentation] Iterate over all non-isolated cluster nodes. They should return the correct constant.
: FOR ${index} IN @{cs_all_indices}
+ \ BuiltIn.Run_Keyword_If "${index}" == "${cs_isolated_index}" BuiltIn.Log Node not triggered, behavior not well described, see bugs 8207, 8214.
\ BuiltIn.Run_Keyword_Unless "${index}" == "${cs_isolated_index}" Verify_Singleton_Constant_On_Node ${index} ${CS_CONSTANT_PREFIX}${cs_owner}
Isolate_Owner_And_Verify_Isolated
BuiltIn.Set_Suite_Variable ${cs_isolated_index} ${cs_owner}
${non_isolated_list} = ClusterManagement.List_Indices_Minus_Member ${cs_isolated_index} member_index_list=${cs_all_indices}
${node_to_ask} = Collections.Get_From_list ${non_isolated_list} 0
+ BuiltIn.Wait_Until_Keyword_Succeeds 60s 10s ShardStability.Shards_Stability_Get_Details ${DEFAULT_SHARD_LIST} member_index_list=${non_isolated_list}
BuiltIn.Wait_Until_Keyword_Succeeds 10s 2s ClusterManagement.Check_New_Owner_Got_Elected_For_Device ${CS_DEVICE_NAME} ${CS_DEVICE_TYPE} ${cs_isolated_index}
... ${node_to_ask}
Get_And_Save_Present_CsOwner_And_CsCandidates ${node_to_ask}
Rejoin_Node_And_Verify_Rejoined
[Documentation] Rejoin isolated node.
ClusterManagement.Rejoin_Member_From_List_Or_All ${cs_isolated_index}
- BuiltIn.Wait_Until_Keyword_Succeeds 10s 2s Verify_Singleton_Constant_On_Node ${cs_isolated_index} ${CS_CONSTANT_PREFIX}${cs_owner}
+ BuiltIn.Wait_Until_Keyword_Succeeds 60s 10s ShardStability.Shards_Stability_Get_Details ${DEFAULT_SHARD_LIST}
+ BuiltIn.Wait_Until_Keyword_Succeeds 60s 3s Verify_Singleton_Constant_On_Node ${cs_isolated_index} ${CS_CONSTANT_PREFIX}${cs_owner}
Register_Flapping_Singleton_On_Nodes
[Arguments] ${index_list}
Resource ${CURDIR}/../ClusterManagement.robot
Resource ${CURDIR}/../MdsalLowlevel.robot
Resource ${CURDIR}/../TemplatedRequests.robot
+Resource ${CURDIR}/../ShardStability.robot
Resource ${CURDIR}/../WaitForFailure.robot
*** Variables ***
${idx_trans_as_list} = BuiltIn.Create_List ${idx_trans}
MdsalLowlevelPy.Start_Write_Transactions_On_Nodes ${ip_trans_as_list} ${idx_trans_as_list} ${ID_PREFIX} ${DURATION_30S} ${TRANSACTION_RATE_1K} chained_flag=${False}
ClusterAdmin.Make_Leader_Local ${idx_to} ${shard_name} ${shard_type}
- ${new_leader} ${new_followers} = BuiltIn.Wait_Until_Keyword_Succeeds 10s 1s ClusterManagement.Verify_Shard_Leader_Elected ${shard_name}
+ ${new_leader} ${new_followers} = BuiltIn.Wait_Until_Keyword_Succeeds 30s 5s ClusterManagement.Verify_Shard_Leader_Elected ${shard_name}
... ${shard_type} ${True} ${idx_from}
BuiltIn.Should_Be_Equal ${idx_to} ${new_leader}
${resp_list} = MdsalLowlevelPy.Wait_For_Transactions
${idx_trans_as_list} = BuiltIn.Create_List ${idx_trans}
MdsalLowlevelPy.Start_Produce_Transactions_On_Nodes ${ip_trans_as_list} ${idx_trans_as_list} ${ID_PREFIX} ${DURATION_30S} ${TRANSACTION_RATE_1K}
MdsalLowlevel.Become_Prefix_Leader ${idx_to} ${shard_name} ${ID_PREFIX}
- ${new_leader} ${new_followers} = BuiltIn.Wait_Until_Keyword_Succeeds 10s 1s ClusterManagement.Verify_Shard_Leader_Elected ${shard_name}!!
+ ${new_leader} ${new_followers} = BuiltIn.Wait_Until_Keyword_Succeeds 30s 5s ClusterManagement.Verify_Shard_Leader_Elected ${shard_name}!!
... ${shard_type} ${True} ${idx_from}
BuiltIn.Should_Be_Equal ${idx_to} ${new_leader}
${resp_list} = MdsalLowlevelPy.Wait_For_Transactions
Leader_Isolation_Test_Templ
[Arguments] ${heal_timeout} ${shard_name}=${SHARD_NAME} ${shard_type}=${SHARD_TYPE}
[Documentation] Implements leader isolation test scenario.
+ ${li_isolated} BuiltIn.Set_Variable ${False}
${producing_transactions_time} = BuiltIn.Set_Variable ${${heal_timeout}+60}
${all_indices} = ClusterManagement.List_All_Indices
${leader} ${follower_list} = ClusterManagement.Get_Leader_And_Followers_For_Shard shard_name=${shard_name} shard_type=${shard_type} member_index_list=${all_indices}
${date_start} = DateTime.Get_Current_Date
${date_end} = DateTime.Add_Time_To_Date ${date_start} ${producing_transactions_time}
ClusterManagement.Isolate_Member_From_List_Or_All ${leader}
- BuiltIn.Wait_Until_Keyword_Succeeds 10s 2s ClusterManagement.Verify_Shard_Leader_Elected ${shard_name} ${shard_type} ${True}
+ ${li_isolated} BuiltIn.Set_Variable ${True}
+ BuiltIn.Wait_Until_Keyword_Succeeds 45s 2s ClusterManagement.Verify_Shard_Leader_Elected ${shard_name} ${shard_type} ${True}
... ${leader} member_index_list=${follower_list}
BuiltIn.Sleep ${heal_timeout}
ClusterManagement.Rejoin_Member_From_List_Or_All ${leader}
- BuiltIn.Wait_Until_Keyword_Succeeds 20s 2s ClusterManagement.Get_Leader_And_Followers_For_Shard ${shard_name} ${shard_type}
+ ${li_isolated} BuiltIn.Set_Variable ${False}
+ BuiltIn.Wait_Until_Keyword_Succeeds 60s 10s ShardStability.Shards_Stability_Get_Details ${DEFAULT_SHARD_LIST}
+ BuiltIn.Wait_Until_Keyword_Succeeds 15s 2s ClusterManagement.Get_Leader_And_Followers_For_Shard ${shard_name} ${shard_type}
${time_to_finish} = Get_Seconds_To_Time ${date_end}
BuiltIn.Run_Keyword_If ${heal_timeout} < ${TRANSACTION_TIMEOUT} Leader_Isolation_Heal_Within_Tt
... ELSE Module_Leader_Isolation_Heal_Default ${leader} ${time_to_finish}
+ [Teardown] BuiltIn.Run_Keyword_If ${li_isolated} BuiltIn.Run_Keywords ClusterManagement.Rejoin_Member_From_List_Or_All ${leader}
+ ... AND BuiltIn.Wait_Until_Keyword_Succeeds 60s 10s ShardStability.Shards_Stability_Get_Details ${DEFAULT_SHARD_LIST}
Leader_Isolation_PrefBasedShard_Test_Templ
[Arguments] ${heal_timeout} ${shard_name}=${PREF_BASED_SHARD} ${shard_type}=${SHARD_TYPE}
[Documentation] Implements leader isolation test scenario.
+ ${li_isolated} BuiltIn.Set_Variable ${False}
${producing_transactions_time} = BuiltIn.Set_Variable ${${heal_timeout}+60}
${all_indices} = ClusterManagement.List_All_Indices
${leader} ${follower_list} = ClusterManagement.Get_Leader_And_Followers_For_Shard shard_name=${shard_name}!! shard_type=${shard_type} member_index_list=${all_indices}
${date_start} = DateTime.Get_Current_Date
${date_end} = DateTime.Add_Time_To_Date ${date_start} ${producing_transactions_time}
ClusterManagement.Isolate_Member_From_List_Or_All ${leader}
- BuiltIn.Wait_Until_Keyword_Succeeds 10s 2s ClusterManagement.Verify_Shard_Leader_Elected ${shard_name}!! ${shard_type} ${True}
+ ${li_isolated} BuiltIn.Set_Variable ${True}
+ BuiltIn.Wait_Until_Keyword_Succeeds 45s 2s ClusterManagement.Verify_Shard_Leader_Elected ${shard_name}!! ${shard_type} ${True}
... ${leader} member_index_list=${follower_list}
BuiltIn.Sleep ${heal_timeout}
ClusterManagement.Rejoin_Member_From_List_Or_All ${leader}
- BuiltIn.Wait_Until_Keyword_Succeeds 20s 2s ClusterManagement.Get_Leader_And_Followers_For_Shard ${shard_name}!! ${shard_type}
+ ${li_isolated} BuiltIn.Set_Variable ${False}
+ BuiltIn.Wait_Until_Keyword_Succeeds 60s 10s ShardStability.Shards_Stability_Get_Details ${DEFAULT_SHARD_LIST}
+ BuiltIn.Wait_Until_Keyword_Succeeds 15s 2s ClusterManagement.Get_Leader_And_Followers_For_Shard ${shard_name}!! ${shard_type}
${time_to_finish} = Get_Seconds_To_Time ${date_end}
BuiltIn.Run_Keyword_If ${heal_timeout} < ${TRANSACTION_TIMEOUT} Leader_Isolation_Heal_Within_Tt
... ELSE Prefix_Leader_Isolation_Heal_Default ${leader} ${time_to_finish}
+ [Teardown] BuiltIn.Run_Keyword_If ${li_isolated} BuiltIn.Run_Keywords ClusterManagement.Rejoin_Member_From_List_Or_All ${leader}
+ ... AND BuiltIn.Wait_Until_Keyword_Succeeds 60s 10s ShardStability.Shards_Stability_Get_Details ${DEFAULT_SHARD_LIST}
Leader_Isolation_Heal_Within_Tt
[Documentation] The leader isolation test case end if the heal happens within transaction timeout. All write transaction
WaitForFailure.Confirm_Keyword_Fails_Within_Timeout 3s 1s Verify_Client_Aborted ${True}
[Teardown] BuiltIn.Run Keywords ClusterManagement.Rejoin_Member_From_List_Or_All ${client_node_dst}
... AND MdsalLowlevelPy.Wait_For_Transactions
+ ... AND BuiltIn.Wait_Until_Keyword_Succeeds 60s 10s ShardStability.Shards_Stability_Get_Details ${DEFAULT_SHARD_LIST}
Client_Isolation_PrefBasedShard_Test_Templ
[Arguments] ${listener_node_role} ${trans_chain_flag} ${shard_name}=${PREF_BASED_SHARD} ${shard_type}=${SHARD_TYPE}
WaitForFailure.Confirm_Keyword_Fails_Within_Timeout 3s 1s Verify_Client_Aborted ${True}
[Teardown] BuiltIn.Run Keywords ClusterManagement.Rejoin_Member_From_List_Or_All ${client_node_dst}
... AND MdsalLowlevelPy.Wait_For_Transactions
+ ... AND BuiltIn.Wait_Until_Keyword_Succeeds 60s 10s ShardStability.Shards_Stability_Get_Details ${DEFAULT_SHARD_LIST}
Ongoing_Transactions_Not_Failed_Yet
[Documentation] Verify that no write-transaction rpc has finished, meaning they are still running.
Library Collections
Resource ${CURDIR}/../ClusterManagement.robot
Resource ${CURDIR}/../MdsalLowlevel.robot
+Resource ${CURDIR}/../ShardStability.robot
*** Variables ***
${CONSTANT_PREFIX} constant-
[Arguments] ${member_index}
[Documentation] Rejoin a member and update appropriate suite variables.
ClusterManagement.Rejoin_Member_From_List_Or_All ${member_index}
+ BuiltIn.Wait_Until_Keyword_Succeeds 60s 10s ShardStability.Shards_Stability_Get_Details ${DEFAULT_SHARD_LIST}
DrbCommons__Upadte_Active_Nodes_List activate_idx=${member_index}
BuiltIn.Return_From_Keyword_If ${member_index} not in ${registered_indices}
DrbCommons__Add_Possible_Constant ${member_index}
Suite Teardown SSHLibrary.Close_All_Connections
Library SSHLibrary
Resource ${CURDIR}/../../../libraries/ClusterManagement.robot
+Resource ${CURDIR}/../../../libraries/ShardStability.robot
Resource ${CURDIR}/../../../libraries/SetupUtils.robot
*** Variables ***
${DATASTORE_CFG} /${WORKSPACE}/${BUNDLEFOLDER}/etc/org.opendaylight.controller.cluster.datastore.cfg
-${SHARD_NAME} default
-${SHARD_TYPE} config
*** Test Cases ***
Kill_All_Members
Start_All_And_Sync
[Documentation] Start each member and wait for sync.
ClusterManagement.Start_Members_From_List_Or_All
- BuiltIn.Wait_Until_Keyword_Succeeds 30s 5s ClusterManagement.Get_Leader_And_Followers_For_Shard shard_name=${SHARD_NAME} shard_type=${SHARD_TYPE}
+ BuiltIn.Wait_Until_Keyword_Succeeds 300s 10s ShardStability.Shards_Stability_Get_Details ${DEFAULT_SHARD_LIST}
ClusterManagement.Run_Bash_Command_On_List_Or_All ps -ef | grep java
Suite Teardown SSHLibrary.Close_All_Connections
Library SSHLibrary
Resource ${CURDIR}/../../../libraries/ClusterManagement.robot
+Resource ${CURDIR}/../../../libraries/ShardStability.robot
Resource ${CURDIR}/../../../libraries/SetupUtils.robot
*** Variables ***
${DATASTORE_CFG} /${WORKSPACE}/${BUNDLEFOLDER}/etc/org.opendaylight.controller.cluster.datastore.cfg
-${SHARD_NAME} default
-${SHARD_TYPE} config
*** Test Cases ***
Kill_All_Members
Start_All_And_Sync
[Documentation] Start each member and wait for sync.
ClusterManagement.Start_Members_From_List_Or_All
- BuiltIn.Wait_Until_Keyword_Succeeds 30s 5s ClusterManagement.Get_Leader_And_Followers_For_Shard shard_name=${SHARD_NAME} shard_type=${SHARD_TYPE}
+ BuiltIn.Wait_Until_Keyword_Succeeds 300s 10s ShardStability.Shards_Stability_Get_Details ${DEFAULT_SHARD_LIST}
ClusterManagement.Run_Bash_Command_On_List_Or_All ps -ef | grep java
Invoke_Rpc_On_Isolated_Node
[Documentation] Invoke rpc on isolated node. Because rpc is registered on this node, local constant
... is expected.
+ BuiltIn.Pass_Execution Invoking rpc on isolated node has several problems, see bugs 8207, 8214.
BuiltIn.Wait_Until_Keyword_Succeeds 3x 2s Verify_Local_Rpc_Invoked ${isolated_idx}
Invoke_Rpc_On_Remaining_Nodes
... cluster nodes, only this value is expected.
${index_list} = ClusterManagement.List_Indices_Minus_Member ${isolated_idx} ${all_indices}
: FOR ${index} IN @{index_list}
- \ ${constant} = Verify_Any_Remote_Rpc_Invoked ${index}
+ \ ${constant} = BuiltIn.Wait_Until_Keyword_Succeeds 45s 5s Verify_Any_Remote_Rpc_Invoked ${index}
\ BuiltIn.Should_Not_Be_Equal_As_Strings ${CONSTANT_PREFIX}${isolated_idx} ${constant}
Rejoin_Isolated_Member
Invoke_Rpc_On_Isolated_Node
[Documentation] Invoke rpc on isolated node. Because rpc is registered on this node, local constant
... is expected.
+ BuiltIn.Pass_Execution AAA has a problem authenticating the http request, as the node is out of the cluster too (see bug 8214); skipping for now.
BuiltIn.Wait_Until_Keyword_Succeeds 3x 2s DrbCommons.Verify_Constant_On_Registered_Node ${isolated_idx}
Invoke_Rpc_On_Remaining_Nodes
[Documentation] Invoke rpc on non-isolated nodes.
- DrbCommons.Verify_Constant_On_Active_Nodes
+ BuiltIn.Wait_Until_Keyword_Succeeds 45s 5s DrbCommons.Verify_Constant_On_Active_Nodes
Rejoin_Isolated_Member
[Documentation] Rejoin isolated node
... value is expected.
WaitForFailure.Verify_Keyword_Does_Not_Fail_Within_Timeout 20s 3s DrbCommons.Verify_Constant_On_Active_Nodes
-Isolate_Member_Without_Registered_Rpc
- [Documentation] Isolate one node with unregistered rpc.
- DrbCommons.Isolate_Node ${TESTED_MEMBER_WITHOUT_RPC_IDX}
-
-Verify_Rpc_Fails_On_Isolated_Member_Without_Rpc
- [Documentation] Rpc should fail as it is requested on isolated node without rpc instance.
- BuiltIn.Wait_Until_Keyword_Succeeds 15s 2s MdsalLowlevel.Get_Constant ${TESTED_MEMBER_WITHOUT_RPC_IDX} explicit_status_codes=${NON_WORKING_RPC_STATUS_CODE}
-
-Rejoin_Isolated_Member_Without_Registered_Rpc
- [Documentation] Rejoin isolated node.
- DrbCommons.Rejoin_Node ${TESTED_MEMBER_WITHOUT_RPC_IDX}
-
-Verify_Rpc_Again_Passes_On_Member_Without_Rpc
- [Documentation] Verify rpc works after the node rejoin.
- BuiltIn.Wait_Until_Keyword_Succeeds 10x 3s DrbCommons.Verify_Constant_On_Unregistered_Node ${TESTED_MEMBER_WITHOUT_RPC_IDX}
-
Unregister_Rpc_On_Each_Node
[Documentation] Unregister rpc on both nodes.
DrbCommons.Unregister_Rpc_On_Nodes ${INSTALLED_RPC_MEMEBER_IDX_LIST}
Resource ${CURDIR}/../../../libraries/ClusterManagement.robot
Resource ${CURDIR}/../../../libraries/KarafKeywords.robot
Resource ${CURDIR}/../../../libraries/SetupUtils.robot
+Resource ${CURDIR}/../../../libraries/ShardStability.robot
Resource ${CURDIR}/../../../libraries/TemplatedRequests.robot
Resource ${CURDIR}/../../../variables/Variables.robot
Resource ${CURDIR}/../../../libraries/WaitForFailure.robot
Verify_New_Basic_Rpc_Test_Owner_Elected
[Documentation] Verify new owner of the service is elected.
${idx}= Collections.Get_From_List ${old_brt_successors} 0
- BuiltIn.Wait_Until_Keyword_Succeeds 5x 2s Verify_Owner_Elected ${True} ${old_brt_owner} ${idx}
+ BuiltIn.Wait_Until_Keyword_Succeeds 60s 10s ShardStability.Shards_Stability_Get_Details ${DEFAULT_SHARD_LIST} member_index_list=${old_brt_successors}
+ BuiltIn.Wait_Until_Keyword_Succeeds 10s 2s Verify_Owner_Elected ${True} ${old_brt_owner} ${idx}
Get_Present_Brt_Owner_And_Successors ${idx} store=${True}
Rpc_On_Isolated_Node
[Documentation] Run rpc on isolated cluster node.
${session} = Resolve_Http_Session_For_Member member_index=${old_brt_owner}
BuiltIn.Run_Keyword_And_Ignore_Error Get_And_Log_EOS_Output_To_Karaf_Log ${session}
+ BuiltIn.Pass_Execution Rpc on isolated node may work for some time (bug 8207), then will fail (bug 8214).
${resp} = RequestsLibrary.Post Request ${session} ${RPC_URL} data=${EMPTY}
BuiltIn.Should_Be_Equal_As_Numbers ${resp.status_code} ${RPC_STATUS_ISOLATED}
[Documentation] Rejoin isolated node
[Tags] @{NO_TAGS}
ClusterManagement.Rejoin_Member_From_List_Or_All ${old_brt_owner}
+ BuiltIn.Wait_Until_Keyword_Succeeds 60s 10s ShardStability.Shards_Stability_Get_Details ${DEFAULT_SHARD_LIST}
Verify_New_Owner_Remained_After_Rejoin
[Documentation] Verify no owner change happened after rejoin.
Default Tags critical
Resource ${CURDIR}/../../../libraries/ClusterManagement.robot
Resource ${CURDIR}/../../../libraries/KarafKeywords.robot
+Resource ${CURDIR}/../../../libraries/ShardStability.robot
Resource ${CURDIR}/../../../libraries/SetupUtils.robot
Resource ${CURDIR}/../../../libraries/TemplatedRequests.robot
Resource ${CURDIR}/../../../libraries/WaitForFailure.robot
Verify_New_Basic_Rpc_Test_Owner_Elected
[Documentation] Verify new owner of the service is elected.
${idx}= Collections.Get_From_List ${old_brt_successors} 0
- BuiltIn.Wait_Until_Keyword_Succeeds 5x 2s Verify_Owner_Elected ${True} ${old_brt_owner} ${idx}
+ BuiltIn.Wait_Until_Keyword_Succeeds 30s 2s Verify_Owner_Elected ${True} ${old_brt_owner} ${idx}
Get_Present_Brt_Owner_And_Successors ${idx} store=${True}
Rpc_On_Remained_Cluster_Nodes
Verify_New_Owner_Remained_After_Rejoin
[Documentation] Verify no owner change happened after rejoin.
WaitForFailure.Verify_Keyword_Does_Not_Fail_Within_Timeout 15s 2s Verify_Owner_Elected ${False} ${brt_owner} ${brt_owner}
+ BuiltIn.Wait_Until_Keyword_Succeeds 60s 10s ShardStability.Shards_Stability_Get_Details ${DEFAULT_SHARD_LIST}
Rpc_After_Rejoin_On_New_Owner
[Documentation] Run rpc on the new service owner node.