Refactor netconf stress suite 23/60023/9
author    Peter Gubka <pgubka@cisco.com>
Thu, 6 Jul 2017 13:17:35 +0000 (15:17 +0200)
committer    Vratko Polák <vrpolak@cisco.com>
Fri, 7 Jul 2017 14:50:57 +0000 (14:50 +0000)
It implements
https://trello.com/c/SbOodkUt/490-int-nest-repair-netconf-cluster-stress-suite

Change-Id: I5bd1cabded7c0c09d7213dd82688d29ad67f6c07
Signed-off-by: Peter Gubka <pgubka@cisco.com>
csit/suites/netconf/clusteringscale/staggered_install.robot [deleted file]
csit/suites/netconf/clusteringscale/topology_leader_ha.robot
csit/suites/netconf/clusteringscale/topology_owner_ha.robot
csit/testplans/netconf-cluster-stress.txt

diff --git a/csit/suites/netconf/clusteringscale/staggered_install.robot b/csit/suites/netconf/clusteringscale/staggered_install.robot
deleted file mode 100644 (file)
index d2ea097..0000000
+++ /dev/null
@@ -1,131 +0,0 @@
-*** Settings ***
-Documentation     Suite for controlled installation of ${FEATURE_ONCT}
-...
-...               Copyright (c) 2016 Cisco Systems, Inc. and others. All rights reserved.
-...
-...               This program and the accompanying materials are made available under the
-...               terms of the Eclipse Public License v1.0 which accompanies this distribution,
-...               and is available at http://www.eclipse.org/legal/epl-v10.html
-...
-...
-...               This suite requires odl-netconf-ssh feature to be already installed,
-...               otherwise SSH bundle refresh will cause connection to drop and karaf command "fails".
-...
-...               Operation of clustered netconf topology relies on two key services.
-...               The netconf topology manager application, which runs on the member
-...               which owns the "topology-manager" entity (of "netconf-topology" type);
-...               And config datastore shard for network-topology module,
-...               which is controlled by the Leader of the config topology shard.
-...               The Leader is providing the desired state (concerning Netconf connectors),
-...               the Owner consumes the state, performs necessary actions and updates the operational view.
-...               In this suite, the common name for the Owner and the Leader is Manager.
-...
-...               In a typical cluster High Availability testing scenario,
-...               one cluster member is selected, killed (or isolated), and later re-started (re-joined).
-...               For Netconf cluster topology testing, there will be scenarios targeting
-...               the Owner, and other scenarios targeting the Leader.
-...
-...               But Owner and Leader selection are both governed by the same RAFT algorithm,
-...               which relies on message ordering, so there are two typical cases.
-...               Either one member becomes both Owner and Leader,
-...               or the two Managers are located at random.
-...
-...               As the targeted scenarios require the two Managers to reside on different members,
-...               neither of the two cases is beneficial for testing.
-...
-...               There are APIs in place which should allow relocation of Leader,
-...               but there are no system tests for them yet.
-...               TODO: Study those APIs and create the missing system tests.
-...
-...               This suite helps with the Manager placement situation
-...               by performing feature installation at runtime, applying the following strategy:
-...
-...               An N-node cluster is started (without ${FEATURE_ONCT} installed),
-...               and it is verified one node has become the Leader of topology config shard.
-...               As ${FEATURE_ONCT} is installed on the (N-1) follower members
-...               (but not on the Leader yet), it is expected one of the members
-...               becomes Owner of topology-manager entity.
-...               After verifying that, ${FEATURE_ONCT} is installed on the Leader.
-...               If neither Owner nor Leader has moved, the desired placement has been created.
-...
-...               More specifically, this suite assumes the cluster has been started,
-...               it has been stabilized, and ${FEATURE_ONCT} is not installed anywhere.
-...               After successful run of this suite, the feature is installed on each member,
-...               and the Owner is verified to be placed on different member than the Leader.
-...
-...               Note that stress tests may cause Akka delays, which may move the Managers around.
-Suite Setup       Setup_Everything
-Suite Teardown    Teardown_Everything
-Test Setup        SetupUtils.Setup_Test_With_Logging_And_Without_Fast_Failing
-Test Teardown     SetupUtils.Teardown_Test_Show_Bugs_If_Test_Failed
-Default Tags      clustering    netconf    critical
-Resource          ${CURDIR}/../../../libraries/CarPeople.robot
-Resource          ${CURDIR}/../../../libraries/ClusterManagement.robot
-Resource          ${CURDIR}/../../../libraries/SetupUtils.robot
-Resource          ${CURDIR}/../../../libraries/WaitForFailure.robot
-
-*** Variables ***
-${FEATURE_ONCT}    odl-netconf-clustered-topology    # the feature name is mentioned multiple times, this is to prevent typos
-${OWNER_ELECTION_TIMEOUT}    180s    # very large value to allow for -all- jobs with many feature installations taking up time
-
-*** Test Cases ***
-Locate_Leader
-    [Documentation]    Set suite variables based on where the Leader is.
-    ...    As this test may get executed just after cluster restart, WUKS is used to give ODL a chance to elect Leaders.
-    BuiltIn.Comment    FIXME: Migrate Set_Variables_For_Shard to ClusterManagement.robot
-    BuiltIn.Wait_Until_Keyword_Succeeds    3m    15s    CarPeople.Set_Variables_For_Shard    shard_name=topology    shard_type=config
-
-Install_Feature_On_Followers
-    [Documentation]    Perform feature installation on follower members, one by one.
-    ...    As the first connection attempt may fail (coinciding with the ssh bundle refresh), WUKS is used.
-    # Make sure this works, alternative is to perform the installation in parallel.
-    BuiltIn.Wait_Until_Keyword_Succeeds    3x    1s    ClusterManagement.Install_Feature_On_List_Or_All    feature_name=${FEATURE_ONCT}    member_index_list=${topology_follower_indices}    timeout=60s
-
-Locate_Owner
-    [Documentation]    Wait for Owner to appear, store its index to suite variable.
-    BuiltIn.Wait_Until_Keyword_Succeeds    ${OWNER_ELECTION_TIMEOUT}    3s    Single_Locate_Owner_Attempt    member_index_list=${topology_follower_indices}
-
-Install_Feature_On_Leader
-    [Documentation]    Perform feature installation on the Leader member.
-    ...    This seems to be failing, so use TRACE log.
-    ClusterManagement.Install_Feature_On_Member    feature_name=${FEATURE_ONCT}    member_index=${topology_leader_index}    timeout=60s
-
-Verify_Managers_Are_Stationary
-    [Documentation]    Keep checking that Managers do not move for a while.
-    WaitForFailure.Verify_Keyword_Does_Not_Fail_Within_Timeout    ${OWNER_ELECTION_TIMEOUT}    1s    Check_Manager_Positions
-
-*** Keywords ***
-Setup_Everything
-    [Documentation]    Initialize libraries and set suite variables.
-    SetupUtils.Setup_Utils_For_Setup_And_Teardown
-    ClusterManagement.ClusterManagement_Setup
-
-Teardown_Everything
-    [Documentation]    Teardown the test infrastructure, perform cleanup and release all resources.
-    RequestsLibrary.Delete_All_Sessions
-
-Single_Locate_Owner_Attempt
-    [Arguments]    ${member_index_list}=${EMPTY}
-    [Documentation]    Performs actions on given (or all) members, one by one:
-    ...    For the first member listed: Get the actual owner, check candidates, store owner to suite variable.
-    ...    (If the list has fewer than one item, this Keyword will fail.)
-    ...    For other nodes: Get actual owner, check candidates, compare to the first listed member results.
-    ...    TODO: Move to an appropriate Resource.
-    BuiltIn.Comment    FIXME: Work with sorted candidate list instead of candidate list length.
-    ${index_list} =    ClusterManagement.List_Indices_Or_All    ${member_index_list}
-    ${require_candidate_list} =    BuiltIn.Create_List    @{index_list}
-    ${first_index_listed} =    Collections.Remove_From_List    ${index_list}    ${0}
-    # Now ${index_list} contains only the rest of indices.
-    ${netconf_manager_owner_index}    ${candidates} =    ClusterManagement.Get_Owner_And_Candidates_For_Type_And_Id    type=topology-netconf    id=/general-entity:entity[general-entity:name='topology-manager']    member_index=${first_index_listed}    require_candidate_list=${require_candidate_list}
-    BuiltIn.Set_Suite_Variable    \${netconf_manager_owner_index}
-    : FOR    ${index}    IN    @{index_list}
-    \    ${new_owner}    ${new_candidates} =    ClusterManagement.Get_Owner_And_Candidates_For_Type_And_Id    type=topology-netconf    id=/general-entity:entity[general-entity:name='topology-manager']    member_index=${index}
-    \    ...    require_candidate_list=${require_candidate_list}
-    \    BuiltIn.Should_Be_Equal    ${new_owner}    ${netconf_manager_owner_index}    Member-${index} owner ${new_owner} is not ${netconf_manager_owner_index}
-
-Check_Manager_Positions
-    [Documentation]    For each Manager, locate its current position and check it is the one stored in suite variable.
-    ${new_leader}    ${followers} =    ClusterManagement.Get_Leader_And_Followers_For_Shard    shard_name=topology    shard_type=config
-    BuiltIn.Should_Be_Equal    ${topology_leader_index}    ${new_leader}
-    ${new_owner}    ${candidates} =    ClusterManagement.Get_Owner_And_Candidates_For_Type_And_Id    type=topology-netconf    id=/general-entity:entity[general-entity:name='topology-manager']    member_index=${topology_first_follower_index}
-    BuiltIn.Should_Be_Equal    ${netconf_manager_owner_index}    ${new_owner}
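The placement strategy the deleted suite documented (install on the N-1 followers first, wait for the Owner to be elected among them, only then install on the Leader) can be sketched as a minimal, self-contained Python simulation. The function and the "lowest index wins" election below are illustrative stand-ins, not the suite's real keywords or ODL's actual election behavior:

```python
FEATURE = "odl-netconf-clustered-topology"

def staggered_install(member_indices, leader_index):
    """Return (owner, leader) after a followers-first install of FEATURE.

    Because only followers run the feature when the Owner is elected,
    the Owner and the shard Leader end up on different members.
    """
    installed = set()
    followers = [m for m in member_indices if m != leader_index]
    for member in followers:        # step 1: install on the N-1 followers
        installed.add(member)
    # step 2: the Owner is elected among members already running the feature
    owner = min(installed)          # simulated election: lowest index wins
    installed.add(leader_index)     # step 3: only now install on the Leader
    # desired placement: the two Managers are not co-located
    assert owner != leader_index, "Managers must not be co-located"
    return owner, leader_index

owner, leader = staggered_install([1, 2, 3], leader_index=2)
```

With a 3-node cluster whose topology config shard Leader is member 2, the Owner lands on a follower, which is exactly the precondition the HA suites below assume.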
index c1eae0456eaedced6b3539ffdfe5fd5b81281b12..3200e1492adac4d8a247bd426698a2da38fdbde5 100644 (file)
@@ -1,5 +1,5 @@
 *** Settings ***
-Documentation     Suite for High Availability testing config topology shard Leader under stress.
+Documentation     Suite for High Availability testing config topology shard leader under stress.
 ...
 ...               Copyright (c) 2016 Cisco Systems, Inc. and others. All rights reserved.
 ...
@@ -9,8 +9,8 @@ Documentation     Suite for High Availability testing config topology shard Lead
 ...
 ...
 ...               This is close analogue of topology_owner_ha.robot, see Documentation there.
-...               The difference is that here the requests are sent towards Owner,
-...               and the Leader node is rebooted.
+...               The difference is that here the requests are sent towards entity-ownership shard leader,
+...               and the topology shard leader node is rebooted.
 ...
 ...               No real clustering Bugs are expected to be discovered by this suite,
 ...               except maybe some Restconf ones.
@@ -23,10 +23,11 @@ Default Tags      @{TAGS_CRITICAL}
 Library           OperatingSystem
 Library           SSHLibrary    timeout=10s
 Library           String    # for Get_Regexp_Matches
+Resource          ${CURDIR}/../../../libraries/ClusterAdmin.robot
 Resource          ${CURDIR}/../../../libraries/ClusterManagement.robot
 Resource          ${CURDIR}/../../../libraries/KarafKeywords.robot
 Resource          ${CURDIR}/../../../libraries/NetconfKeywords.robot
-Resource          ${CURDIR}/../../../libraries/RemeoteBash.robot
+Resource          ${CURDIR}/../../../libraries/RemoteBash.robot
 Resource          ${CURDIR}/../../../libraries/SetupUtils.robot
 Resource          ${CURDIR}/../../../libraries/SSHKeywords.robot
 Resource          ${CURDIR}/../../../libraries/TemplatedRequests.robot
@@ -44,7 +45,7 @@ ${DEVICE_SET_SIZE}    30
 
 *** Test Cases ***
 Locate_Managers
-    [Documentation]    Detect location of Leader and Owner and store related data into suite variables.
+    [Documentation]    Detect location of topology(config) and entity-ownership(operational) leaders and store related data into suite variables.
     ...    This cannot be part of Suite Setup, as Utils.Get_Index_From_List_Of_Dictionaries calls BuiltIn.Set_Test_Variable.
     ...    WUKS are used, as location failures are probably due to booting process, not bugs.
     ${topology_config_leader_index}    ${candidates} =    BuiltIn.Wait_Until_Keyword_Succeeds    3x    2s    ClusterManagement.Get_Leader_And_Followers_For_Shard    shard_name=topology
@@ -54,13 +55,12 @@ Locate_Managers
     BuiltIn.Set_Suite_Variable    \${topology_config_leader_ip}
     ${topology_config_leader_http_session} =    Resolve_Http_Session_For_Member    ${topology_config_leader_index}
     BuiltIn.Set_Suite_Variable    \${topology_config_leader_http_session}
-    ${netconf_manager_owner_index}    ${candidates} =    BuiltIn.Wait_Until_Keyword_Succeeds    3x    2s    ClusterManagement.Get_Owner_And_Candidates_For_Type_And_Id    type=topology-netconf
-    ...    id=/general-entity:entity[general-entity:name='topology-manager']    member_index=1
-    BuiltIn.Set_Suite_Variable    \${netconf_manager_owner_index}
-    ${netconf_manager_owner_ip} =    ClusterManagement.Resolve_Ip_Address_For_Member    ${netconf_manager_owner_index}
-    BuiltIn.Set_Suite_Variable    \${netconf_manager_owner_ip}
-    ${netconf_manager_owner_http_session} =    Resolve_Http_Session_For_Member    ${netconf_manager_owner_index}
-    BuiltIn.Set_Suite_Variable    \${netconf_manager_owner_http_session}
+    ${entity_ownership_leader_index}    Change_Entity_Ownership_Leader_If_Needed    ${topology_config_leader_index}
+    BuiltIn.Set_Suite_Variable    \${entity_ownership_leader_index}
+    ${entity_ownership_leader_ip} =    ClusterManagement.Resolve_Ip_Address_For_Member    ${entity_ownership_leader_index}
+    BuiltIn.Set_Suite_Variable    \${entity_ownership_leader_ip}
+    ${entity_ownership_leader_http_session} =    Resolve_Http_Session_For_Member    ${entity_ownership_leader_index}
+    BuiltIn.Set_Suite_Variable    \${entity_ownership_leader_http_session}
 
 Start_Testtool
     [Documentation]    Deploy and start test tool on its separate SSH session.
@@ -75,7 +75,7 @@ Start_Configurer
     ${log_filename} =    Utils.Get_Log_File_Name    configurer
     BuiltIn.Set_Suite_Variable    \${log_filename}
     # TODO: Should things like restconf port/user/password be set from Variables?
-    ${command} =    BuiltIn.Set_Variable    python configurer.py --odladdress ${netconf_manager_owner_ip} --deviceaddress ${TOOLS_SYSTEM_IP} --devices ${DEVICE_SET_SIZE} --disconndelay ${CONFIGURED_DEVICES_LIMIT} --basename ${DEVICE_BASE_NAME} --connsleep ${CONNECTION_SLEEP} &> "${log_filename}"
+    ${command} =    BuiltIn.Set_Variable    python configurer.py --odladdress ${entity_ownership_leader_ip} --deviceaddress ${TOOLS_SYSTEM_IP} --devices ${DEVICE_SET_SIZE} --disconndelay ${CONFIGURED_DEVICES_LIMIT} --basename ${DEVICE_BASE_NAME} --connsleep ${CONNECTION_SLEEP} &> "${log_filename}"
     SSHLibrary.Write    ${command}
     ${status}    ${text} =    BuiltIn.Run_Keyword_And_Ignore_Error    SSHLibrary.Read_Until_Prompt
     BuiltIn.Log    ${text}
@@ -88,7 +88,7 @@ Wait_For_Config_Items
     BuiltIn.Wait_Until_Keyword_Succeeds    ${timeout}    1s    Check_Config_Items_Lower_Bound
 
 Reboot_Topology_Leader
-    [Documentation]    Kill and restart member where topology shard Leader was, including removal of persisted data.
+    [Documentation]    Kill and restart member where topology shard leader was, including removal of persisted data.
     ...    After cluster sync, sleep additional time to ensure manager processes requests with the rebooted member fully rejoined.
     [Tags]    @{TAGS_NONCRITICAL}    # To avoid long WUKS list expanded in log.html
     ClusterManagement.Kill_Single_Member    ${topology_config_leader_index}
@@ -100,7 +100,7 @@ Reboot_Topology_Leader
 
 Stop_Configurer
     [Documentation]    Write ctrl+c, download the log, read its contents and match expected patterns.
-    RemeoteBash.Write_Bare_Ctrl_C
+    RemoteBash.Write_Bare_Ctrl_C
     ${output} =    SSHLibrary.Read_Until_Prompt
     BuiltIn.Log    ${output}
     SSHLibrary.Get_File    ${log_filename}
@@ -147,12 +147,12 @@ Count_Substring_Occurence
 
 Get_Config_Device_Count
     [Documentation]    Count number of items in config netconf topology matching ${DEVICE_BASE_NAME}
-    ${item_data} =    TemplatedRequests.Get_As_Json_From_Uri    ${CONFIG_API}/network-topology:network-topology/topology/topology-netconf    session=${netconf_manager_owner_http_session}
+    ${item_data} =    TemplatedRequests.Get_As_Json_From_Uri    ${CONFIG_API}/network-topology:network-topology/topology/topology-netconf    session=${entity_ownership_leader_http_session}
     BuiltIn.Run_Keyword_And_Return    Count_Substring_Occurence    substring=${DEVICE_BASE_NAME}    main_string=${item_data}
 
 Get_Operational_Device_Count
     [Documentation]    Count number of items in operational netconf topology matching ${DEVICE_BASE_NAME}
-    ${item_data} =    TemplatedRequests.Get_As_Json_From_Uri    ${OPERATIONAL_API}/network-topology:network-topology/topology/topology-netconf    session=${netconf_manager_owner_http_session}
+    ${item_data} =    TemplatedRequests.Get_As_Json_From_Uri    ${OPERATIONAL_API}/network-topology:network-topology/topology/topology-netconf    session=${entity_ownership_leader_http_session}
     BuiltIn.Run_Keyword_And_Return    Count_Substring_Occurence    substring=${DEVICE_BASE_NAME}    main_string=${item_data}
 
 Check_Config_Items_Lower_Bound
@@ -169,3 +169,15 @@ Get_Typical_Time
     [Arguments]    ${coefficient}=1.0
     [Documentation]    Return number of seconds typical for given scale variables.
     BuiltIn.Run_Keyword_And_Return    BuiltIn.Evaluate    ${coefficient} * ${CONNECTION_SLEEP} * ${CONFIGURED_DEVICES_LIMIT}
+
+Change_Entity_Ownership_Leader_If_Needed
+    [Arguments]    ${topology_config_leader_idx}
+    [Documentation]    Move entity-ownership (operational) shard leader if it is on the same node as topology (config) shard leader.
+    ${entity_ownership_leader_index_old}    ${candidates} =    BuiltIn.Wait_Until_Keyword_Succeeds    3x    2s    ClusterManagement.Get_Leader_And_Followers_For_Shard    shard_name=entity-ownership
+    ...    shard_type=operational
+    BuiltIn.Return_From_Keyword_If    ${topology_config_leader_idx} != ${entity_ownership_leader_index_old}    ${entity_ownership_leader_index_old}
+    ${idx}=    Collections.Get_From_List    ${candidates}    0
+    ClusterAdmin.Make_Leader_Local    ${idx}    entity-ownership    operational
+    ${entity_ownership_leader_index}    ${candidates} =    BuiltIn.Wait_Until_Keyword_Succeeds    60s    3s    ClusterManagement.Verify_Shard_Leader_Elected    entity-ownership
+    ...    operational    ${True}    ${entity_ownership_leader_index_old}    verify_restconf=False
+    BuiltIn.Return_From_Keyword    ${entity_ownership_leader_index}
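The decision logic of the new Change_Entity_Ownership_Leader_If_Needed keyword can be sketched in a few lines of Python. The function name and the plain-integer shard state are illustrative; the real keyword relocates the leader via ClusterAdmin.Make_Leader_Local and then waits with ClusterManagement.Verify_Shard_Leader_Elected:

```python
def relocate_if_colocated(topology_leader, eos_leader, eos_candidates):
    """Return the entity-ownership (operational) shard leader index to use."""
    if topology_leader != eos_leader:
        # leaders already sit on different members, nothing to move
        return eos_leader
    # co-located: promote the first candidate follower to shard leader
    # (real code: ClusterAdmin.Make_Leader_Local, then a WUKS around
    # ClusterManagement.Verify_Shard_Leader_Elected with the old index excluded)
    return eos_candidates[0]

new_leader = relocate_if_colocated(topology_leader=1, eos_leader=1,
                                   eos_candidates=[2, 3])
```

When the two leaders coincide on member 1, the sketch returns member 2; otherwise it leaves the entity-ownership leader where it is, matching the keyword's early Return_From_Keyword_If.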
index 63a26a03b8ff58f1fe6e255260896e74a69d2ff4..ddde3a9dbcc744d2dce9a5c269fbdfeed3c085e7 100644 (file)
@@ -13,19 +13,20 @@ Documentation     Suite for High Availability testing netconf topology owner und
 ...
 ...               This suite uses a Python utility to continuously configure/deconfigure
 ...               device connections against devices simulated by testtool.
-...               The utility sends requests to the member which is Leader for topology config shard.
+...               The utility sends requests to the member which is leader for topology config shard.
 ...
 ...               To avoid excessive resource consumption, the utility deconfigures old devices.
 ...               In a stationary state, number of config items oscillates between
 ...               ${CONFIGURED_DEVICES_LIMIT} and 1 + ${CONFIGURED_DEVICES_LIMIT}.
 ...
 ...               The only tested HA event so far is reboot of the member
-...               which is Owner of netconf topology-manager entity.
-...               This suite assumes the Owner and the Leader are not co-located.
+...               which is the leader of entity-ownership operational shard.
+...               This suite assumes the entity-ownership operational shard leader and
+...               topology config shard leader are not co-located.
 ...
 ...               Number of devices is configurable, wait times are computed from that,
 ...               as it takes some time to initialize connections.
-...               Ideally, the utility should go through half of devices during Owner downtime.
+...               Ideally, the utility should go through half of devices during entity-ownership leader downtime.
 ...
 ...               If there is a period when netconf manager ignores deletions in config datastore,
 ...               the devices created previously could "leak", meaning the number of
@@ -46,10 +47,11 @@ Default Tags      @{TAGS_CRITICAL}
 Library           OperatingSystem
 Library           SSHLibrary    timeout=10s
 Library           String    # for Get_Regexp_Matches
+Resource          ${CURDIR}/../../../libraries/ClusterAdmin.robot
 Resource          ${CURDIR}/../../../libraries/ClusterManagement.robot
 Resource          ${CURDIR}/../../../libraries/KarafKeywords.robot
 Resource          ${CURDIR}/../../../libraries/NetconfKeywords.robot
-Resource          ${CURDIR}/../../../libraries/RemeoteBash.robot
+Resource          ${CURDIR}/../../../libraries/RemoteBash.robot
 Resource          ${CURDIR}/../../../libraries/SetupUtils.robot
 Resource          ${CURDIR}/../../../libraries/SSHKeywords.robot
 Resource          ${CURDIR}/../../../libraries/TemplatedRequests.robot
@@ -66,8 +68,8 @@ ${DEVICE_SET_SIZE}    30
 @{TAGS_NONCRITICAL}    clustering    netconf
 
 *** Test Cases ***
-Locate_Managers
-    [Documentation]    Detect location of Leader and Owner and store related data into suite variables.
+Setup_Leaders_Location
+    [Documentation]    Detect location of topology(config) and entity-ownership(operational) leaders and store related data into suite variables.
     ...    This cannot be part of Suite Setup, as Utils.Get_Index_From_List_Of_Dictionaries calls BuiltIn.Set_Test_Variable.
     ...    WUKS are used, as location failures are probably due to booting process, not bugs.
     ${topology_config_leader_index}    ${candidates} =    BuiltIn.Wait_Until_Keyword_Succeeds    3x    2s    ClusterManagement.Get_Leader_And_Followers_For_Shard    shard_name=topology
@@ -77,13 +79,12 @@ Locate_Managers
     BuiltIn.Set_Suite_Variable    \${topology_config_leader_ip}
     ${topology_config_leader_http_session} =    Resolve_Http_Session_For_Member    ${topology_config_leader_index}
     BuiltIn.Set_Suite_Variable    \${topology_config_leader_http_session}
-    ${netconf_manager_owner_index}    ${candidates} =    BuiltIn.Wait_Until_Keyword_Succeeds    3x    2s    ClusterManagement.Get_Owner_And_Candidates_For_Type_And_Id    type=topology-netconf
-    ...    id=/general-entity:entity[general-entity:name='topology-manager']    member_index=1
-    BuiltIn.Set_Suite_Variable    \${netconf_manager_owner_index}
-    ${netconf_manager_owner_ip} =    ClusterManagement.Resolve_Ip_Address_For_Member    ${netconf_manager_owner_index}
-    BuiltIn.Set_Suite_Variable    \${netconf_manager_owner_ip}
-    ${netconf_manager_owner_http_session} =    Resolve_Http_Session_For_Member    ${netconf_manager_owner_index}
-    BuiltIn.Set_Suite_Variable    \${netconf_manager_owner_http_session}
+    ${entity_ownership_leader_index}    Change_Entity_Ownership_Leader_If_Needed    ${topology_config_leader_index}
+    BuiltIn.Set_Suite_Variable    \${entity_ownership_leader_index}
+    ${entity_ownership_leader_ip} =    ClusterManagement.Resolve_Ip_Address_For_Member    ${entity_ownership_leader_index}
+    BuiltIn.Set_Suite_Variable    \${entity_ownership_leader_ip}
+    ${entity_ownership_leader_http_session} =    Resolve_Http_Session_For_Member    ${entity_ownership_leader_index}
+    BuiltIn.Set_Suite_Variable    \${entity_ownership_leader_http_session}
 
 Start_Testtool
     [Documentation]    Deploy and start test tool on its separate SSH session.
@@ -110,20 +111,20 @@ Wait_For_Config_Items
     ${timeout} =    Get_Typical_Time
     BuiltIn.Wait_Until_Keyword_Succeeds    ${timeout}    1s    Check_Config_Items_Lower_Bound
 
-Reboot_Manager_Owner
-    [Documentation]    Kill and restart member where netconf topology manager was, including removal of persisted data.
-    ...    After cluster sync, sleep additional time to ensure manager processes requests with the rebooted member fully rejoined.
+Reboot_Entity_Ownership_Leader
+    [Documentation]    Kill and restart member where entity-ownership shard leader was, including removal of persisted data.
+    ...    After cluster sync, sleep additional time to ensure entity-ownership shard processes requests with the rebooted member fully rejoined.
     [Tags]    @{TAGS_NONCRITICAL}    # To avoid long WUKS list expanded in log.html
-    ClusterManagement.Kill_Single_Member    ${netconf_manager_owner_index}
-    ${owner_list} =    BuiltIn.Create_List    ${netconf_manager_owner_index}
-    ClusterManagement.Start_Single_Member    ${netconf_manager_owner_index}
+    ClusterManagement.Kill_Single_Member    ${entity_ownership_leader_index}
+    ${owner_list} =    BuiltIn.Create_List    ${entity_ownership_leader_index}
+    ClusterManagement.Start_Single_Member    ${entity_ownership_leader_index}
     BuiltIn.Comment    FIXME: Replace sleep with WUKS when it becomes clear what to wait for.
     ${sleep_time} =    Get_Typical_Time    coefficient=3.0
     BuiltIn.Sleep    ${sleep_time}
 
 Stop_Configurer
     [Documentation]    Write ctrl+c, download the log, read its contents and match expected patterns.
-    RemeoteBash.Write_Bare_Ctrl_C
+    RemoteBash.Write_Bare_Ctrl_C
     ${output} =    SSHLibrary.Read_Until_Prompt
     BuiltIn.Log    ${output}
     SSHLibrary.Get_File    ${log_filename}
@@ -192,3 +193,16 @@ Get_Typical_Time
     [Arguments]    ${coefficient}=1.0
     [Documentation]    Return number of seconds typical for given scale variables.
     BuiltIn.Run_Keyword_And_Return    BuiltIn.Evaluate    ${coefficient} * ${CONNECTION_SLEEP} * ${CONFIGURED_DEVICES_LIMIT}
+
+Change_Entity_Ownership_Leader_If_Needed
+    [Arguments]    ${topology_config_leader_idx}
+    [Documentation]    Move entity-ownership (operational) shard leader if it is on the same node as topology (config) shard leader.
+    ...    TODO: move keyword to a common resource, e.g. ShardStability
+    ${entity_ownership_leader_index_old}    ${candidates} =    BuiltIn.Wait_Until_Keyword_Succeeds    3x    2s    ClusterManagement.Get_Leader_And_Followers_For_Shard    shard_name=entity-ownership
+    ...    shard_type=operational
+    BuiltIn.Return_From_Keyword_If    ${topology_config_leader_idx} != ${entity_ownership_leader_index_old}    ${entity_ownership_leader_index_old}
+    ${idx}=    Collections.Get_From_List    ${candidates}    0
+    ClusterAdmin.Make_Leader_Local    ${idx}    entity-ownership    operational
+    ${entity_ownership_leader_index}    ${candidates} =    BuiltIn.Wait_Until_Keyword_Succeeds    60s    3s    ClusterManagement.Verify_Shard_Leader_Elected    entity-ownership
+    ...    operational    ${True}    ${entity_ownership_leader_index_old}    verify_restconf=False
+    BuiltIn.Return_From_Keyword    ${entity_ownership_leader_index}
index 189da6da6a139b222f5a79d1df678d906b66fe45..ffa09e26d2d36585e2d3c10b6cdf78c19688400d 100644 (file)
@@ -6,9 +6,6 @@
 
 # Place the suites in run order:
 
-# Install feature in controlled way.
-integration/test/csit/suites/netconf/clusteringscale/staggered_install.robot
-
 # Make sure ODL is ready.
 integration/test/csit/suites/netconf/ready
 
@@ -19,7 +16,6 @@ integration/test/csit/suites/netconf/clusteringscale/topology_owner_ha.robot
 
 # Reset in order to run more suites.
 integration/test/csit/suites/test/cluster_reset.robot
-integration/test/csit/suites/netconf/clusteringscale/staggered_install.robot
 integration/test/csit/suites/netconf/ready
 
 # More suites.