*** Settings ***
Documentation     Suite for High Availability testing of the config topology shard leader under stress.
...
...               Copyright (c) 2016 Cisco Systems, Inc. and others. All rights reserved.
...
...               This program and the accompanying materials are made available under the
...               terms of the Eclipse Public License v1.0 which accompanies this distribution,
...               and is available at http://www.eclipse.org/legal/epl-v10.html
...
...               This is a close analogue of topology_owner_ha.robot, see Documentation there.
...               The difference is that here the requests are sent towards the entity-ownership shard leader,
...               and the topology shard leader node is rebooted.
...
...               No real clustering bugs are expected to be discovered by this suite,
...               except maybe some Restconf ones.
...               But as this suite was easy to create, it may as well be run.
Suite Setup       Setup_Everything
Suite Teardown    Teardown_Everything
Test Setup        SetupUtils.Setup_Test_With_Logging_And_Without_Fast_Failing
Test Teardown     ${DEFAULT_TEARDOWN_KEYWORD}
Default Tags      @{TAGS_CRITICAL}
Library           OperatingSystem
Library           SSHLibrary    timeout=10s
Library           String    # for Get_Regexp_Matches
Resource          ${CURDIR}/../../../libraries/ClusterAdmin.robot
Resource          ${CURDIR}/../../../libraries/ClusterManagement.robot
Resource          ${CURDIR}/../../../libraries/KarafKeywords.robot
Resource          ${CURDIR}/../../../libraries/NetconfKeywords.robot
Resource          ${CURDIR}/../../../libraries/RemoteBash.robot
Resource          ${CURDIR}/../../../libraries/SetupUtils.robot
Resource          ${CURDIR}/../../../libraries/SSHKeywords.robot
Resource          ${CURDIR}/../../../libraries/TemplatedRequests.robot
Resource          ${CURDIR}/../../../libraries/Utils.robot
Variables         ${CURDIR}/../../../variables/Variables.py

*** Variables ***
${CONFIGURED_DEVICES_LIMIT}    20
${CONNECTION_SLEEP}    1.2
${DEFAULT_TEARDOWN_KEYWORD}    SetupUtils.Teardown_Test_Show_Bugs_If_Test_Failed
${DEVICE_BASE_NAME}    netconf-test-device
@{TAGS_CRITICAL}    critical    @{TAGS_NONCRITICAL}
@{TAGS_NONCRITICAL}    clustering    netconf

*** Test Cases ***
    [Documentation]    Detect location of topology (config) and entity-ownership (operational) leaders and store related data into suite variables.
    ...    This cannot be part of Suite Setup, as Utils.Get_Index_From_List_Of_Dictionaries calls BuiltIn.Set_Test_Variable.
    ...    WUKS are used, as location failures are probably due to the booting process, not bugs.
    ${topology_config_leader_index}    ${candidates} =    BuiltIn.Wait_Until_Keyword_Succeeds    3x    2s    ClusterManagement.Get_Leader_And_Followers_For_Shard    shard_name=topology
    BuiltIn.Set_Suite_Variable    \${topology_config_leader_index}
    ${topology_config_leader_ip} =    ClusterManagement.Resolve_Ip_Address_For_Member    ${topology_config_leader_index}
    BuiltIn.Set_Suite_Variable    \${topology_config_leader_ip}
    ${topology_config_leader_http_session} =    Resolve_Http_Session_For_Member    ${topology_config_leader_index}
    BuiltIn.Set_Suite_Variable    \${topology_config_leader_http_session}
    ${entity_ownership_leader_index} =    Change_Entity_Ownership_Leader_If_Needed    ${topology_config_leader_index}
    BuiltIn.Set_Suite_Variable    \${entity_ownership_leader_index}
    ${entity_ownership_leader_ip} =    ClusterManagement.Resolve_Ip_Address_For_Member    ${entity_ownership_leader_index}
    BuiltIn.Set_Suite_Variable    \${entity_ownership_leader_ip}
    ${entity_ownership_leader_http_session} =    Resolve_Http_Session_For_Member    ${entity_ownership_leader_index}
    BuiltIn.Set_Suite_Variable    \${entity_ownership_leader_http_session}

    [Documentation]    Deploy and start test tool on its separate SSH session.
    SSHLibrary.Switch_Connection    ${testtool_connection_index}
    NetconfKeywords.Install_And_Start_Testtool    device-count=${DEVICE_SET_SIZE}    schemas=${CURDIR}/../../../variables/netconf/CRUD/schemas
    # TODO: Introduce NetconfKeywords.Safe_Install_And_Start_Testtool to avoid teardown manipulation.
    [Teardown]    BuiltIn.Run_Keywords    SSHLibrary.Switch_Connection    ${configurer_connection_index}
    ...    AND    ${DEFAULT_TEARDOWN_KEYWORD}

    [Documentation]    Launch Python utility (while copying output to log file) and verify it does not stop by itself.
    ${log_filename} =    Utils.Get_Log_File_Name    configurer
    BuiltIn.Set_Suite_Variable    \${log_filename}
    # TODO: Should things like restconf port/user/password be set from Variables?
    ${command} =    BuiltIn.Set_Variable    python configurer.py --odladdress ${entity_ownership_leader_ip} --deviceaddress ${TOOLS_SYSTEM_IP} --devices ${DEVICE_SET_SIZE} --disconndelay ${CONFIGURED_DEVICES_LIMIT} --basename ${DEVICE_BASE_NAME} --connsleep ${CONNECTION_SLEEP} &> "${log_filename}"
    SSHLibrary.Write    ${command}
    ${status}    ${text} =    BuiltIn.Run_Keyword_And_Ignore_Error    SSHLibrary.Read_Until_Prompt
    BuiltIn.Run_Keyword_If    "${status}" != "FAIL"    BuiltIn.Fail    Prompt happened, see Log.
    # Session is kept active.

    [Documentation]    Make sure the configurer has reached the phase where old devices are being deconfigured, or fail on timeout.
    ${timeout} =    Get_Typical_Time
    BuiltIn.Wait_Until_Keyword_Succeeds    ${timeout}    1s    Check_Config_Items_Lower_Bound

Reboot_Topology_Leader
    [Documentation]    Kill and restart the member where the topology shard leader was, including removal of persisted data.
    ...    After cluster sync, sleep additional time to ensure manager processes requests with the rebooted member fully rejoined.
    [Tags]    @{TAGS_NONCRITICAL}    # To avoid long WUKS list expanded in log.html
    ClusterManagement.Kill_Single_Member    ${topology_config_leader_index}
    ${owner_list} =    BuiltIn.Create_List    ${topology_config_leader_index}
    ClusterManagement.Start_Single_Member    ${topology_config_leader_index}
    BuiltIn.Comment    FIXME: Replace sleep with WUKS when it becomes clear what to wait for.
    ${sleep_time} =    Get_Typical_Time    coefficient=3.0
    BuiltIn.Sleep    ${sleep_time}

    [Documentation]    Write ctrl+c, download the log, read its contents and match expected patterns.
    RemoteBash.Write_Bare_Ctrl_C
    ${output} =    SSHLibrary.Read_Until_Prompt
    BuiltIn.Log    ${output}
    SSHLibrary.Get_File    ${log_filename}
    ${output} =    OperatingSystem.Get_File    ${log_filename}
    ${list_any_matches} =    String.Get_Regexp_Matches    ${output}    delete|put
    ${number_any_matches} =    BuiltIn.Get_Length    ${list_any_matches}
    BuiltIn.Should_Be_Equal    ${2}    ${number_any_matches}    Unexpected status seen: ${output}
    ${list_strict_matches} =    String.Get_Regexp_Matches    ${output}    delete:200|put:201
    ${number_strict_matches} =    BuiltIn.Get_Length    ${list_strict_matches}
    BuiltIn.Should_Be_Equal    ${2}    ${number_strict_matches}    Expected status not seen: ${output}

Check_For_Connector_Leak
    [Documentation]    Check that number of items in operational netconf topology is not higher than expected.
    # FIXME: Are separate keywords necessary?
    Check_Operational_Items_Upper_Bound

*** Keywords ***
Setup_Everything
    [Documentation]    Initialize libraries and set suite variables.
    ClusterManagement.ClusterManagement_Setup
    SetupUtils.Setup_Utils_For_Setup_And_Teardown
    NetconfKeywords.Setup_Netconf_Keywords    create_session_for_templated_requests=False
    ${testtool_connection_index} =    SSHKeywords.Open_Connection_To_Tools_System
    BuiltIn.Set_Suite_Variable    \${testtool_connection_index}
    ${configurer_connection_index} =    SSHKeywords.Open_Connection_To_Tools_System
    BuiltIn.Set_Suite_Variable    \${configurer_connection_index}
    SSHKeywords.Require_Python
    SSHKeywords.Assure_Library_Counter
    SSHLibrary.Put_File    ${CURDIR}/../../../../tools/netconf_tools/configurer.py
    SSHLibrary.Put_File    ${CURDIR}/../../../libraries/AuthStandalone.py

Teardown_Everything
    [Documentation]    Teardown the test infrastructure, perform cleanup and release all resources.
    SSHLibrary.Switch_Connection    ${testtool_connection_index}
    NetconfKeywords.Stop_Testtool
    RequestsLibrary.Delete_All_Sessions

Count_Substring_Occurence
    [Arguments]    ${substring}    ${main_string}
    [Documentation]    Apply the length_of_split method for counting how many times ${substring} occurs within ${main_string}.
    ...    The method is reliable only if triple-double quotes are not present in either argument.
    BuiltIn.Comment    TODO: Migrate this keyword into an appropriate Resource.
    BuiltIn.Run_Keyword_And_Return    BuiltIn.Evaluate    len("""${main_string}""".split("""${substring}""")) - 1

Get_Config_Device_Count
    [Documentation]    Count number of items in config netconf topology matching ${DEVICE_BASE_NAME}
    ${item_data} =    TemplatedRequests.Get_As_Json_From_Uri    ${CONFIG_API}/network-topology:network-topology/topology/topology-netconf    session=${entity_ownership_leader_http_session}
    BuiltIn.Run_Keyword_And_Return    Count_Substring_Occurence    substring=${DEVICE_BASE_NAME}    main_string=${item_data}

Get_Operational_Device_Count
    [Documentation]    Count number of items in operational netconf topology matching ${DEVICE_BASE_NAME}
    ${item_data} =    TemplatedRequests.Get_As_Json_From_Uri    ${OPERATIONAL_API}/network-topology:network-topology/topology/topology-netconf    session=${entity_ownership_leader_http_session}
    BuiltIn.Run_Keyword_And_Return    Count_Substring_Occurence    substring=${DEVICE_BASE_NAME}    main_string=${item_data}

Check_Config_Items_Lower_Bound
    [Documentation]    Count items matching ${DEVICE_BASE_NAME}, fail if fewer than ${CONFIGURED_DEVICES_LIMIT}
    ${device_count} =    Get_Config_Device_Count
    BuiltIn.Run_Keyword_If    ${device_count} < ${CONFIGURED_DEVICES_LIMIT}    BuiltIn.Fail    Found ${device_count} config items, should be at least ${CONFIGURED_DEVICES_LIMIT}

Check_Operational_Items_Upper_Bound
    [Documentation]    Count items matching ${DEVICE_BASE_NAME}, fail if more than 1 + ${CONFIGURED_DEVICES_LIMIT}
    ${device_count} =    Get_Operational_Device_Count
    BuiltIn.Run_Keyword_If    ${device_count} > 1 + ${CONFIGURED_DEVICES_LIMIT}    BuiltIn.Fail    Found ${device_count} operational items, should be at most 1 + ${CONFIGURED_DEVICES_LIMIT}

Get_Typical_Time
    [Arguments]    ${coefficient}=1.0
    [Documentation]    Return number of seconds typical for given scale variables.
    BuiltIn.Run_Keyword_And_Return    BuiltIn.Evaluate    ${coefficient} * ${CONNECTION_SLEEP} * ${CONFIGURED_DEVICES_LIMIT}

Change_Entity_Ownership_Leader_If_Needed
    [Arguments]    ${topology_config_leader_idx}
    [Documentation]    Move entity-ownership (operational) shard leader if it is on the same node as topology (config) shard leader.
    ${entity_ownership_leader_index_old}    ${candidates} =    BuiltIn.Wait_Until_Keyword_Succeeds    3x    2s    ClusterManagement.Get_Leader_And_Followers_For_Shard    shard_name=entity-ownership
    ...    shard_type=operational
    BuiltIn.Return_From_Keyword_If    ${topology_config_leader_idx} != ${entity_ownership_leader_index_old}    ${entity_ownership_leader_index_old}
    ${idx} =    Collections.Get_From_List    ${candidates}    0
    ClusterAdmin.Make_Leader_Local    ${idx}    entity-ownership    operational
    ${entity_ownership_leader_index}    ${candidates} =    BuiltIn.Wait_Until_Keyword_Succeeds    60s    3s    ClusterManagement.Verify_Shard_Leader_Elected    entity-ownership
    ...    operational    ${True}    ${entity_ownership_leader_index_old}    verify_restconf=False
    BuiltIn.Return_From_Keyword    ${entity_ownership_leader_index}