Netconf Scaling test for multithreaded GET requests 30/28730/21
author Jozef Behran <jbehran@cisco.com>
Fri, 23 Oct 2015 12:42:14 +0000 (14:42 +0200)
committer Vratko Polák <vrpolak@cisco.com>
Fri, 4 Dec 2015 16:31:55 +0000 (16:31 +0000)
This test suite first emits a batch of "single device mount"
requests to netconf (via restconf), then executes a wait
loop on each device (to ensure each of them is connected)
and then uses a Python tool to issue GET requests for
config data on them. Each device gets one request for its
config data, but multiple requests are made concurrently
(the count of concurrent requests is configurable via the
WORKER_COUNT Robot variable). Finally, it issues a batch of
"deconfigure single device" requests. A link to a known bug
(4547) is also included, as the "deconfigure" part of the
test in particular appears to be hit by it pretty badly.
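
The concurrency pattern described above can be sketched roughly as follows. This is a simplified illustration, not the code added by this change: `distribute_round_robin`, `drain`, `run_workers` and `fake_get` are hypothetical names, `fake_get` stands in for the real Restconf GET so no network traffic is generated, and the sketch simply joins the worker threads where getter.py uses a collector thread and a timeout watchdog.

```python
import collections
import threading


def distribute_round_robin(items, worker_count):
    """Deal items into worker_count deques, one item at a time."""
    queues = [collections.deque() for _ in range(worker_count)]
    for position, item in enumerate(items):
        queues[position % worker_count].append(item)
    return queues


def drain(queue, fetch, results, lock):
    """Pop items until the deque is empty, recording fetch(item) for each."""
    while True:
        try:
            item = queue.popleft()
        except IndexError:  # nothing more to send
            break
        outcome = fetch(item)
        with lock:
            results[item] = outcome


def run_workers(items, worker_count, fetch):
    """Run fetch over all items using worker_count threads; return the results."""
    results = {}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=drain, args=(queue, fetch, results, lock))
        for queue in distribute_round_robin(items, worker_count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


def fake_get(name):
    """Hypothetical stand-in for a Restconf GET; generates no network traffic."""
    return "<data for %s>" % name


devices = ["netconf-scaling-device-%d" % n for n in range(1, 8)]
results = run_workers(devices, 3, fake_get)
```

With 7 devices and 3 workers, the first worker gets devices 1, 4 and 7, the second gets 2 and 5, and the third gets 3 and 6. Each device is still requested exactly once, but up to 3 requests can be in flight at a time.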

Change-Id: I924d9a72921d29580d54fcf0317aec4137174fb2
Signed-off-by: Jozef Behran <jbehran@cisco.com>
csit/suites/netconf/scale/getmulti.robot [new file with mode: 0644]
csit/testplans/netconf-scale.txt
tools/netconf_tools/getter.py [new file with mode: 0644]

diff --git a/csit/suites/netconf/scale/getmulti.robot b/csit/suites/netconf/scale/getmulti.robot
new file mode 100644
index 0000000..74d4042
--- /dev/null
@@ -0,0 +1,132 @@
+*** Settings ***
+Documentation     netconf-connector scaling test suite (multi-threaded GET requests).
+...
+...               Copyright (c) 2015 Cisco Systems, Inc. and others. All rights reserved.
+...
+...               This program and the accompanying materials are made available under the
+...               terms of the Eclipse Public License v1.0 which accompanies this distribution,
+...               and is available at http://www.eclipse.org/legal/epl-v10.html
+...
+...
+...               Performs scaling tests:
+...               - Send configurations of the devices one by one (via restconf).
+...               - Wait for the devices to become connected.
+...               - Send requests for configuration data using ${WORKER_COUNT} worker threads
+...               (using external Python tool).
+...               - Deconfigure the devices one by one.
+Suite Setup       Setup_Everything
+Suite Teardown    Teardown_Everything
+Library           Collections
+Library           String
+Library           SSHLibrary    timeout=10s
+Resource          ${CURDIR}/../../../libraries/KarafKeywords.robot
+Resource          ${CURDIR}/../../../libraries/NetconfKeywords.robot
+Resource          ${CURDIR}/../../../libraries/SetupUtils.robot
+Resource          ${CURDIR}/../../../libraries/Utils.robot
+Variables         ${CURDIR}/../../../variables/Variables.py
+
+*** Variables ***
+${DEVICE_COUNT}    500
+${WORKER_COUNT}    10
+${device_name_base}    netconf-scaling-device
+${base_port}      17830
+
+*** Test Cases ***
+Configure_Devices_On_Netconf
+    [Documentation]    Make requests to configure the testtool devices.
+    ${timeout}=    BuiltIn.Evaluate    ${DEVICE_COUNT}*10
+    NetconfKeywords.Perform_Operation_On_Each_Device    Configure_Device    timeout=${timeout}
+
+Wait_For_Devices_To_Connect
+    [Documentation]    Wait for the devices to become connected.
+    ${timeout}=    BuiltIn.Evaluate    ${DEVICE_COUNT}*10
+    NetconfKeywords.Perform_Operation_On_Each_Device    Wait_Connected    timeout=${timeout}
+
+Issue_Requests_On_Devices
+    [Documentation]    Spawn the specified count of worker threads to issue a GET request to each of the devices.
+    ${current_ssh_connection}=    SSHLibrary.Get Connection
+    SSHLibrary.Open_Connection    ${TOOLS_SYSTEM_IP}
+    Utils.Flexible_Mininet_Login
+    SSHLibrary.Write    python getter.py --odladdress=${ODL_SYSTEM_IP} --count=${DEVICE_COUNT} --name=${device_name_base} --workers=${WORKER_COUNT}
+    : FOR    ${number}    IN RANGE    1    ${DEVICE_COUNT}+1
+    \    Read_Python_Tool_Operation_Result    ${number}
+    SSHLibrary.Read_Until_Prompt
+    SSHLibrary.Close_Connection
+    Restore Current SSH Connection From Index    ${current_ssh_connection.index}
+
+Deconfigure_Devices
+    [Documentation]    Make requests to deconfigure the testtool devices.
+    ${timeout}=    BuiltIn.Evaluate    ${DEVICE_COUNT}*10
+    NetconfKeywords.Perform_Operation_On_Each_Device    Deconfigure_Device    timeout=${timeout}
+    [Teardown]    Report_Failure_Due_To_Bug    4547
+
+Check_Devices_Are_Deconfigured
+    [Documentation]    Check there are no netconf connectors or other stuff related to the testtool devices.
+    ${timeout}=    BuiltIn.Evaluate    ${DEVICE_COUNT}*10
+    NetconfKeywords.Perform_Operation_On_Each_Device    Check_Device_Deconfigured    timeout=${timeout}
+
+*** Keywords ***
+Setup_Everything
+    [Documentation]    Setup everything needed for the test cases.
+    # Setup resources used by the suite.
+    RequestsLibrary.Create_Session    operational    http://${ODL_SYSTEM_IP}:${RESTCONFPORT}${OPERATIONAL_API}    auth=${AUTH}
+    SSHLibrary.Set_Default_Configuration    prompt=${TOOLS_SYSTEM_PROMPT}
+    SetupUtils.Setup_Utils_For_Setup_And_Teardown
+    NetconfKeywords.Setup_Netconf_Keywords
+    # Connect to the tools machine
+    SSHLibrary.Open_Connection    ${TOOLS_SYSTEM_IP}
+    Utils.Flexible_Mininet_Login
+    # Deploy testtool on it
+    NetconfKeywords.Install_And_Start_Testtool    device-count=${DEVICE_COUNT}
+    SSHLibrary.Put_File    ${CURDIR}/../../../../tools/netconf_tools/getter.py
+    SSHLibrary.Put_File    ${CURDIR}/../../../libraries/AuthStandalone.py
+
+Teardown_Everything
+    [Documentation]    Teardown the test infrastructure, perform cleanup and release all resources.
+    Teardown_Netconf_Via_Restconf
+    RequestsLibrary.Delete_All_Sessions
+    NetconfKeywords.Stop_Testtool
+
+Configure_Device
+    [Arguments]    ${current_name}
+    [Documentation]    Operation for configuring the device.
+    KarafKeywords.Log_Message_To_Controller_Karaf    Configuring device ${current_name} to Netconf
+    NetconfKeywords.Configure_Device_In_Netconf    ${current_name}    device_port=${current_port}
+    KarafKeywords.Log_Message_To_Controller_Karaf    Device ${current_name} configured
+
+Wait_Connected
+    [Arguments]    ${current_name}
+    [Documentation]    Operation for waiting until the device is connected.
+    KarafKeywords.Log_Message_To_Controller_Karaf    Waiting for device ${current_name} to connect
+    NetconfKeywords.Wait_Device_Connected    ${current_name}    period=0.5s    timeout=120s
+    KarafKeywords.Log_Message_To_Controller_Karaf    Device ${current_name} connected
+
+Read_Python_Tool_Operation_Result
+    [Arguments]    ${number}
+    [Documentation]    Read and process a report line emitted from the Python tool that corresponds to the device with the given number.
+    ${test}=    SSHLibrary.Read_Until_Regexp    \\n
+    ${test}=    String.Split_String    ${test}    |
+    ${response}=    Collections.Get_From_List    ${test}    0
+    ${message}=    Collections.Get_From_List    ${test}    1
+    BuiltIn.Run_Keyword_If    '${response}' == 'ERROR'    Fail    Error getting data: ${message}
+    ${start}=    Collections.Get_From_List    ${test}    1
+    ${stop}=    Collections.Get_From_List    ${test}    2
+    ${elapsed}=    Collections.Get_From_List    ${test}    3
+    BuiltIn.Log    DATA REQUEST RESULT: Device=${number} StartTime=${start} StopTime=${stop} ElapsedTime=${elapsed}
+    ${data}=    Collections.Get_From_List    ${test}    4
+    ${expected}=    BuiltIn.Set_Variable    '<data xmlns="${ODL_NETCONF_NAMESPACE}"></data>'
+    BuiltIn.Should_Be_Equal_As_Strings    ${data}    ${expected}
+
+Deconfigure_Device
+    [Arguments]    ${current_name}
+    [Documentation]    Operation for deconfiguring the device.
+    KarafKeywords.Log_Message_To_Controller_Karaf    Deconfiguring device ${current_name}
+    NetconfKeywords.Remove_Device_From_Netconf    ${current_name}
+    KarafKeywords.Log_Message_To_Controller_Karaf    Device ${current_name} deconfigured
+
+Check_Device_Deconfigured
+    [Arguments]    ${current_name}
+    [Documentation]    Operation for making sure the device is really deconfigured.
+    KarafKeywords.Log_Message_To_Controller_Karaf    Waiting for device ${current_name} to disappear
+    NetconfKeywords.Wait_Device_Fully_Removed    ${current_name}    period=0.5s    timeout=120s
+    KarafKeywords.Log_Message_To_Controller_Karaf    Device ${current_name} removed
diff --git a/csit/testplans/netconf-scale.txt b/csit/testplans/netconf-scale.txt
index 7fd410969329e19506fbccd817da7d0f82eb445b..08d21b2e1a0c20e7b34d8aaa1d6025f9e51eaffa 100644
@@ -6,4 +6,5 @@
 
 # Place the suites in run order:
 integration/test/csit/suites/netconf/ready
+integration/test/csit/suites/netconf/scale/getmulti.robot
 integration/test/csit/suites/netconf/scale/getsingle.robot
diff --git a/tools/netconf_tools/getter.py b/tools/netconf_tools/getter.py
new file mode 100644
index 0000000..8289209
--- /dev/null
@@ -0,0 +1,182 @@
+"""Multithreaded utility for rapid Netconf device GET requesting.
+
+This utility sends GET requests to ODL Netconf through Restconf to get
+configuration data from Netconf mounted devices and reports the results
+on standard output. The requests are sent via a configurable number of
+workers. Each worker issues a series of blocking restconf requests. Work
+is distributed in round-robin fashion. The utility waits for the last
+worker to finish, or for the timeout to expire.
+
+For each response one line is printed, containing the status code, the
+start time, the stop time, the run time and the repr() of the content,
+each followed by "|". The caller is expected to check the status and the
+content. If the timeout expires first, an "ERROR|" line is printed.
+
+It is advised to pin the python process to a single CPU for optimal
+performance, as the Global Interpreter Lock prevents true utilization of
+more CPUs (while the overhead of context switching remains).
+"""
+
+# Copyright (c) 2015 Cisco Systems, Inc. and others.  All rights reserved.
+#
+# This program and the accompanying materials are made available under the
+# terms of the Eclipse Public License v1.0 which accompanies this distribution,
+# and is available at http://www.eclipse.org/legal/epl-v10.html
+
+__author__ = "Vratko Polak"
+__copyright__ = "Copyright(c) 2015, Cisco Systems, Inc."
+__license__ = "Eclipse Public License v1.0"
+__email__ = "vrpolak@cisco.com"
+
+
+import argparse
+import collections  # For deque.
+import threading
+import time
+import AuthStandalone
+
+
+def str2bool(text):
+    """Utility converter, based on http://stackoverflow.com/a/19227287"""
+    return text.lower() in ("yes", "true", "y", "t", "1")
+
+
+def parse_arguments():
+    parser = argparse.ArgumentParser()
+
+    # Netconf and Restconf related arguments.
+    parser.add_argument('--odladdress', default='127.0.0.1',
+                        help='IP address of ODL Restconf to be used')
+    parser.add_argument('--restconfport', default='8181',
+                        help='Port of ODL Restconf to be used')
+    parser.add_argument('--user', default='admin',
+                        help='Username for ODL Restconf authentication')
+    parser.add_argument('--password', default='admin',
+                        help='Password for ODL Restconf authentication')
+    parser.add_argument('--scope', default='sdn',
+                        help='Scope for ODL Restconf authentication')
+    parser.add_argument('--count', type=int,
+                        help='Count of devices to query')
+    parser.add_argument('--name',
+                        help='Name of device without the ID suffix')
+    parser.add_argument('--reuse', default='True', type=str2bool,
+                        help='Should single requests session be re-used')
+
+    # Work related arguments.
+    parser.add_argument('--workers', default='1', type=int,
+                        help='number of blocking http threads to use')
+    parser.add_argument('--timeout', default='300', type=float,
+                        help='timeout in seconds for all jobs to complete')
+    parser.add_argument('--refresh', default='0.1', type=float,
+                        help='seconds to sleep in main thread if nothing to do')
+
+    return parser.parse_args()  # arguments are read
+
+
+class TRequestWithResponse(object):
+
+    def __init__(self, uri, kwargs):
+        self.uri = uri
+        self.kwargs = kwargs
+        self.response_ready = threading.Event()
+
+    def set_response(self, runtime, status, content):
+        self.status = status
+        self.runtime = runtime
+        self.content = content
+        self.response_ready.set()
+
+    def wait_for_response(self):
+        self.response_ready.wait()
+
+
+def queued_send(session, queue_messages):
+    """Pop from queue, GET and store the response; repeat until empty."""
+    while True:
+        try:
+            request = queue_messages.popleft()
+        except IndexError:  # nothing more to send
+            break
+        start = time.time()
+        response = AuthStandalone.Get_Using_Session(session, request.uri, **request.kwargs)
+        stop = time.time()
+        status = int(response.status_code)
+        content = repr(response.content)
+        runtime = stop - start
+        request.set_response((start, stop, runtime), status, content)
+
+
+def collect_results(request_list, response_queue):
+    for request in request_list:
+        request.wait_for_response()
+        response = (request.status, request.runtime, request.content)
+        response_queue.append(response)
+
+
+def watch_for_timeout(timeout, response_queue):
+    time.sleep(timeout)
+    response_queue.append((None, 'Time is up!'))
+
+
+def run_thread(thread_target, *thread_args):
+    thread = threading.Thread(target=thread_target, args=thread_args)
+    thread.daemon = True
+    thread.start()
+    return thread
+
+
+# Parse the command line arguments
+args = parse_arguments()
+
+# Construct the work for the workers.
+url_start = (
+    'config/'
+    "network-topology:network-topology/topology/topology-netconf/node/"
+    + args.name + "-"
+)
+url_end = "/yang-ext:mount"
+headers = {'Content-Type': 'application/xml', "Accept": "application/xml"}
+kwargs = {"headers": headers}
+requests = []
+for device_number in range(args.count):
+    device_url = url_start + str(device_number + 1) + url_end
+    request = TRequestWithResponse(device_url, kwargs)
+    requests.append(request)
+
+# Organize the work into the work queues.
+list_q_msg = [collections.deque() for _ in range(args.workers)]
+index = 0
+for request in requests:
+    queue = list_q_msg[index]
+    queue.append(request)
+    index += 1
+    if index == len(list_q_msg):
+        index = 0
+
+# Spawn the workers, giving each a queue.
+threads = []
+for queue_messages in list_q_msg:
+    session = AuthStandalone.Init_Session(args.odladdress, args.user, args.password, args.scope, args.reuse)
+    thread = run_thread(queued_send, session, queue_messages)
+    threads.append(thread)
+
+# Spawn the results collector worker
+responses = collections.deque()
+collector = run_thread(collect_results, requests, responses)
+
+# Spawn the watchdog thread
+watchdog = run_thread(watch_for_timeout, args.timeout, responses)
+
+# Watch the response queue, outputting the lines
+request_count = args.count
+while request_count > 0:
+    if len(responses) > 0:
+        result = responses.popleft()
+        if result[0] is None:
+            print("ERROR|" + result[1] + "|")
+            break
+        runtime = "%5.3f|%5.3f|%5.3f" % result[1]
+        print("%03d|%s|%s|" % (result[0], runtime, result[2]))
+        request_count -= 1
+        continue
+    time.sleep(args.refresh)
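
The report lines printed by the loop above have the form `status|start|stop|runtime|content|`, or `ERROR|message|` when the watchdog fires first. A consumer other than the Robot suite could parse them roughly as follows. This is a minimal sketch: `parse_report_line` is a hypothetical helper that is not part of this change, and it assumes the first four fields never contain `|` (the content field may).

```python
def parse_report_line(line):
    """Parse one report line; raise RuntimeError for an 'ERROR|message|' line."""
    # Split at most 4 times so any "|" inside the content field is preserved.
    fields = line.rstrip("\n").split("|", 4)
    if fields[0] == "ERROR":
        raise RuntimeError("Error getting data: " + fields[1])
    status, start, stop, elapsed, rest = fields
    # Drop the single trailing "|" that terminates every report line.
    content = rest[:-1] if rest.endswith("|") else rest
    return {
        "status": int(status),
        "start": float(start),
        "stop": float(stop),
        "elapsed": float(elapsed),
        "content": content,
    }


report = parse_report_line("200|1.000|2.500|1.500|'<data></data>'|\n")
```

The content field is the `repr()` of the response body, so it keeps its surrounding quotes. This is why the Robot suite compares it against a quoted expected string.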