BUG-8494: fix throttling during reconnect

ReconnectForwarder is called from two different code paths: the first is during replay, when we are dealing with late requests (those which have been waiting while we were replaying); the second is for subsequent user requests.

The first should not wait on the queue, as its requests have already entered it and hence have paid the cost of entry. The second needs to pay for entering the queue, as otherwise we do not exert backpressure.

This patch differentiates the two code paths, so they behave as they should. Also add more debug information in timer paths.

Change-Id: I609be2332b13868ef1b9511399e2827d7f3d5b7d
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
(cherry picked from commit 851fb56fba015c9fee3f0f9235c5c631a492ce59)
BUG-8403: guard against ConcurrentModificationException

Using TransmitQueue.asIterable() offers the slight advantage of not dealing with a big list, but exposes us to the risk of the Iterable being changed underneath us.

The point missed by the fix to BUG 8491 is that there is an avenue for the old connection to be touched during replay, as we are completing entries, for example reads when we are switching from a remote to a local connection. In this case the callback is invoked in the actor thread, with all the locks being reentrant and held, hence it can break through to the old connection's queue.

If that happens we will see a ConcurrentModificationException and enter buggy territory, where the client fails to work properly.

Document this caveat and turn asIterable() into drain(), which removes all the entries in the queue, allowing new entries to be enqueued. The late-comer entries are accounted for when we set the forwarder.

Change-Id: Idf29c1e565e12aaed917ac94c21c552daf169d4d
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
(cherry picked from commit 930747a6ba5d888d2fbe54473132680e4621d858)
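
The difference between iterating a live view and draining can be sketched as follows. This is a minimal illustration, not the actual TransmitQueue code: SketchQueue, DrainSketch and their methods are hypothetical names, the backing ArrayList is chosen only because its fail-fast iterator makes the failure deterministic, and the reentrant callback is simulated by enqueueing from inside the loop.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;

// Hypothetical stand-in for a transmit queue; not the real TransmitQueue API.
final class SketchQueue {
    private final List<String> pending = new ArrayList<>();

    void enqueue(String entry) {
        pending.add(entry);
    }

    // Live view: callers iterate over the backing list directly.
    Iterable<String> asIterable() {
        return pending;
    }

    // Snapshot and clear: callers get their own list, the queue is left empty
    // and can accept new entries while the snapshot is being processed.
    List<String> drain() {
        List<String> ret = new ArrayList<>(pending);
        pending.clear();
        return ret;
    }

    int size() {
        return pending.size();
    }
}

final class DrainSketch {
    // Replays over the live view while a simulated reentrant callback enqueues.
    // Returns false if the iteration failed with ConcurrentModificationException.
    static boolean replayViaIterable(SketchQueue queue) {
        try {
            for (String entry : queue.asIterable()) {
                queue.enqueue("late-" + entry); // simulated callback touching the old queue
            }
            return true;
        } catch (ConcurrentModificationException e) {
            return false;
        }
    }

    // Replays over a drained snapshot; late entries land in the now-empty queue.
    static List<String> replayViaDrain(SketchQueue queue) {
        List<String> replayed = new ArrayList<>();
        for (String entry : queue.drain()) {
            queue.enqueue("late-" + entry); // safe: the snapshot is not affected
            replayed.add(entry);
        }
        return replayed;
    }
}
```

In this sketch the late entries remain in the queue after replayViaDrain() returns, which corresponds to the late-comers that still have to be accounted for once the forwarder is set.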
BUG-8372: improve forward/replay naming

There is a bit of confusion between 'replay' and 'forward' methods. They serve two distinct purposes:
- 'replay' happens during reconnect, i.e. for requests that have already entered the connection queue and have paid the delay cost, so they should not pay it again.
- 'forward' happens after reconnect for requests that have raced with the reconnect process, i.e. they need to hop from the old connection to the new one. These need to enter the queue and pay the delay cost.

This patch cleans the code paths up to use consistent naming, making it clearer that the problem we are seeing is in the 'replay' path.

Change-Id: Id854e09a0308f8d0a9144d59f41e31950cd58665
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
(cherry picked from commit cc21df8ade11f41843dc558e8fc93d5be92ed151)
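
The two purposes above can be sketched roughly as below. Everything here is an assumption made for illustration: SketchConnection, its softLimit and delayPerExcessNanos parameters, and the linear delay formula are hypothetical and do not reflect the real connection classes. The delay is returned rather than slept on so the sketch stays deterministic.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical connection with a soft queue limit; not the real client API.
final class SketchConnection {
    private final Queue<String> queue = new ArrayDeque<>();
    private final int softLimit;
    private final long delayPerExcessNanos;

    SketchConnection(int softLimit, long delayPerExcessNanos) {
        this.softLimit = softLimit;
        this.delayPerExcessNanos = delayPerExcessNanos;
    }

    // 'replay' path: the request already paid the delay cost on the old
    // connection, so it is enqueued without being charged again.
    long replay(String request) {
        queue.add(request);
        return 0L;
    }

    // 'forward' path: the request enters a queue for the first time and pays
    // a delay proportional to how far the queue exceeds its soft limit
    // (assumed formula). The caller is expected to wait this long.
    long forward(String request) {
        queue.add(request);
        long excess = Math.max(0, queue.size() - softLimit);
        return excess * delayPerExcessNanos;
    }

    int size() {
        return queue.size();
    }
}
```

With a soft limit of 2, replaying three requests returns a delay of zero each time, while a fourth request arriving through forward() is charged for the two entries by which the queue then exceeds the limit.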
BUG-5280: expose queue messages during reconnect

This patch reworks the internals of AbstractClientConnection to isolate the TransmitQueue from the rest of the logic, so we have a proper split between the implementation and the interface exposed to the users.

Furthermore, the public interface is slightly reworked so the individual Proxies have access to the (locked) queue contents, which is needed to correctly replay transaction state within transaction chains.

Change-Id: I1c08fa06eec4dd581e07002059c5142e6b0c1ed4
Signed-off-by: Robert Varga <rovarga@cisco.com>
BUG-5280: add AbstractClientConnection

Introduce a connection concept. This is a replacement for the request queue, as it turns out we do need the concept of a full connection (e.g. generational logic). This comes from the need to sensibly switch behaviors as the locality of the backend leader changes.

This patch implements two sets of strategies for dealing with reconnect:

The first one assumes long-lived state and is used for proxies dealing with histories. Here we make sure to reinstantiate and replace them in a map, as we want new transactions to follow the new semantics and we do not want to tear histories down or follow inefficient paths.

The second one assumes short-lived state and is used for proxies dealing with individual transactions. Transactions are assumed to come and go rapidly, and therefore we do not replace the proxies in maps (as they will be short-lived), but rather forward operations to successors.

The first strategy has a higher access cost, but its state is always fully up to date when reconnect finishes, while the second strategy favors access time, but operations end up "trailing" and will be forwarded (and hence inefficient) until the transaction completes.

Change-Id: I7fd9e21c749f55b91229bf0b671c8dcf2e4d5982
Signed-off-by: Robert Varga <rovarga@cisco.com>
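
A rough sketch of the two reconnect strategies described above, under stated assumptions: SketchProxy, ReconnectStrategies and the integer "generation" are hypothetical simplifications, and the real proxies carry far more state than shown here.

```java
import java.util.Map;

// Hypothetical proxy bound to one connection generation.
final class SketchProxy {
    private final int generation;
    private SketchProxy successor;

    SketchProxy(int generation) {
        this.generation = generation;
    }

    int generation() {
        return generation;
    }

    void setSuccessor(SketchProxy next) {
        this.successor = next;
    }

    // Follows the successor chain and returns the proxy that finally handles
    // an operation. Each hop is the "trailing" forwarding cost.
    SketchProxy resolve() {
        SketchProxy current = this;
        while (current.successor != null) {
            current = current.successor;
        }
        return current;
    }
}

final class ReconnectStrategies {
    // Long-lived state (histories): reinstantiate and replace the map entries,
    // so later lookups hit the new generation directly.
    static void reconnectHistories(Map<String, SketchProxy> histories, int newGeneration) {
        histories.replaceAll((id, old) -> new SketchProxy(newGeneration));
    }

    // Short-lived state (transactions): leave the map untouched and link each
    // existing proxy to a successor that receives forwarded operations.
    static void reconnectTransactions(Map<String, SketchProxy> transactions, int newGeneration) {
        for (SketchProxy old : transactions.values()) {
            old.resolve().setSuccessor(new SketchProxy(newGeneration));
        }
    }
}
```

After reconnectHistories() a lookup returns a proxy of the new generation with no forwarding, whereas after reconnectTransactions() the map still holds the old proxy and every operation goes through resolve() until the transaction completes.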