Teach sal-remoterpc-connector to route actions sal-remoterpc-connector already handles routing of RPC registrations and invocations across a cluster. Actions are very similar to RPCs, hence it is natural to keep both in the same component. This patch refactors common bits that go into tracking both, so that we share common actors and concepts. JIRA: CONTROLLER-1894 Change-Id: I0b9005bc3560b4dd5977a280d83eceebe132bec9 Signed-off-by: EmmettCox <emmett.cox@est.tech>
BUG-7556: update version tracking This patch adds better version tracking, so it does not rely on calendar time, but rather is monotonically increasing. The 64bit version field is logically split into an incarnation number (31 bits) and a version number (32 bits). Change-Id: Ie0e1f4089cc1ee582037982d9837490348158975 Signed-off-by: Robert Varga <rovarga@cisco.com>
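A minimal sketch of the incarnation/version split described above. The class name `BucketVersion`, the method names, and the exact layout (incarnation in the high bits, top bit left unused so the packed value stays non-negative) are assumptions for illustration; the commit message only states the bit widths.

```java
/** Hypothetical helper: packs an incarnation and a version counter into one long. */
final class BucketVersion {
    private BucketVersion() {
        // static helpers only
    }

    /**
     * Assumed layout: 31-bit non-negative incarnation in the high bits,
     * 32-bit version counter (treated as unsigned) in the low bits.
     */
    static long pack(int incarnation, int version) {
        if (incarnation < 0) {
            throw new IllegalArgumentException("incarnation must fit in 31 bits");
        }
        return ((long) incarnation << 32) | Integer.toUnsignedLong(version);
    }

    /** Extracts the incarnation from the high bits. */
    static int incarnation(long packed) {
        return (int) (packed >>> 32);
    }

    /** Extracts the raw 32-bit version counter from the low bits. */
    static int version(long packed) {
        return (int) packed;
    }

    public static void main(String[] args) {
        long beforeRestart = pack(1, 41);
        long afterRestart = pack(2, 0);
        // A new incarnation compares higher even though its counter restarted.
        System.out.println(beforeRestart < afterRestart);
    }
}
```

Because the incarnation occupies the high bits, a plain numeric comparison orders versions without consulting the wall clock.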
BUG-3128: rework sal-remoterpc-connector This patch reworks the implementation to take advantage of the services provided by DOMRpcService, notifying us of locally-available services. Previously we registered all routed RPCs known in the SchemaContext for the global routing context, which caused lookups for routed RPCs not otherwise bound to call back to the remote connector, which then performed a router lookup. This approach is slow, as each RPC invocation incurs an additional round-trip to the RpcRegistry to look up the appropriate router before the request is sent to it. It also does not work for global RPCs, because they only ever have a global context -- hence the routing decision needs to be made solely in DOMRpcService. With this patch we maintain a single higher-cost implementation registered towards each remote node, handling RPCs discovered via gossiping RoutingTables. The implementation dispatches requests directly to that node's RpcInvoker (formerly known as RpcBroker). That way DOMRpcService will perform internal routing based on cost and invoke our service only if there are no local implementations registered. RpcRegistry no longer performs delayed router discovery and instead dispatches RoutingTable bucket updates to a new actor, RpcRegistrator, whose sole job is to maintain registrations of RemoteRpcImplementation instances for each remote node with DOMRpcProviderService. Because of DOMRpcService's ability to filter registration notifications, our RpcListener will never be notified of our registrations, which precludes routing loops. We can therefore remove RemoteRpcInput, whose sole purpose was to act as a loop detector. Further cleanup is done to RpcManager and RemoteRpcProvider lifecycle, as these now correctly terminate their children and remove registrations on both restarts and shutdowns. RpcManager's children startup is also moved from the constructor to preStart(), so as to correctly plug into the actor lifecycle. 
Gossiper is updated to forward node removals to the BucketStore, so that buckets from unreachable nodes are removed as soon as possible. BucketStore is updated to pass down changed buckets to the subclass whenever a bucket is removed or updated. It also requires the subclass to provide the initial bucket data item -- which makes it obvious that a bucket's data can never be null. This was previously achieved by updating the bucket data from the subclass constructor. BucketStore/Gossiper messages are updated to be immutable, which simplifies their instantiation and ensures that they do not contain nulls (which is required anyway). Change-Id: I4efb3ddd8ea46ae5be1eb59f1d4fe508f2bc5763 Signed-off-by: Robert Varga <rovarga@cisco.com>
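The cost-based routing described in the BUG-3128 message can be sketched as follows. This is an illustration only and does not use the real DOMRpcService API: `CostRouter`, `Impl`, and the numeric costs are hypothetical names and values chosen to show why a higher-cost remote registration is invoked only when no local implementation is registered.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Hypothetical router: picks the registered implementation with the lowest cost. */
final class CostRouter {
    /** A registered implementation and its (assumed) invocation cost. */
    static final class Impl {
        final String name;
        final long cost;

        Impl(String name, long cost) {
            this.name = name;
            this.cost = cost;
        }
    }

    private final List<Impl> impls = new ArrayList<>();

    void register(String name, long cost) {
        impls.add(new Impl(name, cost));
    }

    void unregister(String name) {
        impls.removeIf(impl -> impl.name.equals(name));
    }

    /** Returns the cheapest implementation, or empty if nothing is registered. */
    Optional<String> route() {
        return impls.stream()
            .min(Comparator.comparingLong((Impl impl) -> impl.cost))
            .map(impl -> impl.name);
    }

    public static void main(String[] args) {
        CostRouter router = new CostRouter();
        router.register("local", 0);       // local implementation, assumed low cost
        router.register("remote-node", 1); // remote connector, assumed higher cost
        System.out.println(router.route().orElse("none"));
        router.unregister("local");
        System.out.println(router.route().orElse("none"));
    }
}
```

Registering the remote implementation at a higher cost keeps the routing decision in one place, so no separate loop detector is needed in this sketch.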
Fix FindBugs warnings in sal-remoterpc-connector and enable enforcement Warnings fixed: - RemoteRpcImplementation: use of 'error' known to be null - RpcBroker, RpcRegistry: The Creator class has a non-Serializable field. Removed the Creator class and used Props, which creates the actor by reflection. - RpcBroker: use of 'result' that is marked as @Nullable - RpcBroker: redundant check of 'result.getErrors()' that is known to be non-null (marked as @Nonnull). - Gossiper, RemoteRpcRegistryMXBeanImpl: use entrySet iterator instead of keySet and get. - Messages: redundant specification of implements Serializable - LatestEntryRoutingLogic: Comparator should also implement Serializable in case TreeSet is serialized. This isn't the case here but it doesn't hurt to implement Serializable in lieu of suppressing the warning. - LatestEntryRoutingLogic: Fixed potential null pointer dereference in 'compare'. Change-Id: I8930c8975e1dd9179d78e74087b3994a365b90f8 Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
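The two LatestEntryRoutingLogic items above can be illustrated with a small comparator that is both Serializable and null-safe. This is a sketch, not the actual class: `LatestFirstComparator`, the `Map.Entry<String, Long>` element type, and the choice to sort nulls last are assumptions.

```java
import java.io.Serializable;
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Comparator;
import java.util.Map;

/** Hypothetical comparator: orders entries by timestamp, latest first, nulls last. */
final class LatestFirstComparator implements Comparator<Map.Entry<String, Long>>, Serializable {
    private static final long serialVersionUID = 1L;

    @Override
    public int compare(Map.Entry<String, Long> left, Map.Entry<String, Long> right) {
        // Guard both the entry and its value so compare() never dereferences null.
        Long leftStamp = left == null ? null : left.getValue();
        Long rightStamp = right == null ? null : right.getValue();
        if (leftStamp == null) {
            return rightStamp == null ? 0 : 1;
        }
        if (rightStamp == null) {
            return -1;
        }
        // Reverse the natural order so that the larger (later) timestamp sorts first.
        return Long.compare(rightStamp, leftStamp);
    }

    /** Convenience factory for an immutable entry. */
    static Map.Entry<String, Long> entry(String key, Long timestamp) {
        return new SimpleImmutableEntry<>(key, timestamp);
    }
}
```

Implementing Serializable costs nothing here because the comparator has no state, and it keeps any sorted collection built on it serializable as well.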
Fix CS warnings in sal-remoterpc-connector and enable enforcement Fixed checkstyle warnings and enabled enforcement. Most of the warnings/changes were for: - white space before if/for/while/catch - white space before beginning brace - line too long - illegal catching of Exception (suppressed) - variable name too short - indentation - local vars/params hiding a field - remove unused vars - convert functional interfaces to lambdas (eclipse save action) - missing period after first sentence in javadoc - adding final for locals declared too far from first usage Change-Id: I222d003cb07810434cb7f62420b4a9157f1d3027 Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
Add blueprint wiring to sal-remoterpc-connector Change-Id: I23877888fd49e7dbe4568a7b7a51409589d5ff63 Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
Bug 4866: Add wait/retries for routed RPCs If a routed RPC is registered on one node it takes a little time for the route to propagate via gossip to other nodes. If another node tries to invoke the RPC prior to propagation it fails. To alleviate this timing issue, I added wait/retries via a timer in the RpcRegistry for the FindRouters message. As routes are updated via gossip, it retries the FindRouters request. If the timer triggers, it sends back an empty list. The timer period is 10 times the gossip tick interval (500ms * 10 = 5s). Change-Id: Iaafcfb4c93cde44f62f6645c8b8684102ac0d0db Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
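The wait/retry behaviour described above can be sketched without Akka. This is an illustration under assumptions, not the RpcRegistry code: `RetryingRouteLookup` and its method names are hypothetical, a `CompletableFuture` stands in for the reply to FindRouters, and a scheduled executor stands in for the actor timer. Only the timeout rule (10 gossip ticks) and the empty-list reply on expiry come from the commit message.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical lookup that waits for gossip before giving up with an empty reply. */
final class RetryingRouteLookup {
    private static final int TIMEOUT_TICKS = 10;

    private final Map<String, List<String>> routes = new HashMap<>();
    private final Map<String, List<CompletableFuture<List<String>>>> pending = new HashMap<>();
    private final ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor(task -> {
        Thread thread = new Thread(task, "lookup-timer");
        thread.setDaemon(true); // do not keep the JVM alive for outstanding timers
        return thread;
    });
    private final long timeoutMillis;

    RetryingRouteLookup(long gossipTickMillis) {
        this.timeoutMillis = gossipTickMillis * TIMEOUT_TICKS;
    }

    long timeoutMillis() {
        return timeoutMillis;
    }

    /** Answers immediately if the route is known, otherwise parks the request. */
    synchronized CompletableFuture<List<String>> findRouters(String rpc) {
        List<String> known = routes.get(rpc);
        if (known != null) {
            return CompletableFuture.completedFuture(known);
        }
        CompletableFuture<List<String>> reply = new CompletableFuture<>();
        pending.computeIfAbsent(rpc, key -> new ArrayList<>()).add(reply);
        timer.schedule(() -> expire(rpc, reply), timeoutMillis, TimeUnit.MILLISECONDS);
        return reply;
    }

    /** Called when gossip delivers routes; retries every parked request for that RPC. */
    synchronized void onRoutesUpdated(String rpc, List<String> routers) {
        List<String> copy = Collections.unmodifiableList(new ArrayList<>(routers));
        routes.put(rpc, copy);
        List<CompletableFuture<List<String>>> waiting = pending.remove(rpc);
        if (waiting != null) {
            waiting.forEach(reply -> reply.complete(copy));
        }
    }

    /** Timer callback: replies with an empty list if the route never arrived. */
    private synchronized void expire(String rpc, CompletableFuture<List<String>> reply) {
        if (reply.complete(Collections.<String>emptyList())) {
            List<CompletableFuture<List<String>>> waiting = pending.get(rpc);
            if (waiting != null && waiting.remove(reply) && waiting.isEmpty()) {
                pending.remove(rpc);
            }
        }
    }
}
```

With a 500 ms gossip tick, `new RetryingRouteLookup(500).timeoutMillis()` yields the 5 s period quoted in the message.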
BUG 4151 : Create a shared actor system This patch adds an ActorSystemProvider interface in clustering commons with a method to get a shared ActorSystem instance, which uses the clustered data store configuration, as it contains more configuration options than the rpc connector, which pretty much uses stock configuration. I added a config yang to define an actor-system-provider-service. I added the ActorSystemProvider implementation and actor-system-provider-impl config yang in the distributed datastore bundle. I tried it in sal-clustering-commons originally but ran into akka errors re: missing config properties and it also couldn't find the ReadyLocalTransactionSerializer class. So to avoid chasing down those errors I put the implementation in sal-distributed-datastore. I think this makes sense as it is the prime user of the actor system. I added a dependency for the ActorSystemProvider service in both datastore modules so the ActorSystem is now injected and passed to the DistributedDataStoreFactory. The dependency was also added to the RPC module. Elements for the new actor system provider service and impl were added to the 05-clustering.xml file along with the wiring changes for the data stores and RPC modules. Change-Id: I79c14f84c992a2d5ac9c1f1856efbaeba3cc2b77 Signed-off-by: Moiz Raja <moraja@cisco.com>
Metrics and Configuration 1. Adds a new abstract class AbstractMeteredUntypedActored that extends AbstractUntypedActor. This adds metrics capture capability which can be turned on using Config Subsystem. By default it's turned off. 2. Updates Shard actor; adds metrics capture capability which can be turned on via Config Subsystem. 3. In remote-rpc-connector module, we can now pass configuration obtained from config subsystem to actor system so that it's available to all actors. This obviates the need to manually pass the configuration via constructors or other *Constant or *Util classes. It brings all configuration items in the module into one place and makes it easier to move them to config subsystem, if required. 4. In the spirit of DRY, moves common code to clustering-commons 5. Minor code inspection fixes. 6. Makes mailbox-capacity configurable via config subsystem. Patch 9: Adds a new behaviour (MeteringBehavior.java). AbstractUntypedActorWithMetering and Shard actor exhibit this behavior currently. This patch also refactors unified configuration (config subsystem + typesafe config) pieces. Note that in subsequent commits distributed-datastore will have its configuration added to actor system configuration. Also, a few more configuration items in remote-rpc-connector will go to config subsystem. I wanted to limit the amount of changes going into this already large commit. Change-Id: I383ec813c16ed09ed0e68ee59179f454c0d174cf Signed-off-by: Abhishek Kumar <abhishk2@cisco.com>
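The idea of a metering behaviour that is off by default can be sketched as a wrapper around a message handler. This does not reproduce MeteringBehavior.java: `MeteredHandler`, its constructor flag, and the per-message-type counters are hypothetical, and the sketch assumes single-threaded use, as an actor processes one message at a time.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical wrapper: records per-message-type counts and timings when enabled. */
final class MeteredHandler implements Consumer<Object> {
    private final Consumer<Object> delegate;
    private final boolean metricsEnabled;
    private final Map<String, Long> counts = new HashMap<>();
    private final Map<String, Long> totalNanos = new HashMap<>();

    MeteredHandler(Consumer<Object> delegate, boolean metricsEnabled) {
        this.delegate = delegate;
        this.metricsEnabled = metricsEnabled;
    }

    @Override
    public void accept(Object message) {
        if (!metricsEnabled) {
            // Metering is off: pass the message straight through.
            delegate.accept(message);
            return;
        }
        long start = System.nanoTime();
        try {
            delegate.accept(message);
        } finally {
            // Record the sample even if the delegate throws.
            String type = message.getClass().getSimpleName();
            counts.merge(type, 1L, Long::sum);
            totalNanos.merge(type, System.nanoTime() - start, Long::sum);
        }
    }

    /** Number of messages of the given type seen while metering was enabled. */
    long count(String messageType) {
        return counts.getOrDefault(messageType, 0L);
    }

    /** Total handling time in nanoseconds for the given message type. */
    long totalNanos(String messageType) {
        return totalNanos.getOrDefault(messageType, 0L);
    }
}
```

Keeping the flag in the wrapper means the wrapped handler needs no knowledge of whether metrics are being captured.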