controller.git
6 years agoBUG-8309: Add message identity information 44/56044/5
Robert Varga [Wed, 26 Apr 2017 08:46:15 +0000 (10:46 +0200)]
BUG-8309: Add message identity information

We have encountered an attempt to serialize a local request across
a remote connection. Since this is hit by the akka serializer, we
have lost the identity of the call site and of the message, because
all akka is seeing is the Envelope and the exception's stack trace,
which only indicates class hierarchy up to and including
AbstractLocalTransactionRequest.

This patch enriches the exception message so we know what the actual
request was, hopefully pinpointing the offending call site. Since
the problem revolves around the reconnect process, bump critical
transitions to info instead of debug.

Change-Id: I6d6d6e702d4b5baff7b707242583e923708e7637
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
6 years agoNest id-ints list inside a container 95/55995/2
Tomas Cere [Tue, 25 Apr 2017 10:50:52 +0000 (12:50 +0200)]
Nest id-ints list inside a container

Needs to be nested to be able to refer to the whole list via restconf
and instance-identifier yang element, so update the model and the handlers
to account for this change.

Change-Id: Idf50de5e6faa9757f45ec68e9b796ae0742f6aa9
Signed-off-by: Tomas Cere <tcere@cisco.com>
6 years agoBug 8301: Disable DistributedShardedDOMDataTreeRemotingTest for now 20/56020/2
Tom Pantelis [Tue, 25 Apr 2017 17:56:50 +0000 (13:56 -0400)]
Bug 8301: Disable DistributedShardedDOMDataTreeRemotingTest for now

Change-Id: I24068c5ee92533cdc23174d17cc1805328df7c4d
Signed-off-by: Tom Pantelis <tompantelis@gmail.com>
6 years agoFix intermittent failure in testLeadershipTransferOnShutdown 17/56017/2
Tom Pantelis [Tue, 25 Apr 2017 17:26:27 +0000 (13:26 -0400)]
Fix intermittent failure in testLeadershipTransferOnShutdown

10:03:06 java.util.concurrent.ExecutionException: ReadFailedException{message=Error executeRead ReadData for path /(urn:opendaylight:params:xml:ns:yang:controller:md:sal:dom:store:test:cars?revision=2014-03-13)cars/car, errorList=[RpcError [message=Error executeRead ReadData for path /(urn:opendaylight:params:xml:ns:yang:controller:md:sal:dom:store:test:cars?revision=2014-03-13)cars/car, severity=ERROR, errorType=APPLICATION, tag=operation-failed, applicationTag=null, info=null, cause=org.opendaylight.controller.md.sal.common.api.data.DataStoreUnavailableException: Shard member-2-shard-cars-testLeadershipTransferOnShutdown currently has no leader. Try again later.]]}
10:03:06  at org.opendaylight.yangtools.util.concurrent.MappingCheckedFuture.wrapInExecutionException(MappingCheckedFuture.java:64)
10:03:06  at org.opendaylight.yangtools.util.concurrent.MappingCheckedFuture.get(MappingCheckedFuture.java:92)
10:03:06  at org.opendaylight.controller.cluster.datastore.DistributedDataStoreRemotingIntegrationTest.verifyCars(DistributedDataStoreRemotingIntegrationTest.java:215)
10:03:06  at org.opendaylight.controller.cluster.datastore.DistributedDataStoreRemotingIntegrationTest.testLeadershipTransferOnShutdown(DistributedDataStoreRemotingIntegrationTest.java:928)

From the logs it seems that, by the time member-2 tried to read after the
leader transfer, it hadn't yet gotten MemberUp for member-3. I added calls to wait
for members to be up. After the change it ran 333 times w/o failure.

Change-Id: Ifbbf304230292f69429d3086867679effb8db01c
Signed-off-by: Tom Pantelis <tompantelis@gmail.com>
6 years agoHandle AbortLocalTransactionRequest 08/56008/1
Robert Varga [Tue, 25 Apr 2017 14:58:47 +0000 (16:58 +0200)]
Handle AbortLocalTransactionRequest

When local transactions are aborted from the frontend, it is done
via a dedicated message which we failed to account for. This can
happen only as an alternative to CommitLocalTransactionRequest,
hence needs to be handled only in FrontendReadWriteTransaction.

Change-Id: I350a103f132da473d397a7d5f7de7e45850911f3
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
6 years agoImprove logging around transaction lifecycle 76/55976/1
Robert Varga [Tue, 25 Apr 2017 08:39:21 +0000 (10:39 +0200)]
Improve logging around transaction lifecycle

Testing has shown that we have a gap in request handling and we
have a lot of unclosed transactions. Add logging of code paths
which trigger an unsupported request.

Change-Id: I013ba8a141d5a1a9e311a8bca7842ac77064d277
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
6 years agoImprove orphan transaction logging 30/55930/1
Robert Varga [Mon, 24 Apr 2017 19:51:43 +0000 (21:51 +0200)]
Improve orphan transaction logging

This patch improves logging when we perform last-resort cleanup
from garbage collector, so that the type of client handle is also
logged. This allows us to discern transactions and snapshots.

Also lower the logging level to INFO, as this is something that
should be fixed by whoever is causing it, but it does not pose
serious threat to stability.

Change-Id: Iad55c49de87ca73f9671f04f569be7eae0e4f885
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
6 years agoBUG-8219: Cleanup CompositeDataTreeCohort 19/55819/6
Robert Varga [Fri, 21 Apr 2017 14:44:10 +0000 (16:44 +0200)]
BUG-8219: Cleanup CompositeDataTreeCohort

This patch reworks the logic so we can track which cohort times
out in case that happens. We also instantiate shortcuts so we do
not go through asynchronous processing if there are no cohorts
at all.

Change-Id: I9493b768c86e8d6b2d0f4f1d13f53b13ff98fe7b
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
6 years agoFix checkstyle problems not detected by the current version 14/55814/2
David [Thu, 20 Apr 2017 22:42:07 +0000 (00:42 +0200)]
Fix checkstyle problems not detected by the current version

This change is required for overall move to new Checkstyle version, see
https://git.opendaylight.org/gerrit/#/q/topic:bumpCheckstyle-stable/carbon

Most of the changes are redundant "final" modifiers.

Change-Id: I637dd46617ca144f0ed33bd705c6357493b887fe
Signed-off-by: David <david.suarez.fuentes@ericsson.com>
6 years agoBUG-8159: fix local transaction history tracking 39/55739/5
Robert Varga [Thu, 20 Apr 2017 14:37:02 +0000 (16:37 +0200)]
BUG-8159: fix local transaction history tracking

ShardCommitCoordinator needs to make sure ShardDataTree tracks
the histories involved with local transaction being submitted
via ReadyLocalTransaction. This is consistent with what we are
doing for the BatchedModifications message.

Change-Id: I02cc61476b5e02fb45f1482c4a9693bc77335793
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
6 years agoRelax visibility on FrontendReadWriteTransaction methods 18/55818/2
Robert Varga [Fri, 21 Apr 2017 13:38:03 +0000 (15:38 +0200)]
Relax visibility on FrontendReadWriteTransaction methods

We are invoking these methods from anonymous subclasses, hence
keeping them private forces redirection via synthetic accessors:

 at org.opendaylight.controller.cluster.datastore.FrontendReadWriteTransaction.successfulDirectCanCommit
 at org.opendaylight.controller.cluster.datastore.FrontendReadWriteTransaction.access$300
 at org.opendaylight.controller.cluster.datastore.FrontendReadWriteTransaction$5.onSuccess

This patch makes the methods package-private, which will eliminate
the accessor, improving the stack trace.
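
As a rough illustration of the accessor issue, the sketch below uses a
hypothetical TransactionSketch class (not the real
FrontendReadWriteTransaction) whose package-private method is called from
an anonymous class. On the Java 8-era compilers in use at the time, making
that method private would force the call through a synthetic access$NNN
method, which then shows up as an extra stack frame. JDK 11 and later use
nest-based access control and no longer generate such accessors.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for FrontendReadWriteTransaction; all names are illustrative.
class TransactionSketch {
    // Package-private rather than private: the anonymous class below can invoke
    // this method directly, so no synthetic access$NNN bridge is needed.
    String successfulDirectCanCommit() {
        return "committed";
    }

    // Returns a callback implemented as an anonymous class, similar in shape
    // to the onSuccess() callback seen in the stack trace above.
    Supplier<String> newCallback() {
        return new Supplier<String>() {
            @Override
            public String get() {
                return successfulDirectCanCommit();
            }
        };
    }

    // Convenience entry point that creates the callback and runs it once.
    static String runCallback() {
        return new TransactionSketch().newCallback().get();
    }
}
```

The behaviour is the same with either visibility; only the generated
bytecode and the resulting stack trace differ.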

Change-Id: Idbd803c43d7ed7333fc392a17edaf61c9721d76f
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
6 years agoBug 8274: add missing configfile dependency 12/55812/2
Stephen Kitt [Fri, 21 Apr 2017 12:35:04 +0000 (14:35 +0200)]
Bug 8274: add missing configfile dependency

odl-jolokia's configfile was missing its corresponding dependency in
the POM; this patch adds it.

Change-Id: I4e5420978020b19de58b65d06c4b2482f55351d0
Signed-off-by: Stephen Kitt <skitt@redhat.com>
6 years agoRpcRegistrar unit test 85/55785/1
matus.kubica [Wed, 5 Apr 2017 15:16:46 +0000 (17:16 +0200)]
RpcRegistrar unit test

Change-Id: I90403cb3c5fb98854c9e7dcd80ba0ce6e5f944f4
Signed-off-by: matus.kubica <matus.kubica@pantheon.tech>
Signed-off-by: Ivan Hrasko <ivan.hrasko@pantheon.tech>
6 years agoBug 7747: Reply to the leader before applying previous state 35/55735/1
Tom Pantelis [Thu, 20 Apr 2017 13:15:43 +0000 (09:15 -0400)]
Bug 7747: Reply to the leader before applying previous state

Applying state to the data tree can be expensive so the follower
should reply to the leader before applying any previous state so
as not to hold up leader consensus.

Change-Id: Ic92ae2ac30d72d6a401bdc36fda900a0a7fb21d3
Signed-off-by: Tom Pantelis <tompantelis@gmail.com>
6 years agoBUG-5280: unwrap RuntimeRequestExceptions 33/55233/3
Robert Varga [Wed, 19 Apr 2017 13:59:22 +0000 (15:59 +0200)]
BUG-5280: unwrap RuntimeRequestExceptions

This patch adds the primitive to unwrap RuntimeRequestExceptions,
so the underlying cause is propagated.
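
A minimal sketch of what such an unwrapping primitive could look like is
shown below. The class names and the unwrap() method are assumptions made
for illustration and are not taken from the actual patch.

```java
// Hypothetical checked exception describing why a request failed.
class RequestFailureSketch extends Exception {
    RequestFailureSketch(String message) {
        super(message);
    }
}

// Hypothetical unchecked wrapper used to carry the failure through code
// paths that cannot declare checked exceptions.
class RuntimeRequestFailureSketch extends RuntimeException {
    RuntimeRequestFailureSketch(String message, RequestFailureSketch cause) {
        super(message, cause);
    }

    // The "unwrap" primitive: hand back the underlying cause so that callers
    // report it instead of the wrapper.
    RequestFailureSketch unwrap() {
        return (RequestFailureSketch) getCause();
    }
}

class UnwrapDemo {
    static String demo() {
        RequestFailureSketch original = new RequestFailureSketch("shard has no leader");
        RuntimeException wrapped = new RuntimeRequestFailureSketch("request failed", original);

        // Propagate the underlying cause when the exception is a known wrapper.
        Throwable reported = wrapped instanceof RuntimeRequestFailureSketch
                ? ((RuntimeRequestFailureSketch) wrapped).unwrap()
                : wrapped;
        return reported.getMessage();
    }
}
```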

Change-Id: I77771867a48eb5f63d35a6402aca6ad0bc5b12e3
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
7 years agoFix intermittent testAddShardReplicaWithAddServerReplyFailure failure 99/55699/1
Tom Pantelis [Wed, 19 Apr 2017 20:34:54 +0000 (16:34 -0400)]
Fix intermittent testAddShardReplicaWithAddServerReplyFailure failure

ShardManagerTest#testAddShardReplicaWithAddServerReplyFailure failed:

java.lang.AssertionError: assertion failed: timeout (3 seconds) during expectMsgClass waiting for class org.opendaylight.controller.cluster.raft.messages.AddServer
20:14:24  at scala.Predef$.assert(Predef.scala:170)
20:14:24  at akka.testkit.TestKitBase$class.expectMsgClass_internal(TestKit.scala:472)
20:14:24  at akka.testkit.TestKitBase$class.expectMsgClass(TestKit.scala:459)
20:14:24  at akka.testkit.TestKit.expectMsgClass(TestKit.scala:814)
20:14:24  at akka.testkit.JavaTestKit.expectMsgClass(JavaTestKit.java:415)
20:14:24  at org.opendaylight.controller.cluster.datastore.shardmanager.ShardManagerTest$33.<init>(ShardManagerTest.java:1637)

The log shows:

08:14:06,302 PM [main] [INFO] ShardManagerTest - testAddShardReplicaWithAddServerReplyFailure starting
08:14:06,325 PM [main] [INFO] ShardManager - Starting ShardManager shard-manager-config22
08:14:06,329 PM [test-akka.actor.default-dispatcher-7] [INFO] ShardManager - Recovery complete : shard-manager-config22
08:14:09,339 PM [main] [INFO] TestActorFactory - Killing actor TestActor[akka://test/user/member-1-shard-astronauts-config]
08:14:09,340 PM [main] [INFO] TestActorFactory - Killing actor TestActor[akka://test/user/shardmanager-config22]
08:14:09,340 PM [main] [DEBUG] ShardManager - Got updated SchemaContext: # of modules 1
08:14:09,340 PM [main] [DEBUG] ShardManager - shard-manager-config22: onAddShardReplica: AddShardReplica[ShardName=astronauts]
08:14:09,340 PM [main] [INFO] ShardManager - Stopping ShardManager shard-manager-config22

So the ShardManager got the onAddShardReplica message, but only after the test had
timed out after 3 seconds. The problem is that the test is using the default dispatcher
for TestActor, which is the calling thread dispatcher, and that is problematic for
persistent actors. The fix is to either not use TestActor where we don't need access
to the underlying actor instance or use the system default dispatcher, which is async.

Change-Id: Ib6521c345bd0db9502d0078928f8d0e5dcd7f747
Signed-off-by: Tom Pantelis <tompantelis@gmail.com>
7 years agoFix a typo 43/55243/2
Robert Varga [Wed, 19 Apr 2017 15:55:46 +0000 (17:55 +0200)]
Fix a typo

transacion -> transaction

Change-Id: I30b5b387dc9d21774798286984f67e46a2471e95
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
7 years agoBUG-5280: fix snapshot accounting 38/55238/1
Robert Varga [Wed, 19 Apr 2017 15:13:58 +0000 (17:13 +0200)]
BUG-5280: fix snapshot accounting

The following warning is emitted under testing:

2017-04-19 08:49:34,707 | WARN  | ... | AbstractClientHistory            | ... | Could not find aborting transaction member-2-datastore-operational-fe-0-txn-19-0

Which is indicating that we cannot find the open transaction
inside AbstractClientHistory.

The problem is mis-routed invocation when we are taking a snapshot:
instead of going directly to subclass doCreateSnapshot() which only
allocates the transaction, invoke takeSnapshot(), which actually does
the appropriate book-keeping.

Change-Id: I07473f381d3147a7fc7d355afede254a781a3094
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
7 years agoBug 8231: Fix testChangeListenerRegistration failure 80/55180/2
Tom Pantelis [Fri, 14 Apr 2017 13:03:51 +0000 (09:03 -0400)]
Bug 8231: Fix testChangeListenerRegistration failure

As described in Bug 8231, the sharing of the ListenerTree between the
ShardDataTree and the ShardDataTreeNotificationPublisherActor is
problematic. Therefore the ListenerTree (wrapped by the
DefaultShardDataTreeChangeListenerPublisher) is now owned by the
ShardDataTreeNotificationPublisherActor. On registration, a RegisterListener
message is sent to the ShardDataTreeNotificationPublisherActor to perform
the on-boarding of the new listener, i.e. it atomically generates and sends
the initial notification and then adds the listener to the ListenerTree.

This change necessitated some refactoring of the DataChangeListenerSupport
class et al with respect to how the ListenerRegistration is handled. Previously the
ListenerRegistration was passed on creation of the registration actor. This
is now done indirectly by sending a SetRegistration message to the
registration actor via a Consumer callback passed in the RegisterListener
message. When the ListenerRegistration is obtained by the
ShardDataChangePublisherActor, it invokes the Consumer callback.

When a registration is initially delayed due to no leader, the
DelayedListenerRegistration is sent to the registration actor. When the
leader is elected later on, the actual ListenerRegistration is sent and
replaces the DelayedListenerRegistration.

The DOMDataTreeChangeListener registration classes were changed/refactored
similarly.

In addition, the 2 specific registration actor classes were replaced by a
generic reusable DataTreeNotificationListenerRegistrationActor that handles
both listener types. Also the 2 CloseData*ListenerRegistration and
CloseData*ListenerRegistrationReply messages were consolidated.

Change-Id: I79ac76b8044609351e5dd8367b691b589ea35075
Signed-off-by: Tom Pantelis <tompantelis@gmail.com>
7 years agoBUG-5280: update transaction statistics 49/55149/1
Robert Varga [Tue, 18 Apr 2017 10:50:20 +0000 (12:50 +0200)]
BUG-5280: update transaction statistics

This patch adds statistics-keeping to tell-based protocol code.

Change-Id: I377cd4d9075f96dc69dd74011458fdcf53a65add
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
7 years agoBUG-5280: handle NotLeaderException 64/55064/4
Robert Varga [Fri, 14 Apr 2017 18:45:20 +0000 (20:45 +0200)]
BUG-5280: handle NotLeaderException

NotLeaderException is indicative of leader movement, in which
case we need to tear down the connection and resolve the new
leader.

Change-Id: I068e97f9a7feb75cc30afb5f5449f0adf00aa217
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
7 years agoBUG-5280: activate testTransactionRetryWithInitialAskTimeoutExOnCreateTx 63/55063/2
Robert Varga [Fri, 14 Apr 2017 18:16:41 +0000 (20:16 +0200)]
BUG-5280: activate testTransactionRetryWithInitialAskTimeoutExOnCreateTx

This test should work reliably, re-enable it.

Change-Id: I401983ea3579b95a3b37d2144a7085f132eba640
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
7 years agoBUG-5280: fix invalid local transaction replay 43/54843/4
Robert Varga [Wed, 5 Apr 2017 16:41:14 +0000 (18:41 +0200)]
BUG-5280: fix invalid local transaction replay

When we transition from a connecting to connected local connection,
we may encounter operations which are invalid and these violations
are detected during transaction replay.

If such replay fails, we need to suppress reporting the error until
the user initiates canCommit or directCommit, at which point we need
to report the delayed failure.

For reasons of consistency, we perform this suppression even under
normal connected circumstances.
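
The delayed-failure idea can be sketched roughly as follows. The
ReplayedTransactionSketch class and its methods are hypothetical, and the
choice to ignore operations after the first failure is an assumption of
this sketch, not something stated by the patch.

```java
// Hypothetical stand-in for a local transaction whose operations are replayed.
class ReplayedTransactionSketch {
    private RuntimeException recordedFailure;
    private int applied;

    // Applies one operation. An invalid operation is not reported right away;
    // the failure is recorded and surfaced later from canCommit().
    void write(String path, boolean valid) {
        if (recordedFailure != null) {
            // Assumption of this sketch: once failed, later operations are ignored.
            return;
        }
        if (!valid) {
            recordedFailure = new IllegalStateException("Invalid operation on " + path);
            return;
        }
        applied++;
    }

    // The user-visible commit step is where the delayed failure is reported.
    void canCommit() {
        if (recordedFailure != null) {
            throw recordedFailure;
        }
    }

    int appliedCount() {
        return applied;
    }

    static String demo() {
        ReplayedTransactionSketch tx = new ReplayedTransactionSketch();
        tx.write("/a", true);
        tx.write("/b", false); // fails silently during replay
        tx.write("/c", true);  // ignored because a failure is already recorded
        try {
            tx.canCommit();
            return "ok";
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }
}
```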

Change-Id: I2018498afff0e463dbdceaec5c50e8ebf088001b
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
7 years agoUnit test for RemoteRpcRegistryMXBeanImpl class 24/54824/6
Ivan Hrasko [Wed, 5 Apr 2017 12:54:06 +0000 (14:54 +0200)]
Unit test for RemoteRpcRegistryMXBeanImpl class

Change-Id: Ic00c607f3f66b327336b49f92afe6eb29c144a92
Signed-off-by: Ivan Hrasko <ivan.hrasko@pantheon.tech>
7 years agoFix intermittent failure in ClusterAdminRpcServiceTest.testModuleShardLeaderMovement 82/55082/1
Tom Pantelis [Sat, 15 Apr 2017 02:00:57 +0000 (22:00 -0400)]
Fix intermittent failure in ClusterAdminRpcServiceTest.testModuleShardLeaderMovement

java.lang.AssertionError: Rpc failed with error: RpcError [message=leadership transfer failed, severity=ERROR, errorType=APPLICATION, tag=operation-failed, applicationTag=null, info=null, cause=org.opendaylight.controller.cluster.raft.LeadershipTransferFailedException: Failed to transfer leadership to member-2-shard-cars-config_testModuleShardLeaderMovement. Follower is not ready to become leader]
  at org.opendaylight.controller.cluster.datastore.admin.ClusterAdminRpcServiceTest.verifySuccessfulRpcResult(ClusterAdminRpcServiceTest.java:461)
  at org.opendaylight.controller.cluster.datastore.admin.ClusterAdminRpcServiceTest.doMakeShardLeaderLocal(ClusterAdminRpcServiceTest.java:450)
  at org.opendaylight.controller.cluster.datastore.admin.ClusterAdminRpcServiceTest.testModuleShardLeaderMovement(ClusterAdminRpcServiceTest.java:263)

It failed when trying to make member-2 the leader for a couple of reasons. One is that
member-2 hadn't yet received the MemberUp event for member-3 from akka clustering and
thus didn't have its address when it started the election and tried to send
RequestVote.

The second problem is a result of the first - since member-2 couldn't get a vote
from member-3, it needed the vote from member-1, which was in the process of stepping
down as leader. When member-1 received the RequestVote with the higher term, it
switched to Follower. Therefore member-2 didn't receive any votes for that election
term. The request to transfer leadership, which was issued on member-1, then timed out
and failed.

The wait period for the new leader to be elected is 2 sec. This was chosen b/c
originally leadership transfer was only used on shutdown and we don't want to
block shutdown for too long. However, when requesting leadership outside of shutdown,
we should wait at least one election timeout period (plus some cushion to take into
account the variance).
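
The wait selection described above could be sketched like this. Only the
2 sec shutdown wait comes from the description; the class, method and
parameter names, and the concrete election timeout and cushion values used
in the usage note, are placeholders.

```java
// Hypothetical helper that picks how long to wait for a new leader to be elected.
class LeadershipTransferWaitSketch {
    // On shutdown we keep the short wait so that shutdown is not blocked for long.
    static final long SHUTDOWN_WAIT_MILLIS = 2000;

    // Outside of shutdown, wait at least one election timeout period plus a
    // cushion that accounts for the variance in election timeouts.
    static long newLeaderWaitMillis(boolean onShutdown, long electionTimeoutMillis,
            long cushionMillis) {
        if (onShutdown) {
            return SHUTDOWN_WAIT_MILLIS;
        }
        return electionTimeoutMillis + cushionMillis;
    }
}
```

For example, with a placeholder election timeout of 10000 ms and a cushion
of 500 ms, a transfer requested outside of shutdown would wait 10500 ms,
while a transfer on shutdown would still wait 2000 ms.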

This alleviates the time out but it still failed sometimes if member-1 timed out
in the Follower state and started a new election before member-2 timed out in
Candidate state. member-1 would then win the election and grab leadership back.
To alleviate this, it would be ideal if member-1 replied to the RequestVote from
member-2 prior to switching to Follower. Normally when it receives a RaftRPC with
a higher term, the Leader is supposed to immediately switch to Follower and not
process and reply to the RaftRPC, as per raft. However if it's in the process of
transferring leadership it makes sense to process the RequestVote and make every
effort to get the requesting node elected.

I also fixed a couple issues in the test code, mainly adding waitForMembersUp.

Change-Id: Ibb1b00f03065680fe1fd338c3d26161ec6336d5a
Signed-off-by: Tom Pantelis <tompantelis@gmail.com>
7 years agoFix incorrect last history update 78/55078/1
Robert Varga [Sat, 15 Apr 2017 01:37:16 +0000 (03:37 +0200)]
Fix incorrect last history update

This is a thinko -- the codepath will never trigger, even though
it should normally trigger all the time.

Change-Id: I29b24a3823c08c64c8c8a74e7be3b96e07672313
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
7 years agoChange DistributedShardedDOMDataTree's ctor signature 78/54878/3
Jakub Morvay [Wed, 12 Apr 2017 14:12:29 +0000 (16:12 +0200)]
Change DistributedShardedDOMDataTree's ctor signature

We should inject DistributedShardedDOMDataTree with AbstractDataStore
instead of DistributedDataStore, so we can allow different
implementations of distributed DOM store.

Change-Id: I11d1b49e1413dcc233350a3c853b283df176bffa
Signed-off-by: Jakub Morvay <jmorvay@cisco.com>
7 years agoBUG-8159: add payload debugs 84/54884/5
Robert Varga [Wed, 12 Apr 2017 15:59:33 +0000 (17:59 +0200)]
BUG-8159: add payload debugs

This patch adds debugging of metadata snapshot application
and recovery operations.

Change-Id: I9498f53af6ddc8fecf42eb239c7da7da08d3f0c6
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
7 years agoRemoteRpcProviderFactory and RpcErrorsException unit tests 77/54877/3
matus.kubica [Tue, 4 Apr 2017 12:04:04 +0000 (14:04 +0200)]
RemoteRpcProviderFactory and RpcErrorsException unit tests

Change-Id: Ife8c638d43810baede654cccac22fa8efccae1d0
Signed-off-by: matus.kubica <matus.kubica@pantheon.tech>
Signed-off-by: Ivan Hrasko <ivan.hrasko@pantheon.tech>
7 years agoRemove akka-distributed-data-experimental 76/54976/1
Robert Varga [Thu, 13 Apr 2017 13:28:56 +0000 (15:28 +0200)]
Remove akka-distributed-data-experimental

This module was used during development, but we stopped using it,
remove it from dependencies.

Change-Id: I415347f4e8a264a0daf604375815728f3a77837a
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
7 years agoBug 8206: Fix IOException from initiateCaptureSnapshot 98/54898/3
Tom Pantelis [Wed, 12 Apr 2017 19:49:23 +0000 (15:49 -0400)]
Bug 8206: Fix IOException from initiateCaptureSnapshot

Modified the install snapshot chunking to be idempotent to avoid attempts
to send the same chunk twice. This fixes the error:

java.io.IOException: The # of bytes read from the imput stream, -1, does not match the expected # 3075

Change-Id: I5336c88125f226d0976f0d7fe17d03c0d181e12d
Signed-off-by: Tom Pantelis <tompantelis@gmail.com>
7 years agoBUG-8205: use updated DatastoreContext 48/54848/6
Robert Varga [Wed, 12 Apr 2017 10:15:30 +0000 (12:15 +0200)]
BUG-8205: use updated DatastoreContext

DatastoreContext is updated by the config admin overlay, which means
we cannot refer to the initial one passed in when we are deciding
which data store to instantiate.

This fixes up the protocol propagation and adds an initial info message about
which protocol is in use.

Change-Id: I3c2f1a5eec1c7346fff3aca2d85609f47990723a
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
7 years agoImproved unit tests for AveragingProgressTracker class 23/54823/2
Ivan Hrasko [Fri, 10 Mar 2017 14:45:13 +0000 (15:45 +0100)]
Improved unit tests for AveragingProgressTracker class

Change-Id: I079b45304d82bfc9022321a1648fbdba13409c90
Signed-off-by: Ivan Hrasko <ivan.hrasko@pantheon.tech>
7 years agoBUG-7783: increase precision of execution times 54/54854/2
Robert Varga [Wed, 12 Apr 2017 11:12:34 +0000 (13:12 +0200)]
BUG-7783: increase precision of execution times

Document the time units we are using for measuring execution
and make sure they can hold any long.

Change-Id: I859349e27604c75d426ad7c4eec9d6870b081291
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
7 years agoBug 8206: Prevent decr follower next index beyond -1 83/54783/6
Tom Pantelis [Tue, 11 Apr 2017 13:39:49 +0000 (09:39 -0400)]
Bug 8206: Prevent decr follower next index beyond -1

If a follower's next index is already -1, we shouldn't decrement it
further, i.e. -1 is the lowest allowed value. Otherwise AbstractLeader can end up
continuously decrementing and logging an info message while in the
process of sending an install snapshot.

member-3-shard-default-config (Leader): follower member-1-shard-default-config last log term 2 conflicts with the leader's 3 - dec next index to -2

Modified decrNextIndex to return a boolean if next index was decremented
which is checked by AbstractLeader.
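
A minimal sketch of the guarded decrement follows. FollowerStateSketch and
countDecrements are hypothetical names used for illustration; only
decrNextIndex and the -1 floor come from the description above.

```java
// Hypothetical stand-in for the per-follower state tracked by the leader.
class FollowerStateSketch {
    private long nextIndex;

    FollowerStateSketch(long nextIndex) {
        this.nextIndex = nextIndex;
    }

    // Decrements the next index unless it is already at the floor of -1.
    // Returns true only if the index was actually decremented.
    boolean decrNextIndex() {
        if (nextIndex < 0) {
            return false;
        }
        nextIndex--;
        return true;
    }

    long getNextIndex() {
        return nextIndex;
    }

    // Mimics a leader that reacts to repeated conflicts: it only acts (for
    // example logs a message) when the decrement really happened.
    static int countDecrements(long start, int attempts) {
        FollowerStateSketch follower = new FollowerStateSketch(start);
        int decremented = 0;
        for (int i = 0; i < attempts; i++) {
            if (follower.decrNextIndex()) {
                decremented++;
            }
        }
        return decremented;
    }
}
```

Starting from a next index of 1, five conflict attempts produce only two
real decrements (1 to 0, then 0 to -1); the remaining attempts return false
and can be skipped by the caller.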

Change-Id: I29454d4e71a7f9128b3b47f6a4e3403615c2c8d2
Signed-off-by: Tom Pantelis <tompantelis@gmail.com>
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
7 years agoFix intermittent failure in testWriteTransactionWithSingleShard 22/54822/3
Tom Pantelis [Wed, 12 Apr 2017 06:20:18 +0000 (02:20 -0400)]
Fix intermittent failure in testWriteTransactionWithSingleShard

DistributedDataStoreRemotingIntegrationTest.testWriteTransactionWithSingleShard
fails intermittently with tell-based protocol in verifyCars after it has
reinstated the follower where it's expecting just car1 but the
data tree contains car1 and car2. This is b/c the delete transaction for car2
prior to reinstatement wasn't applied on recovery due to the corresponding
ApplyJournalEntries message missing from the persisted journal. The test
expects 2 ApplyJournalEntries messages to be persisted corresponding to the
2 transactions but tell-based persists other payloads as well so there may
be 3 ApplyJournalEntries messages. I changed the code to handle this case.

Also the assertion failure in verifyCars caused it to bypass shutting down
the ActorSystem which resulted in several other failures in tests that try to
use the same port configuration due to the port already in use. So I made
changes to ensure ActorSystems are shutdown properly.

Change-Id: Id6316d71fcd9eb3e768c6b1f676fa0e9be1287a2
Signed-off-by: Tom Pantelis <tompantelis@gmail.com>
7 years agoFix intermittent failures in FollowerTest 15/54815/2 15/54815/3
Tom Pantelis [Tue, 11 Apr 2017 21:39:38 +0000 (17:39 -0400)]
Fix intermittent failures in FollowerTest

FollowerTest.testCaptureSnapshotOnLastEntryInAppendEntries:1152 Persisted journal entries size: [] expected:<1> but was:<0>

The test waits on the deletion of journal entries after the snapshot is saved
to occur and then checks the persistent journal for the remaining
ApplyJournalEntries. But occasionally the persisting of the ApplyJournalEntries
message occurs after the deletion so the assertion fails b/c the
ApplyJournalEntries wasn't persisted yet. This is a little odd b/c the
sequencing in the raft code is that the ApplyJournalEntries write is done
before the delete so it should also be observed the same way in the
InMemoryJournal, even though it doesn't really matter either way.

To alleviate the problem I added a wait for the ApplyJournalEntries
message in the journal in the 3 similar tests.

I also made a couple other minor changes that I observed while running the
tests.

Change-Id: I67cbb8fd79c91cd1cc23c363b78e7f5e9b9f2bbe
Signed-off-by: Tom Pantelis <tompantelis@gmail.com>
7 years agoBug 5280: Enable tests for ClientBackedDatastore 09/54809/3
Andrej Mak [Tue, 4 Apr 2017 07:49:54 +0000 (09:49 +0200)]
Bug 5280: Enable tests for ClientBackedDatastore

Change-Id: I33d6312c9b18493e519b8607307c21c1b3a9bc75
Signed-off-by: Andrej Mak <andrej.mak@pantheon.tech>
7 years agoMake AbstractIdentifiablePayloadTest public 06/54806/1
Tom Pantelis [Tue, 11 Apr 2017 17:24:01 +0000 (13:24 -0400)]
Make AbstractIdentifiablePayloadTest public

For some reason, tests derived from AbstractIdentifiablePayloadTest
fail b/c AbstractIdentifiablePayloadTest isn't public when running from
eclipse - runs fine from command line.

Change-Id: Ie6ed1d6e0e130a1ffc5ad04db93e037ea6a79549
Signed-off-by: Tom Pantelis <tompantelis@gmail.com>
7 years agoBUG-5280: Correct reconnect retry logic 76/54776/1
Robert Varga [Thu, 30 Mar 2017 13:14:04 +0000 (15:14 +0200)]
BUG-5280: Correct reconnect retry logic

Our reconnect logic failed to account for various timers
during resolution. This patch makes the BackendInfoResolver
explicit about the type of failures it can report and fixes
AbstractShardBackendResolver to conform to them.

Change-Id: I610ddb6e062e223557d46e2950a552de6e7d3843
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
7 years agoUpdate .gitreview to stable/carbon 39/54639/1
Anil Belur [Tue, 11 Apr 2017 01:26:29 +0000 (11:26 +1000)]
Update .gitreview to stable/carbon

Change-Id: I47a0426c3aad7f26c21bc954f098251e0e75deb0
Signed-off-by: Anil Belur <abelur@linuxfoundation.org>
7 years agoHandle odl-mdsal-common with Karaf 4.0.9 07/54607/1
Stephen Kitt [Mon, 10 Apr 2017 15:00:17 +0000 (17:00 +0200)]
Handle odl-mdsal-common with Karaf 4.0.9

Karaf 4.0.9 simplifies feature dependencies (so that dependencies
specified in feature.xml can be completed from the POM), but that
causes issues with odl-mdsal-common since features in Karaf are
identified by their name only:

* if both the controller odl-mdsal-common and the mdsal
  odl-mdsal-common are encountered in the dependency tree, whichever
  one came first (as dependencies are resolved) is the one that ends
  up being kept;
* in some circumstances, the mdsal repository replaces controller's
  even when the controller dependency is retained (this is a Karaf bug
  which I'll submit a patch for, but we can work around it).

Change-Id: I5400a829560ae96cb2f264e103020cccd1d225c3
Signed-off-by: Stephen Kitt <skitt@redhat.com>
7 years agoBUG-5280: add the concept of a recorded failure 71/54371/4
Robert Varga [Wed, 5 Apr 2017 16:36:48 +0000 (18:36 +0200)]
BUG-5280: add the concept of a recorded failure

This patch reworks LocalReadWriteProxyTransaction to be defensive
of its internal modification and introduces the concept of a delayed
recorded error (currently unused).

The defensiveness checks allow us to get rid of FailedDataTreeModification,
as we do not give out our modification at all in the codepaths which
would leak this implementation.

Change-Id: I5f91218ac308f7450a3b59252d44f953be54626c
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
7 years agoBug 7805: Add make-leader-local rpc for module based shard. 00/54100/13
Tomas Cere [Thu, 30 Mar 2017 12:44:32 +0000 (14:44 +0200)]
Bug 7805: Add make-leader-local rpc for module based shard.

CSIT testing scenarios require movement of the shard leader for module
based shards as well, so add this into ClusterAdminRpcService.

Change-Id: Ib8a310cdba728c0a42d8850703740bf4698adbe0
Signed-off-by: Tomas Cere <tcere@cisco.com>
7 years agoBug 7806 - Implement agent RPCs for shard replica manipulation testing 38/54038/9
Tomas Cere [Wed, 29 Mar 2017 10:14:33 +0000 (12:14 +0200)]
Bug 7806 - Implement agent RPCs for shard replica manipulation testing

These can be implemented as a part of ClusterAdminRpcService instead
of creating new rpcs that would be part of the lowlevel suite.

Change-Id: I891f9d3703a9357e829159691cbf18f95523d529
Signed-off-by: Tomas Cere <tcere@cisco.com>
7 years agoBUG-5280: log a message when tell-based protocol is active 20/54320/2
Robert Varga [Tue, 4 Apr 2017 16:27:34 +0000 (18:27 +0200)]
BUG-5280: log a message when tell-based protocol is active

Discerning the two access modes is critical to understanding
when failures occur. Add an explicit note when the tell-based
protocol is enabled on a data store.

Change-Id: I3e2b1d2f84a73ce1a3759d419176c47a6dd0ad12
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
7 years agoPass no op callback instead of null during replay 22/54022/5
Andrej Mak [Wed, 29 Mar 2017 09:32:25 +0000 (11:32 +0200)]
Pass no op callback instead of null during replay

Change-Id: Ife964481dc225bbc1d5b312035384f8bd597d740
Signed-off-by: Andrej Mak <andrej.mak@pantheon.tech>
7 years agoAdd AbstractIdentifiablePayload unit tests 52/54252/3
Andrej Mak [Mon, 3 Apr 2017 09:01:04 +0000 (11:01 +0200)]
Add AbstractIdentifiablePayload unit tests

Change-Id: I884f3c35d1767ed02accabc3b9a775ef9c667716
Signed-off-by: Andrej Mak <andrej.mak@pantheon.tech>
7 years agoUnit test for ClientBackedTransaction derived classes 88/53988/21
Ivan Hrasko [Tue, 28 Mar 2017 15:21:21 +0000 (17:21 +0200)]
Unit test for ClientBackedTransaction derived classes

Change-Id: I2967a0e224fc783ffac73a994def666e86a423a6
Signed-off-by: Ivan Hrasko <ivan.hrasko@pantheon.tech>
7 years agoUnit tests for ClientBackedTransactionChain class 18/54018/10
Ivan Hrasko [Wed, 29 Mar 2017 08:53:23 +0000 (10:53 +0200)]
Unit tests for ClientBackedTransactionChain class

Change-Id: I97953cfdc32619c31295cfed2584b7466d48aa5d
Signed-off-by: Ivan Hrasko <ivan.hrasko@pantheon.tech>
7 years agoAbstractClientHistory derived classes tests 43/54043/17
matus.kubica [Wed, 29 Mar 2017 14:13:57 +0000 (16:13 +0200)]
AbstractClientHistory derived classes tests

Change-Id: I1261eb764c730bbce6eb833644db99e4bbf0605c
Signed-off-by: matus.kubica <matus.kubica@pantheon.tech>
Signed-off-by: Ivan Hrasko <ivan.hrasko@pantheon.tech>
7 years agoAbstractDOMDataBroker fix annotations 02/54202/2
Jie Han [Sat, 1 Apr 2017 02:54:49 +0000 (10:54 +0800)]
AbstractDOMDataBroker fix annotations

Change-Id: I7938965d805ba3e4228e9fc5a36c75c52f6ec881
Signed-off-by: Jie Han <han.jie@zte.com.cn>
7 years agoBug 7805 - Implement agent RPCs for shard leader movement testing 01/53901/12
Tomas Cere [Mon, 27 Mar 2017 14:30:27 +0000 (16:30 +0200)]
Bug 7805 - Implement agent RPCs for shard leader movement testing

Change-Id: Ic19d1867f3c54ec22d600e9b80c6490d5a4b99bb
Signed-off-by: Tomas Cere <tcere@cisco.com>
7 years agoBUG-5280: Close client history after all histories are closed 37/54037/9
Ivan Hrasko [Wed, 29 Mar 2017 13:51:33 +0000 (15:51 +0200)]
BUG-5280: Close client history after all histories are closed

Make sure we record history state as closed once we are done with
it.

Change-Id: Icbdf947ad166b082e06df896741e618e801ecf2e
Signed-off-by: Ivan Hrasko <ivan.hrasko@pantheon.tech>
7 years agoBUG-5280: switch tests to ClientBackedDataStore 18/48718/102
Robert Varga [Mon, 12 Dec 2016 18:34:38 +0000 (19:34 +0100)]
BUG-5280: switch tests to ClientBackedDataStore

Enable integration tests to run
on the new frontend code with parameterized JUnit.

Tests that do not yet work with the new code are ignored.
For the old code all tests run and pass.

Change-Id: Ib5656ecd2333a56d5c466e633fbdd477accc4095
Signed-off-by: Robert Varga <rovarga@cisco.com>
Signed-off-by: Ivan Hrasko <ivan.hrasko@pantheon.tech>
7 years agoBUG 7801: prevent OptimisticLockFailedExceptions in write-transactions. 84/53784/7
Tomas Cere [Fri, 24 Mar 2017 10:29:16 +0000 (11:29 +0100)]
BUG 7801: prevent OptimisticLockFailedExceptions in write-transactions.

When multiple instances of this rpc are running concurrently,
we would run into an optimistic lock failure since every instance tries to
write the topmost parent list first.
When these failures happen, handle them as expected and resume with the
next stage of the rpc.
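
A minimal sketch of the handling described above, not the actual rpc code.
ParentListWriter, OptimisticLockFailure, writeParentList and ensureParent
are hypothetical names; OptimisticLockFailure merely stands in for
OptimisticLockFailedException.

```java
import java.util.concurrent.atomic.AtomicBoolean;

final class ParentListWriter {
    // Hypothetical stand-in for OptimisticLockFailedException; not the real class.
    static final class OptimisticLockFailure extends Exception {
        OptimisticLockFailure(String message) {
            super(message);
        }
    }

    private final AtomicBoolean parentWritten = new AtomicBoolean();

    // Simulates writing the topmost parent list: the first caller wins,
    // every later caller runs into a conflict.
    void writeParentList() throws OptimisticLockFailure {
        if (!parentWritten.compareAndSet(false, true)) {
            throw new OptimisticLockFailure("parent list already written");
        }
    }

    // Returns true if this caller created the parent, false if it lost the race.
    // In both cases the caller can resume with the next stage.
    boolean ensureParent() {
        try {
            writeParentList();
            return true;
        } catch (OptimisticLockFailure e) {
            // Expected when several instances run concurrently: another
            // instance already wrote the parent, so nothing is left to do.
            return false;
        }
    }

    public static void main(String[] args) {
        ParentListWriter writer = new ParentListWriter();
        System.out.println("first instance created parent: " + writer.ensureParent());
        System.out.println("second instance created parent: " + writer.ensureParent());
    }
}
```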

Change-Id: I43efaea3315b04272113eb86733e68609e434984
Signed-off-by: Tomas Cere <tcere@cisco.com>
7 years agoBug 7803: Implement agent RPCs for data tree change listener testing 78/53278/13
Tomas Cere [Thu, 9 Mar 2017 13:12:45 +0000 (14:12 +0100)]
Bug 7803: Implement agent RPCs for data tree change listener testing

Change-Id: Id2d53d3765fb9d518d4b052792d716d2b2b4c976
Signed-off-by: Tomas Cere <tcere@cisco.com>
7 years agoBug 7804: Implement agent RPCs for DOMDataTreeListener testing 63/53563/13
Tomas Cere [Mon, 27 Mar 2017 11:17:11 +0000 (13:17 +0200)]
Bug 7804: Implement agent RPCs for DOMDataTreeListener testing

Change-Id: I9e57e169fc3151a12914b2f370e0c97f41395992
Signed-off-by: Tomas Cere <tcere@cisco.com>
7 years agoBUG 7802: split out shard creation from produce transactions 95/53895/10
Tomas Cere [Mon, 27 Mar 2017 11:16:41 +0000 (13:16 +0200)]
BUG 7802: split out shard creation from produce transactions

Change-Id: I33fa46791a6c80477f57badf3bd44c3d6c5a2f9e
Signed-off-by: Tomas Cere <tcere@cisco.com>
7 years agoBug 7802 : Implement agent RPCs for transaction producer testing 78/53478/13
Tomas Cere [Fri, 17 Mar 2017 09:40:38 +0000 (10:40 +0100)]
Bug 7802 : Implement agent RPCs for transaction producer testing

Change-Id: I56d89093bd292032f92cdc98f25056822d93e628
Signed-off-by: Tomas Cere <tcere@cisco.com>
7 years agoBug 7407 - CDS: allow applications to request Leader movement 43/53543/27
Jakub Morvay [Mon, 6 Mar 2017 12:54:31 +0000 (13:54 +0100)]
Bug 7407 - CDS: allow applications to request Leader movement

This patch provides the routing from cds-dom-api CDSShardAccess
to the backend RaftActor.

Change-Id: I9fa315034d95a1896393a6152147a7bc50829b2a
Signed-off-by: Jakub Morvay <jmorvay@cisco.com>
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
7 years agoBug 7407 - Add request leadership functionality to shards 42/53542/21
Jakub Morvay [Mon, 20 Mar 2017 08:58:11 +0000 (09:58 +0100)]
Bug 7407 - Add request leadership functionality to shards

This adds a new MakeLeaderLocal message to the Shard class API.
A MakeLeaderLocal message is sent to a local shard replica to request
that the shard leader be moved to the local node. The local shard will
contact the current leader with a RequestLeadership message to initiate
leadership transfer to itself. The original sender of the MakeLeaderLocal
message will be notified about the result of this operation.

Change-Id: I2b0ee7caf772457e31250d1bdddd5fc77b16fc53
Signed-off-by: Jakub Morvay <jmorvay@cisco.com>
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
7 years agoDOMDataBrokerTransactionChainImpl fix parameter name in annotation 01/54201/2
Jie Han [Sat, 1 Apr 2017 01:18:02 +0000 (09:18 +0800)]
DOMDataBrokerTransactionChainImpl fix parameter name in annotation

Change-Id: I8c229c57490f6d78fc46e6c2a7db745d9adc80ad
Signed-off-by: Jie Han <jeong_hyun@msn.com>
7 years agoClarify javadocs related to ProgressTracker 47/53147/3
Vratko Polak [Fri, 10 Mar 2017 15:17:15 +0000 (16:17 +0100)]
Clarify javadocs related to ProgressTracker

Change-Id: Ie208037ec2759d15c4eff86315389968e76c07bc
Signed-off-by: Vratko Polak <vrpolak@cisco.com>
7 years agoBUG-5280: make sure we have metadata for standalone history 88/54088/3
Robert Varga [Thu, 30 Mar 2017 08:45:50 +0000 (10:45 +0200)]
BUG-5280: make sure we have metadata for standalone history

With metadata propagation in place, we get a ton of warnings
in the form of:

Unknown history for aborted transaction member-2-datastore-config-fe-0-txn-1261-0, ignoring

which is indicative of our failure to populate metadata builder
with the history for standalone transactions. This patch fixes
that and adds recovery for the case when we fail to find the
history in recovered journal.

Change-Id: I338666dbd910ec683a44a814deed7382eb255218
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
7 years agoBUG-5222: remove xsql from archetype 57/54157/2
Robert Varga [Fri, 31 Mar 2017 09:05:27 +0000 (11:05 +0200)]
BUG-5222: remove xsql from archetype

XSQL should not be here, kill it.

Change-Id: I68bafa8961598f3407763661c1c3a294c6209774
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
7 years agoBUG-2138: Create blueprint wiring for cds shard manager. 95/48795/59
Jakub Morvay [Fri, 3 Mar 2017 11:32:17 +0000 (12:32 +0100)]
BUG-2138: Create blueprint wiring for cds shard manager.

Change-Id: I504c294db111944c8e2047e58c3e1ef1aa81aee8
Signed-off-by: Jakub Morvay <jmorvay@cisco.com>
7 years agoBUG 2138 - Do not fail on module-based default shard 34/53734/14
Jakub Morvay [Thu, 23 Mar 2017 13:56:14 +0000 (14:56 +0100)]
BUG 2138 - Do not fail on module-based default shard

Currently, DistributedShardedDOMDataTree will try to create default
shards on its start. However, this can collide with module-based default
shards. If present in modules.conf, module-based default shards will be
created on DistributedDatastore's start.

If already present, do not create default shards. Create just
DistributedShardFrontend for them.

Change-Id: I05857f520e3467116e8748e6ae231ab9dc39f44c
Signed-off-by: Jakub Morvay <jmorvay@cisco.com>
7 years agoBUG-7965 Switch distributed-data backend to a separate shard 09/50609/46
Tomas Cere [Tue, 10 Jan 2017 11:44:57 +0000 (12:44 +0100)]
BUG-7965 Switch distributed-data backend to a separate shard

The shard needs to be present on all nodes and replicated across
the cluster. Making this into a shard allows us to leverage the current
datastore APIs and also persistence, so we have the sharding layout
persisted.

The shard is started on all nodes once DistributedShardedDOMDataTree is
created.

Change-Id: I697be9b7134a27720e23e3e56f9fddc71301ec1e
Signed-off-by: Tomas Cere <tcere@cisco.com>
7 years agoClosedTransactionException unit test 94/54094/1
matus.kubica [Thu, 30 Mar 2017 11:08:36 +0000 (13:08 +0200)]
ClosedTransactionException unit test

Change-Id: Ib86fe2e8e6ece0c5cc41efdcae02223cf94d362b
Signed-off-by: matus.kubica <matus.kubica@pantheon.tech>
7 years agoUnit tests for ClientBackedDataStore class 82/53982/4
Ivan Hrasko [Tue, 28 Mar 2017 13:56:44 +0000 (15:56 +0200)]
Unit tests for ClientBackedDataStore class

Change-Id: Ieba1004283905b82730b1ee23c2afeb4eb98f963
Signed-off-by: Ivan Hrasko <ivan.hrasko@pantheon.tech>
7 years agoFailedDataTreeModification JUnit test 01/53601/3
matus.kubica [Tue, 21 Mar 2017 11:35:40 +0000 (12:35 +0100)]
FailedDataTreeModification JUnit test

Change-Id: Icac1405b8a17ce18119fbefb55e71898dc90082f
Signed-off-by: matus.kubica <matus.kubica@pantheon.tech>
7 years agoBindingDOMRpcImplementationAdapter code clean-up 70/53970/3
Martin Ciglan [Tue, 28 Mar 2017 11:01:41 +0000 (13:01 +0200)]
BindingDOMRpcImplementationAdapter code clean-up

- warnings
- package-private access
- lambda expression
- typo
- white-spaces
- not used method parameter

Change-Id: I495e38037379f43553d723706ef87c1fb967aff6
Signed-off-by: Martin Ciglan <mciglan@cisco.com>
7 years agoBUG-5280: make sure we arm the request timer 00/53900/4
Robert Varga [Mon, 27 Mar 2017 14:19:25 +0000 (16:19 +0200)]
BUG-5280: make sure we arm the request timer

The timer which is supposed to time out requests and detect
overall badness of the backend was not being armed. Fix that
by scheduling it whenever we make the queue non-empty.
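
A minimal sketch of the idea, not the actual connection code. All names
(RequestQueueSketch, armTimer, enqueue, complete) are hypothetical, and
disarming the timer once the queue drains is an assumption made here for
illustration; a real implementation would schedule a timeout in armTimer().

```java
import java.util.ArrayDeque;
import java.util.Queue;

final class RequestQueueSketch {
    private final Queue<String> pending = new ArrayDeque<>();
    private boolean timerArmed;
    private int armCount;

    // Hypothetical hook: a real implementation would schedule a timeout here.
    private void armTimer() {
        timerArmed = true;
        armCount++;
    }

    void enqueue(String request) {
        boolean wasEmpty = pending.isEmpty();
        pending.add(request);
        if (wasEmpty) {
            // The queue just went from empty to non-empty: arm the timer.
            armTimer();
        }
    }

    // Removes the oldest request; assumption: the timer is disarmed when
    // nothing is left to time out.
    String complete() {
        String done = pending.poll();
        if (pending.isEmpty()) {
            timerArmed = false;
        }
        return done;
    }

    boolean isTimerArmed() {
        return timerArmed;
    }

    int armCount() {
        return armCount;
    }

    public static void main(String[] args) {
        RequestQueueSketch queue = new RequestQueueSketch();
        queue.enqueue("req-1");
        queue.enqueue("req-2");
        System.out.println("armed=" + queue.isTimerArmed() + " armCount=" + queue.armCount());
    }
}
```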

Change-Id: I9d8be694e3ed5154b66baca76c0788840a38c2f7
Signed-off-by: Robert Varga <rovarga@cisco.com>
7 years agoJUnit test for ModuleShardBackendResolver 96/53596/5
matus.kubica [Tue, 21 Mar 2017 10:38:11 +0000 (11:38 +0100)]
JUnit test for ModuleShardBackendResolver

Change-Id: I1fd7b77873d56f02eb024e27f2bcd4e42ff7c10d
Signed-off-by: matus.kubica <matus.kubica@pantheon.tech>
Signed-off-by: Ivan Hrasko <ivan.hrasko@pantheon.tech>
7 years agoRework CDS commit cohort impl to handle yang lists 84/51584/11
Tom Pantelis [Wed, 8 Feb 2017 18:52:12 +0000 (13:52 -0500)]
Rework CDS commit cohort impl to handle yang lists

If a cohort registers for yang list entries, it works fine if the
transaction only contains a write or delete of one list entry.
However the DataTreeCohortActor throws an UnsupportedOperationException
from CohortBehaviour#handle if more than one list entry is written. In
that case multiple CanCommit messages are sent to the DataTreeCohortActor,
a DOMDataTreeCandidate for each entry, but the CohortBehaviour is set up
to only handle one message, after which it expects to transition to the
PostCanCommit step.

It seems the DOMDataTreeCommitCohort#canCommit API really should take
a collection of DOMDataTreeCandidates. However in lieu of an API change,
I modified the CanCommit message to contain a collection of
DOMDataTreeCandidates. The DataTreeCohortActor invokes canCommit for
each one and uses the last PostCanCommitStep returned. This *may*
be OK although there doesn't seem to be an alternative at this point.
We probably should note this behavior in the DOMDataTreeCommitCohort
API.
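
A minimal sketch of the "invoke canCommit for each candidate, keep the last
step" behavior, using generic placeholders instead of the real
DOMDataTreeCandidate and PostCanCommitStep types. CanCommitSketch and
canCommitAll are hypothetical names; rejecting an empty collection is an
assumption of this sketch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

final class CanCommitSketch {
    // Invokes canCommit once per candidate and returns only the step produced
    // for the last candidate; steps from earlier candidates are dropped.
    static <C, S> S canCommitAll(List<C> candidates, Function<C, S> canCommit) {
        if (candidates.isEmpty()) {
            throw new IllegalArgumentException("at least one candidate is required");
        }
        S last = null;
        for (C candidate : candidates) {
            last = canCommit.apply(candidate);
        }
        return last;
    }

    public static void main(String[] args) {
        List<String> candidates = Arrays.asList("entry-1", "entry-2", "entry-3");
        String step = canCommitAll(candidates, candidate -> "step-for-" + candidate);
        System.out.println(step);
    }
}
```

Note that the earlier steps are silently discarded, which is the caveat the
message above raises.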

Change-Id: I17c4d2f477ffc6c6c3921217e5f6c13bcdafde8f
Signed-off-by: Tom Pantelis <tompantelis@gmail.com>
7 years agoBUG-8056: make doCommit/finishCommit package-private 38/53738/2
Robert Varga [Thu, 23 Mar 2017 14:46:16 +0000 (15:46 +0100)]
BUG-8056: make doCommit/finishCommit package-private

This is not a complete fix for the issue, but it eliminates
the need for synthetic accessor methods:

at org.opendaylight.controller.cluster.datastore.ShardCommitCoordinator.doCommit(ShardCommitCoordinator.java:296)
at org.opendaylight.controller.cluster.datastore.ShardCommitCoordinator.access$200(ShardCommitCoordinator.java:49)
at org.opendaylight.controller.cluster.datastore.ShardCommitCoordinator$2.onSuccess(ShardCommitCoordinator.java:243)

at org.opendaylight.controller.cluster.datastore.ShardCommitCoordinator.finishCommit(ShardCommitCoordinator.java:316)
at org.opendaylight.controller.cluster.datastore.ShardCommitCoordinator.access$400(ShardCommitCoordinator.java:49)
at org.opendaylight.controller.cluster.datastore.ShardCommitCoordinator$3.onSuccess(ShardCommitCoordinator.java:299)

Leading to a leaner stack.
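
For illustration, a sketch of where the access$200 and access$400 frames come
from, using hypothetical names (CoordinatorSketch, Callback) rather than the
actual ShardCommitCoordinator code. With javac versions before Java 11, a
private method called from an anonymous inner class is reached through a
compiler-generated synthetic accessor, whereas a package-private method is
called directly.

```java
final class CoordinatorSketch {
    interface Callback {
        String onSuccess();
    }

    // Private: when an inner class calls this, pre-Java 11 compilers emit a
    // synthetic access$NNN method, which shows up as an extra stack frame.
    private String doCommitPrivate() {
        return "committed";
    }

    // Package-private: the inner class can invoke this directly.
    String doCommit() {
        return "committed";
    }

    Callback viaPrivate() {
        return new Callback() {
            @Override
            public String onSuccess() {
                return doCommitPrivate();
            }
        };
    }

    Callback viaPackagePrivate() {
        return new Callback() {
            @Override
            public String onSuccess() {
                return doCommit();
            }
        };
    }

    public static void main(String[] args) {
        CoordinatorSketch coordinator = new CoordinatorSketch();
        System.out.println(coordinator.viaPrivate().onSuccess());
        System.out.println(coordinator.viaPackagePrivate().onSuccess());
    }
}
```

Both variants behave the same; only the call path differs.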

Change-Id: I825da37f91749016a4d4e64e7bfb75f03f9b450b
Signed-off-by: Robert Varga <rovarga@cisco.com>
7 years agoBUG-8073: Improve handling of temporary files 90/53790/6
Robert Varga [Fri, 24 Mar 2017 10:54:44 +0000 (11:54 +0100)]
BUG-8073: Improve handling of temporary files

This patch reworks the logic so we end up with atomic move operations
and non-overlapping file names.
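
A minimal sketch of the general technique using only java.nio.file, not the
code from this patch. AtomicFileWriteSketch and writeAtomically are
hypothetical names. The temporary file is created next to the target so the
move stays on one file system.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class AtomicFileWriteSketch {
    static void writeAtomically(Path target, byte[] content) throws IOException {
        Path dir = target.toAbsolutePath().getParent();
        // createTempFile picks a fresh unique name, so concurrent writers
        // never share a temporary file.
        Path tmp = Files.createTempFile(dir, target.getFileName().toString(), ".tmp");
        try {
            Files.write(tmp, content);
            // Readers see either the old state or the complete new file.
            // Whether an existing target is replaced is platform-specific.
            Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
        } finally {
            // No-op after a successful move; cleans up if anything failed.
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("atomic-sketch");
        Path target = dir.resolve("snapshot.bin");
        writeAtomically(target, "payload".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(Files.readAllBytes(target), StandardCharsets.UTF_8));
    }
}
```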

Change-Id: I4383baf664e51d8e6acfaf51f9dc5f62d77f5c14
Signed-off-by: Robert Varga <rovarga@cisco.com>
7 years agoAdd SerializationUtils unit test 54/53654/2
Andrej Mak [Wed, 22 Mar 2017 06:23:26 +0000 (07:23 +0100)]
Add SerializationUtils unit test

Change-Id: I7e8533c8c54c6d2cab234e9ad7db6037a97bdbdc
Signed-off-by: Andrej Mak <andrej.mak@pantheon.tech>
7 years agoAdd case for READY in RemoteProxyTransaction 12/53712/4
Andrej Mak [Thu, 23 Mar 2017 10:01:47 +0000 (11:01 +0100)]
Add case for READY in RemoteProxyTransaction

Handling a forwarded ModifyTransactionRequest with the READY
protocol shouldn't cause a failure, so add a no-op case.
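
A rough sketch of what such a no-op case can look like. The Protocol enum and
its values, as well as ReadyProtocolSketch and handle, are hypothetical and
only illustrate the shape of the switch; they are not the real
RemoteProxyTransaction code.

```java
final class ReadyProtocolSketch {
    // Hypothetical protocol values, for illustration only.
    enum Protocol { ABORT, SIMPLE, THREE_PHASE, READY }

    // Returns a description of the action taken for the given protocol.
    static String handle(Protocol protocol) {
        switch (protocol) {
            case ABORT:
                return "abort";
            case SIMPLE:
                return "direct commit";
            case THREE_PHASE:
                return "coordinated commit";
            case READY:
                // Nothing to do: without this case the request would fall
                // through to the failure branch below.
                return "no-op";
            default:
                throw new IllegalArgumentException("Unhandled protocol " + protocol);
        }
    }

    public static void main(String[] args) {
        System.out.println(handle(Protocol.READY));
    }
}
```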

Change-Id: Id8d69b49171588323ccd947b53f16576f57cb156
Signed-off-by: Andrej Mak <andrej.mak@pantheon.tech>
7 years agoBUG-8027: do not break actor encapsulation 47/53747/2
Robert Varga [Thu, 23 Mar 2017 17:10:58 +0000 (18:10 +0100)]
BUG-8027: do not break actor encapsulation

Invoking abort() from ShardTransaction means we are executing code
from one actor in the context of another one and since the code path
involves persistence, this breaks Akka rather thoroughly.

Introduce a dedicated method for the required upcall and send a request
to persist separately.

Change-Id: Ic994b5e5963e8c602844e283f34df8bfa3726705
Signed-off-by: Robert Varga <rovarga@cisco.com>
7 years agoBUG-5280: fix NPE during transaction purge 31/53731/2
Robert Varga [Thu, 23 Mar 2017 13:13:44 +0000 (14:13 +0100)]
BUG-5280: fix NPE during transaction purge

Read/write transactions which transition to ready state
throw away their open transaction, which causes the following
exception:

Shard - member-1-shard-people-testTransactionChainWithMultipleShards: request Envelope{sessionId=1, txSequence=11, message=TransactionPurgeRequest{target=member-2-datastore-testTransactionChainWithMultipleShards-fe-0-chn-1-txn-2-2, sequence=2, replyTo=Actor[akka://cluster-test@127.0.0.1:2559/user/$a#-493460599]}} caused failure
java.lang.NullPointerException
 at org.opendaylight.controller.cluster.datastore.FrontendReadWriteTransaction.purge(FrontendReadWriteTransaction.java:113)
 at org.opendaylight.controller.cluster.datastore.AbstractFrontendHistory.handleTransactionRequest(AbstractFrontendHistory.java:114)
 at org.opendaylight.controller.cluster.datastore.LeaderFrontendState.handleTransactionRequest(LeaderFrontendState.java:197)
 at org.opendaylight.controller.cluster.datastore.Shard.handleRequest(Shard.java:413)
 at org.opendaylight.controller.cluster.datastore.Shard.handleNonRaftCommand(Shard.java:277)

Rework purge logic to talk directly to the data tree, which
prevents this from happening and simplifies the code a bit.

Change-Id: I7cc08687648d2473a712c171944a06307e4d8f9f
Signed-off-by: Robert Varga <rovarga@cisco.com>
7 years agoAdd QuarantinedMonitorActor unit test 13/53613/4
Andrej Mak [Tue, 21 Mar 2017 14:08:38 +0000 (15:08 +0100)]
Add QuarantinedMonitorActor unit test

Change-Id: I4e007dd15b1a632b8204812a49a6149615901af4
Signed-off-by: Andrej Mak <andrej.mak@pantheon.tech>
7 years agoAdd RoleChangeNotifier unit test 10/53610/2
Andrej Mak [Tue, 21 Mar 2017 13:03:22 +0000 (14:03 +0100)]
Add RoleChangeNotifier unit test

Change-Id: Ib2082c15fb18094e22a5800265150044d4896ee3
Signed-off-by: Andrej Mak <andrej.mak@pantheon.tech>
7 years agoBug 8015, Bug 7800: Do not block when publishing notifications 76/53476/5
Vratko Polak [Fri, 17 Mar 2017 14:07:03 +0000 (15:07 +0100)]
Bug 8015, Bug 7800: Do not block when publishing notifications

+ Yang model edited.
+ check-publish-notifications implemented.

Change-Id: I757269a61bb819d2abcb07f6106b5e2ed7a34dec
Signed-off-by: Vratko Polak <vrpolak@cisco.com>
Signed-off-by: Tomas Cere <tcere@cisco.com>
7 years agoBug 7814: Add counter to make tx actor names unique 46/53646/1
Tom Pantelis [Wed, 22 Mar 2017 03:21:37 +0000 (23:21 -0400)]
Bug 7814: Add counter to make tx actor names unique

Appended an incrementing counter value to the actor name which
will guarantee uniqueness.
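
A minimal sketch of the approach, assuming a process-wide counter. The names
ActorNameSketch and uniqueName and the "-" separator are hypothetical; the
actual name format used by the patch is not shown here.

```java
import java.util.concurrent.atomic.AtomicLong;

final class ActorNameSketch {
    // Shared counter; incrementAndGet() is atomic, so concurrent callers
    // never receive the same value.
    private static final AtomicLong COUNTER = new AtomicLong();

    static String uniqueName(String baseName) {
        return baseName + "-" + COUNTER.incrementAndGet();
    }

    public static void main(String[] args) {
        System.out.println(uniqueName("shard-tx"));
        System.out.println(uniqueName("shard-tx"));
    }
}
```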

Change-Id: I0f36c4b96598c6035071ee2becb73ca9b18fee45
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
7 years agoMove forwardToRemote() to LocalProxyTransaction 40/53540/2
Andrej Mak [Mon, 20 Mar 2017 08:04:45 +0000 (09:04 +0100)]
Move forwardToRemote() to LocalProxyTransaction

Method has the same body in both implementations, so
it can be moved to the parent class.

Change-Id: I25f7cb99cc3727f0cbb8da9e59343a663d776e11
Signed-off-by: Andrej Mak <andrej.mak@pantheon.tech>
7 years agoBUG-5280: make sure we propagate frontend metadata 63/49263/30
Robert Varga [Mon, 12 Dec 2016 17:59:23 +0000 (18:59 +0100)]
BUG-5280: make sure we propagate frontend metadata

This fixes an omission in initial metadata drop done as part
of I7e2c6755c3389dcb5284f17a9c6076fb9e7ac95e by registering
frontend metadata.

Change-Id: Iba85849333693484bd1870dc54d183ccc464a7ef
Signed-off-by: Robert Varga <rovarga@cisco.com>
7 years agoAdd AbstractProxyTransaction derived classes tests 89/53589/3
Andrej Mak [Tue, 21 Mar 2017 07:05:34 +0000 (08:05 +0100)]
Add AbstractProxyTransaction derived classes tests

Change-Id: Ie78c9213b9ca9a41066b34463557b6feb6f8b18d
Signed-off-by: Andrej Mak <andrej.mak@pantheon.tech>
7 years agoBug 6787: FeatureConfigPusher confusing WARN log removed 41/53341/5
Michael Vorburger [Wed, 15 Mar 2017 12:33:09 +0000 (13:33 +0100)]
Bug 6787: FeatureConfigPusher confusing WARN log removed

See analysis in bug: Current code already has a retry loop, and the
operation eventually succeeds; the ConcurrentModificationException
logged only created confusion and added no real value.

The code has been changed to log the last exception at ERROR level if and
only if the operation still fails after N retries.

PS: We should create a generic utility helper for this ("Retryer").
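
One possible shape for such a helper, as a sketch only. RetryerSketch and
retry are hypothetical names and this is not an existing utility:
intermediate failures are remembered but not logged, and only the last one is
surfaced once all attempts are used up, so the caller can log it at ERROR.

```java
import java.util.concurrent.Callable;

final class RetryerSketch {
    // Runs the operation up to maxAttempts times and returns its first
    // successful result. If every attempt fails, rethrows the last exception.
    static <T> T retry(int maxAttempts, Callable<T> operation) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return operation.call();
            } catch (Exception e) {
                // Remember the failure, but stay quiet: it may still succeed.
                last = e;
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        String result = retry(3, () -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("transient failure " + calls[0]);
            }
            return "pushed after " + calls[0] + " attempts";
        });
        System.out.println(result);
    }
}
```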

Change-Id: I3c116e77f5a94366da15cc659b3a63d3a6e79f18
Signed-off-by: Michael Vorburger <vorburger@redhat.com>
7 years agoBUG-5280: expand design documentation 88/38588/4
Robert Varga [Mon, 9 May 2016 09:44:52 +0000 (11:44 +0200)]
BUG-5280: expand design documentation

Add some more documentation on how actors communicate on the backend.

Change-Id: I1d4e39d1cff508ed0ef10901e8a09fd8d89580d1
Signed-off-by: Robert Varga <rovarga@cisco.com>
7 years agoAdd AbstractTransactionCommitCohort unit tests 98/53398/9
Andrej Mak [Thu, 16 Mar 2017 12:24:32 +0000 (13:24 +0100)]
Add AbstractTransactionCommitCohort unit tests

Change-Id: I18036259e022bfb3d027c82757a8c840cebb8ded
Signed-off-by: Andrej Mak <andrej.mak@pantheon.tech>
7 years agoSeal only modified modifications 15/53515/2
Jakub Morvay [Sat, 18 Mar 2017 08:44:48 +0000 (09:44 +0100)]
Seal only modified modifications

Change-Id: I839eaebbf367a44b17595070fdea76d1f879f204
Signed-off-by: Jakub Morvay <jmorvay@cisco.com>
7 years agoBUG-5280: add frontend state lifecycle 65/49265/36
Robert Varga [Mon, 12 Dec 2016 18:34:38 +0000 (19:34 +0100)]
BUG-5280: add frontend state lifecycle

When transitioning between roles we need to take care of proper
handling of state known about frontend. This patch adds
the leader/non-leader transitions, creating the state
from FrontendMetadata and forgetting it when transactions are
committed.

Our replicated log needs to grow more entries to accurately
replicate the state of the conversation between the frontend
and backend, so if a member becomes the leader it has
an understanding of which transactions and transaction
chains have been completed (aborted, committed, purged). These
are replicated before a response is sent to the frontend, so
if a leader fails before they replicate successfully, the frontend
will see them as a timeout and retry them (and be routed to the
new leader).

Both leader and followers are expected to keep the metadata
handy: the leader keeps it for the purpose of being able to generate
a summarized snapshot. The followers keep it so their metadata
view is consistent with the contents of the data tree.

Change-Id: I72eea91ee84716cdd8a6a3521b42cca9a9393aff
Signed-off-by: Robert Varga <rovarga@cisco.com>
7 years agoBUG-2138: Use correct actor context in shard lookup. 38/49738/25
Jakub Morvay [Wed, 8 Mar 2017 17:32:43 +0000 (18:32 +0100)]
BUG-2138: Use correct actor context in shard lookup.

Fix a typo: we cannot have all lookups routed into
config.

Change-Id: I708787d7e5e6136b6a22d6f402071702a6de412b
Signed-off-by: Tomas Cere <tcere@cisco.com>
Signed-off-by: Jakub Morvay <jmorvay@cisco.com>
7 years agoBUG-2138: Fix shard registration with ProxyProducers. 63/49663/26
Tomas Cere [Tue, 20 Dec 2016 16:21:55 +0000 (17:21 +0100)]
BUG-2138: Fix shard registration with ProxyProducers.

Change-Id: I42f8f3cfaf9c0ef20b247abff2bec966ce5eeaa4
Signed-off-by: Tomas Cere <tcere@cisco.com>
7 years agoBug 7801 - Implement agent RPCs for transaction writer testing 07/53007/8
Tomas Cere [Mon, 6 Mar 2017 15:11:26 +0000 (16:11 +0100)]
Bug 7801 - Implement agent RPCs for transaction writer testing

Change-Id: I75e62deb62f39869be07fcb82f3faee53f337a7d
Signed-off-by: Tomas Cere <tcere@cisco.com>
7 years agoUnit test for RequestSuccess.java and derived classes 35/53135/8
miroslav.kovac [Fri, 10 Mar 2017 11:57:15 +0000 (12:57 +0100)]
Unit test for RequestSuccess.java and derived classes

Change-Id: Ic8998c07ae714f862e18bddccda790fc8c90ee9e
Signed-off-by: miroslav.kovac <miroslav.kovac@pantheon.tech>
7 years agoBUG-2138: DistributedShardListeners support for nested shards 89/49189/39
Tomas Cere [Thu, 8 Dec 2016 10:54:07 +0000 (11:54 +0100)]
BUG-2138: DistributedShardListeners support for nested shards

Adds support for listeners in shards that have one or more subshards;
these listeners reassemble notifications received from the subshards.

Change-Id: Icc7dfb971731d78c306a87335e54668f3bbc133e
Signed-off-by: Tomas Cere <tcere@cisco.com>
Signed-off-by: Robert Varga <rovarga@cisco.com>
7 years agoBUG-5280: Add use-tell-based-protocol config knob 97/53397/2
Robert Varga [Wed, 15 Mar 2017 10:44:04 +0000 (11:44 +0100)]
BUG-5280: Add use-tell-based-protocol config knob

The configuration knob was not documented in the corresponding configuration
file. Add it with a short explanation.

Change-Id: Ie0e866c9cf98a39568051705bbf0b10b9feaf582
Signed-off-by: Robert Varga <rovarga@cisco.com>