controller.git
7 years ago Change InstallSnapshot and reply to use Externalizable Proxy 38/42638/4
Tom Pantelis [Tue, 26 Jul 2016 22:36:06 +0000 (18:36 -0400)]
Change InstallSnapshot and reply to use Externalizable Proxy

This makes InstallSnapshot cleaner with no public no-arg constructor.

I also removed the InstallSnapshot protobuf message. In addition,
SerializableUtils is no longer needed as there are no more protobuf
messages.

Change-Id: I17aa4f7195cf09b798daee5587bbf50ccbc4bff0
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
7 years ago BUG-6111: fix a thinko 28/43628/1
Robert Varga [Wed, 10 Aug 2016 09:24:31 +0000 (11:24 +0200)]
BUG-6111: fix a thinko

Failure to initialize isOpen leads to the codepath never being
triggered.

Change-Id: I20f1b76c9ada581edc1c92c61447fd97d0d1b2ea
Signed-off-by: Robert Varga <rovarga@cisco.com>
(cherry picked from commit 01ca6a43914fb8dd27a24da7476b835fc2570e40)

7 years ago Move ServerConfigurationPayload to cluster.raft.persisted 66/43266/7
Robert Varga [Fri, 5 Aug 2016 17:44:14 +0000 (19:44 +0200)]
Move ServerConfigurationPayload to cluster.raft.persisted

This introduces its mirror copy and modifies the old class
so that it readResolve()s to the new class. It also adjusts
all users to use the new class.

The new class uses the Externalizable proxy pattern to allow the
class itself to be evolved without breaking compatibility. Also
NoOpPayload is retrofitted this way, which makes all subclasses
of Payload not have their serialization format tied to Payload
itself.

Change-Id: I26010a9e1438dbc4cb1822e1c4dbb51e2b6e538e
Signed-off-by: Robert Varga <rovarga@cisco.com>
7 years ago Bug 6278: Switch to use odlparent's karaf-parent 44/43144/3
Ryan Goulding [Thu, 4 Aug 2016 09:24:19 +0000 (05:24 -0400)]
Bug 6278: Switch to use odlparent's karaf-parent

Switch archetypes so they lay down the correct karaf-parent implementation.

Change-Id: Ib9aa057f45141579a0d2bc895d373bd2882f975c
Signed-off-by: Ryan Goulding <ryandgoulding@gmail.com>
7 years ago Bump ietf versions to ...10-SNAPSHOT 40/43540/2
Thanh Ha [Tue, 9 Aug 2016 17:43:50 +0000 (13:43 -0400)]
Bump ietf versions to ...10-SNAPSHOT

Bump versions according to:
https://lists.opendaylight.org/pipermail/release/2016-August/007731.html

Change-Id: I3038eaf68217d131bf867b1fcb2abb3abbf76663
Signed-off-by: Thanh Ha <thanh.ha@linuxfoundation.org>
7 years ago Bug 6348 : car:stop-stress-test RPC to return success & failure counters 59/43259/5
Sai MarapaReddy [Tue, 19 Jul 2016 20:49:21 +0000 (13:49 -0700)]
Bug 6348 : car:stop-stress-test RPC to return success & failure counters

The current car:stop-stress-test RPC doesn't return how many
cars were created or failed. Adding success and failure counters
helps the user determine the number of cars created or failed
during a test run started with car:stress-test. This patch
enhances the car:stop-stress-test RPC accordingly.

Change-Id: Iff054c8210ce49f06b4fa96ca5a437d9b82deddb
Signed-off-by: Sai MarapaReddy <sai.marapareddy@gmail.com>
Author: Sai MarapaReddy <sai.marapareddy@gmail.com>

7 years ago Fix ietf-yang-types version 68/43468/1
Thanh Ha [Tue, 9 Aug 2016 04:54:26 +0000 (00:54 -0400)]
Fix ietf-yang-types version

The version bump script messed up bumping the ietf version. Fixing it
with this patch.

Change-Id: I9b3975582a3a2fe35ab5f4509b6d3a70bf5ca243
Signed-off-by: Thanh Ha <thanh.ha@linuxfoundation.org>
7 years ago Fix relative paths for mdsal-it-parent 64/43464/1
Anil Belur [Tue, 9 Aug 2016 01:39:55 +0000 (11:39 +1000)]
Fix relative paths for mdsal-it-parent

Change-Id: Id321a86500216a8cf0ebe5ce7ebb698717996c7c
Signed-off-by: Anil Belur <abelur@linuxfoundation.org>
7 years ago Bump versions by 0.1.0 for next dev cycle 09/43409/1
Thanh Ha [Mon, 8 Aug 2016 21:50:13 +0000 (17:50 -0400)]
Bump versions by 0.1.0 for next dev cycle

Change-Id: I9c4a4b1e6d0e101392fb19dc68f814e30de4fa5c
Signed-off-by: Thanh Ha <thanh.ha@linuxfoundation.org>
7 years ago BUG-6111: implement PingPongTransactionChain cancelation 58/43058/3
Robert Varga [Wed, 3 Aug 2016 15:15:57 +0000 (17:15 +0200)]
BUG-6111: implement PingPongTransactionChain cancelation

This patch implements transaction cancelation in PingPongDataBroker,
which has slightly different semantics -- if a transaction is canceled
while being in a batch, proper isolation of the batch is maintained
and after preceding batch completes, the transaction chain is aborted.

Since there is no transaction isolation within a batch, this is the
only course of action we can take.

Change-Id: I0058503165dbfba8748a17a9ef9272265f4bc1c9
Signed-off-by: Robert Varga <rovarga@cisco.com>
7 years ago Add PMD exclusion for config-generated files 42/43242/2
Robert Varga [Fri, 5 Aug 2016 15:09:31 +0000 (17:09 +0200)]
Add PMD exclusion for config-generated files

Unfortunately PMD does not support wildcard
root exclusions, hence we have to match odlparent
configuration and extend it.

Change-Id: I4bc7a1b8c25b75cb5b348fb2a16f0e5b2c111359
Signed-off-by: Robert Varga <rovarga@cisco.com>
7 years ago Bug 5504: Add PreLeader raft state 28/42728/4
Tom Pantelis [Wed, 27 Jul 2016 19:52:53 +0000 (15:52 -0400)]
Bug 5504: Add PreLeader raft state

The following scenario can result in a "store tree and candidate base
differ" IllegalStateException on commit:

A follower receives a replicate and adds it to the log, say at index 1,
but the leader transfers or dies before committing and applying it to the
state. The follower becomes leader and when the next tx is applied, log
index 2, it has to first apply all log entries from the previous term that
hadn't been committed yet, in this case index 1. Since we got consensus for
index 2 that means index 1 has also been replicated to a majority. Therefore
ApplyState is sent for index 1 and then index 2. However index 1 is applied
as a "foreign" candidate while index 2 is in the pre-commit state. When
index 2 is applied the commit fails.

To prevent this scenario, we introduce a new raft state, PreLeader,
which is transitioned to from Candidate if there are uncommitted
entries, ie commit index < last log index. The PreLeader state performs all
the duties of Leader with the added behavior of attempting to commit all
uncommitted entries from the previous leader's term. Raft does not allow a
leader to commit entries from a previous term by simply counting replicas -
only entries from the leader's current term can be committed (§5.4.2). Rather
than waiting for a client interaction to commit a new entry, the PreLeader
state immediately appends a no-op entry (NoopPayload) to the log with the
leader's current term. Once the no-op entry is committed, all prior entries
are committed indirectly. Once all entries are committed, ie commitIndex matches
the last log index, it switches to the normal Leader state.
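
The transition rule described above can be sketched roughly as follows. This is
only an illustration under stated assumptions: the class, enum and method names
are hypothetical and are not the controller's actual behavior classes.

```java
// Hypothetical sketch; class, enum, and method names are illustrative only.
final class PreLeaderSketch {
    enum State { PRE_LEADER, LEADER }

    private long commitIndex;
    private long lastLogIndex;
    private State state;

    // Models a Candidate that has just won the election.
    PreLeaderSketch(long commitIndex, long lastLogIndex) {
        this.commitIndex = commitIndex;
        this.lastLogIndex = lastLogIndex;
        if (commitIndex < lastLogIndex) {
            // Uncommitted entries remain from a previous term: become PreLeader
            // and immediately append a no-op entry in the current term.
            this.state = State.PRE_LEADER;
            this.lastLogIndex++;
        } else {
            this.state = State.LEADER;
        }
    }

    // Models consensus being reached for all entries up to 'index'.
    void onCommitted(long index) {
        commitIndex = Math.max(commitIndex, Math.min(index, lastLogIndex));
        if (state == State.PRE_LEADER && commitIndex == lastLogIndex) {
            // The no-op is committed, so all prior entries are committed too.
            state = State.LEADER;
        }
    }

    State state() {
        return state;
    }

    long lastLogIndex() {
        return lastLogIndex;
    }

    public static void main(String[] args) {
        PreLeaderSketch node = new PreLeaderSketch(0, 1);
        System.out.println(node.state());
        node.onCommitted(node.lastLogIndex());
        System.out.println(node.state());
    }
}
```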

The PreLeader state is considered an inactive leader state and thus
client transactions are delayed until it transitions to Leader.

Change-Id: I20a541de0eba9b0075b9952dc6d5808943b7bb8f
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
7 years ago BUG-5280: expand ShardDataTree to cover transaction mechanics 97/42497/27
Robert Varga [Mon, 25 Jul 2016 18:59:37 +0000 (20:59 +0200)]
BUG-5280: expand ShardDataTree to cover transaction mechanics

A chunk of ShardCommitCoordinator should actually be implemented
by ShardDataTree. This includes transaction queueing, commit timers,
interaction with user cohorts and persistence.

This patch implements the relevant operations in a message-agnostic,
callback-driven way.

Fix: ShardDataTreeTest (missing ShardStat MBean)

Change-Id: I353bacce8245df85c5f4d6b4cc0ce5416f2f0337
Signed-off-by: Robert Varga <rovarga@cisco.com>
Signed-off-by: Vaclav Demcak <vdemcak@cisco.com>
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
7 years ago Fix relativePath declaration 60/43060/2
Robert Varga [Wed, 3 Aug 2016 16:03:01 +0000 (18:03 +0200)]
Fix relativePath declaration

Since karaf-parent's parent is outside of controller,
relativePath has to be empty.

Change-Id: I491c73f2d42b8d5f3e159625f8ba01fdadd32497
Signed-off-by: Robert Varga <rovarga@cisco.com>
7 years ago Return shortened string from TransactionIdentifier.toString 14/42314/3
Tom Pantelis [Fri, 22 Jul 2016 01:30:36 +0000 (21:30 -0400)]
Return shortened string from TransactionIdentifier.toString

For debug logging we need a shortened string for better readability and
grepping. The standard toString is way too long. I changed toString to a
compact form similar to what we had before, adding in the frontend generation id
and type, e.g.:

  member-1-datastore-config-fe-1-txn-3
  member-1-datastore-operational-fe-1-chn-2-txn-3
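
A minimal sketch of how such a compact string could be assembled is shown here.
The helper and its parameter names (memberName, clientType, generation,
chainId, txnId) are assumptions inferred from the two examples above, not the
actual TransactionIdentifier code.

```java
// Hypothetical helper; parameter names are guesses based on the examples above.
final class TxnIdFormatSketch {
    // chainId may be null for a transaction that is not part of a chain.
    static String format(String memberName, String clientType, long generation,
                         Long chainId, long txnId) {
        StringBuilder sb = new StringBuilder()
            .append(memberName).append('-').append(clientType)
            .append("-fe-").append(generation);
        if (chainId != null) {
            sb.append("-chn-").append(chainId);
        }
        return sb.append("-txn-").append(txnId).toString();
    }

    public static void main(String[] args) {
        System.out.println(format("member-1", "datastore-config", 1, null, 3));
        System.out.println(format("member-1", "datastore-operational", 1, 2L, 3));
    }
}
```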

Change-Id: I942eaaa0e8ceedf42eed964f2a2e3a76d8c09806
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
7 years ago Enable akka WeaklyUp feature 99/42799/3
Tom Pantelis [Fri, 29 Jul 2016 18:33:27 +0000 (14:33 -0400)]
Enable akka WeaklyUp feature

By enabling allow-weakly-up-members, akka will allow new nodes to join a
cluster if there are unreachable nodes. However, this only pertains to
new nodes that weren't previously in the cluster. Unfortunately it
doesn't pertain to node restarts where a node was in the cluster then
attempts to re-join with a new incarnation, which is what we really want.
Despite that, it will at least work for new nodes so I think it's worth
enabling. Akka might be further enhanced to broaden WeaklyUp to include
new incarnations (there are requests for that).

I also changed the ShardManager to handle MemberWeaklyUp events in
the same manner as MemberUp.

Change-Id: I5cf6c1967162b8a9bc6ffb59d34a50560699e4ca
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
7 years ago Bug 6278: Copy karaf-parent from controller to odlparent 49/42649/5
Ryan Goulding [Fri, 22 Jul 2016 07:52:08 +0000 (03:52 -0400)]
Bug 6278: Copy karaf-parent from controller to odlparent

As discussed in the MD-SAL call, there is an architectural need to move
karaf-parent from the controller project to the odlparent project.  This
is particularly useful for karaf upgrades, since right now a bump in karaf
version within odlparent requires a rebuild of controller to reflect the
change in karaf-parent, and our build jobs are not set up to support such
a process.

The move process will be handled in multiple steps:

1) Copy karaf-parent, karaf-branding and opendaylight-karaf-resources to
odlparent.  All three of these should belong in odlparent.  All three must
be moved since karaf-parent depends on the latter two artifacts.  Since
controller depends on odlparent (and not the other way around), they must
be moved upstream to odlparent.

2) Have controller's karaf-parent derive from odlparent's karaf-parent.
This preserves the ability for downstream consumers to derive from the
controller karaf-parent in the interim, while allowing changes to odlparent's
karaf-parent to be recognized since controller does not need to be rebuilt.
[THIS PATCH]

This also involves removing karaf-branding and opendaylight-karaf-resources
from the controller project, since they are no longer needed.  There are two
consumers that need to be patched:
lispflowmapping: https://git.opendaylight.org/gerrit/42647
vtn: https://git.opendaylight.org/gerrit/42648

3) Change all downstream projects to utilize odlparent's karaf-parent.  This
is future work and will be done in several patches.

4) Remove controller's karaf-parent once we feel all downstream consumers
are using the odlparent's karaf-parent.

Change-Id: Ib42ff5212bbfb93883346a19855544df4fb06d61
Signed-off-by: Ryan Goulding <ryandgoulding@gmail.com>
7 years ago Do not use ShardDataTree in PruningDataTreeModificationTest 75/42975/5
Robert Varga [Tue, 2 Aug 2016 13:33:25 +0000 (15:33 +0200)]
Do not use ShardDataTree in PruningDataTreeModificationTest

This test requires only a DataTree, hence use that.

Change-Id: I37697121f6686cdfe6b1d71ca87ff79281619532
Signed-off-by: Robert Varga <rovarga@cisco.com>
7 years ago BUG-5280: add client connect messages 66/42866/7
Robert Varga [Sun, 31 Jul 2016 21:55:28 +0000 (23:55 +0200)]
BUG-5280: add client connect messages

When a frontend is attempting to re-establish communication
with the backend it sends its coordinates and various other
information to a backend.

Sending ConnectClientRequest initiates a handshake, to which
the backend will respond either with a failure, or with an
adjusted ConnectClientSuccess.

Change-Id: I58ba9a2103f80e528654222f82f07416f7d7815e
Signed-off-by: Robert Varga <rovarga@cisco.com>
7 years ago Handle DeleteSnapshots response messages 00/42800/3
Tom Pantelis [Fri, 29 Jul 2016 19:23:23 +0000 (15:23 -0400)]
Handle DeleteSnapshots response messages

This is a follow-up to https://git.opendaylight.org/gerrit/#/c/42272/.
I didn't think deleteSnapshots returned a response but it does - I see
a warning for unhandled DeleteSnapshotsSuccess. I added handling for
DeleteSnapshotsSuccess and DeleteSnapshotsFailure. For the latter I log
a warning but don't fail the actor.

Change-Id: Ibb41e5124eb22530f98a5ef958abffc556dea4cf
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
7 years ago Fix missing LeaderStateChanged event 97/42597/4
Tom Pantelis [Tue, 26 Jul 2016 15:55:09 +0000 (11:55 -0400)]
Fix missing LeaderStateChanged event

In RaftActor, the logic to detect a leader state change compares the last
valid leader Id with the current behavior leader Id. Consider the
following leader Id change sequence:

  "member-1" -> null (goes leaderless)
  null -> "member-1" (member-1 becomes leader again)

The first state change will send a LeaderStateChanged event to the
ShardManager with null leader Id causing the ShardManager to clean its
primary shard info cache. However for the second state change, no
LeaderStateChanged event is sent b/c the new leader Id is the same as
the last valid/non-null leader Id. Therefore transactions fail due to no
shard leader.

I changed it to use the last leader Id (null or not) for the comparison
so every state change is detected.
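
The difference between the two comparisons can be illustrated with a small
sketch. The class and method names are hypothetical; only the comparison logic
follows the description above, and the real RaftActor code is more involved.

```java
import java.util.Objects;

// Hypothetical sketch contrasting the old and the new comparison.
final class LeaderChangeSketch {
    private String lastValidLeaderId; // old approach: last non-null leader Id
    private String lastLeaderId;      // new approach: last leader Id, null or not

    // Old logic: misses null -> "member-1" when "member-1" was the last valid leader.
    boolean changedComparingLastValid(String currentLeaderId) {
        boolean changed = !Objects.equals(lastValidLeaderId, currentLeaderId);
        if (currentLeaderId != null) {
            lastValidLeaderId = currentLeaderId;
        }
        return changed;
    }

    // New logic: every transition, including to and from null, is detected.
    boolean changedComparingLast(String currentLeaderId) {
        boolean changed = !Objects.equals(lastLeaderId, currentLeaderId);
        lastLeaderId = currentLeaderId;
        return changed;
    }

    public static void main(String[] args) {
        LeaderChangeSketch sketch = new LeaderChangeSketch();
        for (String id : new String[] {"member-1", null, "member-1"}) {
            System.out.println(id + ": old=" + sketch.changedComparingLastValid(id)
                + " new=" + sketch.changedComparingLast(id));
        }
    }
}
```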

Change-Id: I060872d4712e040b60acfc998914b394a40943af
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
7 years ago Improve leader election convergence 69/42969/3
Tom Pantelis [Tue, 2 Aug 2016 02:23:33 +0000 (22:23 -0400)]
Improve leader election convergence

When 2 nodes start up with the first node's log behind the second node's,
it usually takes several election rounds to converge - I've seen
anywhere from 40 s to 3 min, depending on timing. What happens is that
the first node goes to Candidate first but its RequestVote is rejected
by the second node. Shortly after, the second node goes to Candidate -
its term is higher than the first's, which causes the first node to go back
to Follower. However it doesn't respond to the RequestVote. Then the
first node goes to Candidate and the cycle repeats. Eventually, due to
the election variance, the second node times out first and the first
node processes the RequestVote and grants it. But it can take more than 10
cycles.

We can improve the convergence by allowing a Candidate to process and
respond to RequestVote when the sender's term is greater. It still
transitions to Follower as per the raft rules. The raft paper does not
say whether or not a Candidate can/should process a RequestVote in this
case but it seems to make sense. With this change, the first RequestVote
sent by the second node is granted and it converges quickly.
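
A rough sketch of the changed Candidate behavior follows. All names are
hypothetical, and the log comparison is the standard Raft "at least as
up-to-date" check, which is assumed here rather than taken from the patch.

```java
// Hypothetical sketch of a Candidate handling RequestVote; names are illustrative.
final class CandidateVoteSketch {
    enum Role { FOLLOWER, CANDIDATE }

    static final class Outcome {
        final Role role;
        final long term;
        final boolean voteGranted;

        Outcome(Role role, long term, boolean voteGranted) {
            this.role = role;
            this.term = term;
            this.voteGranted = voteGranted;
        }
    }

    static Outcome onRequestVote(long currentTerm, long lastLogIndex, long lastLogTerm,
                                 long senderTerm, long senderLastLogIndex,
                                 long senderLastLogTerm) {
        if (senderTerm <= currentTerm) {
            // The Candidate already voted for itself in its current term.
            return new Outcome(Role.CANDIDATE, currentTerm, false);
        }
        // Higher term: step down to Follower, but still process the request
        // and reply instead of silently dropping it.
        boolean senderUpToDate = senderLastLogTerm > lastLogTerm
            || (senderLastLogTerm == lastLogTerm && senderLastLogIndex >= lastLogIndex);
        return new Outcome(Role.FOLLOWER, senderTerm, senderUpToDate);
    }

    public static void main(String[] args) {
        Outcome outcome = onRequestVote(5, 3, 4, 6, 5, 4);
        System.out.println(outcome.role + " term=" + outcome.term
            + " granted=" + outcome.voteGranted);
    }
}
```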

Change-Id: If9416ddf7bf0dfc1220a169be4174f440626a0dd
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
7 years ago Alleviate premature elections in followers 64/42564/7
Tom Pantelis [Tue, 26 Jul 2016 04:07:02 +0000 (00:07 -0400)]
Alleviate premature elections in followers

If a follower actor is busy or some non-leader messages take longer to process,
leader messages may get backed up enough to cause the election timer to
expire, thereby resulting in an unwanted election and leader disruption. To
alleviate this scenario, I added a Stopwatch to keep the last time a leader
message was received, ie when a leader message is received it restarts
the Stopwatch. When ElectionTimeout is received, it checks if the
elapsed time of the Stopwatch has exceeded the election timeout
interval. Therefore if leader messages were occurring during the
election timeout interval but were delayed, they will be processed
before the ElectionTimeout message and restart the Stopwatch such that the
elapsed time will/should be less than the election timeout interval by the
time ElectionTimeout is received (unless the last leader message happened to
take longer than the election timeout interval).

There are cases where ElectionTimeout is manually sent to force an
election timeout (eg during leadership transfer). In these cases we
don't want to check the Stopwatch so I added an explicit TimeoutNow
message to distinguish the 2 messages.
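
The check can be sketched as below. The patch uses a Stopwatch; this sketch
substitutes a plain injectable nanosecond clock so it stays self-contained, and
every name in it is hypothetical.

```java
import java.util.function.LongSupplier;

// Hypothetical sketch; a LongSupplier clock stands in for the Stopwatch.
final class ElectionTimerSketch {
    private final LongSupplier nanoClock;
    private final long electionTimeoutNanos;
    private long lastLeaderMessageNanos;

    ElectionTimerSketch(LongSupplier nanoClock, long electionTimeoutNanos) {
        this.nanoClock = nanoClock;
        this.electionTimeoutNanos = electionTimeoutNanos;
        this.lastLeaderMessageNanos = nanoClock.getAsLong();
    }

    // Any message from the leader restarts the measurement.
    void onLeaderMessage() {
        lastLeaderMessageNanos = nanoClock.getAsLong();
    }

    // Scheduled ElectionTimeout: start an election only if no leader message
    // was processed within the election timeout interval.
    boolean onElectionTimeout() {
        return nanoClock.getAsLong() - lastLeaderMessageNanos >= electionTimeoutNanos;
    }

    // Explicit TimeoutNow (e.g. leadership transfer): skip the elapsed-time check.
    boolean onTimeoutNow() {
        return true;
    }

    public static void main(String[] args) {
        ElectionTimerSketch timer = new ElectionTimerSketch(System::nanoTime, 1_000_000_000L);
        timer.onLeaderMessage();
        System.out.println("start election: " + timer.onElectionTimeout());
    }
}
```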

Change-Id: I6b745288040da2fdcef1d29cb5ffc482c9e66003
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
7 years ago BUG-5280: centralize ShardSnapshot operations 85/42785/11
Robert Varga [Fri, 29 Jul 2016 13:58:49 +0000 (15:58 +0200)]
BUG-5280: centralize ShardSnapshot operations

Current shard snapshotting mechanism does not allow for evolution
of the snapshot contents and contains only the root node. Also
the serialization and deserialization operations are scattered
in multiple places, making coordinated changes a bit troublesome.

This patch introduces a versioned snapshot abstraction and moves
serdes operations into a single place. A new serialization format
is introduced, which holds the root node and some additional
metadata. No concrete metadata is defined in this patch, but this
will be used to transfer frontend protocol state from shard leader
to shard follower.

It also moves the act of creating a snapshot into ShardDataTree
and creates a dedicated actor to handle the snapshotting task,
which is used for all snapshot requests for a particular Shard.
Also makes the actor message internal to the ShardSnapshotActor,
providing a convenience method to create and dispatch it.

Change-Id: I6d9680b6ef08672c363092a649255013980c0bd6
Signed-off-by: Robert Varga <rovarga@cisco.com>
7 years ago Reduce missing feature warning to debug 01/42801/2
Tom Pantelis [Fri, 29 Jul 2016 19:37:35 +0000 (15:37 -0400)]
Reduce missing feature warning to debug

Many warn messages are emitted for the "startup" feature:

  Feature: startup, 0.0.0 is missing from features service. Skipping

It seems this is an internal karaf feature and can't be retrieved. This
warning has been seen for other system features in the past. Since the
FeatureConfigPusher is only interested in ODL bundles for the purpose of
pushing CSS modules and since this warning has always been benign, I reduced
the log to debug to avoid the many warnings on startup.

Change-Id: Iad00acccfd0eabd55acb02493e689879e29646f0
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
7 years ago Move generated sources to target/ directory 64/42864/2
Robert Varga [Sun, 31 Jul 2016 20:42:37 +0000 (22:42 +0200)]
Move generated sources to target/ directory

Generating bindings into src/ is a very bad idea, as that:
- does not conform to expectations
- does not get cleaned by maven
- makes life hard for bindings (checkstyle, cpd, etc.)

Change-Id: I0d594aa849934c4ac88f8ebb0d4dcc8ea5a4e3e6
Signed-off-by: Robert Varga <rovarga@cisco.com>
7 years ago Make deserializeNormalizedNode more obvious 03/42803/2
Robert Varga [Fri, 29 Jul 2016 19:41:17 +0000 (21:41 +0200)]
Make deserializeNormalizedNode more obvious

Performing a direct return makes the code flow
more obvious, notably the fact that a null node
may only be received in case of new serialization.

Change-Id: Ib10c23d5990ca4452914b7f81647fd67b84863d2
Signed-off-by: Robert Varga <rovarga@cisco.com>
7 years ago Acquire SchemaContext from ShardDataTree 84/42784/3
Robert Varga [Fri, 29 Jul 2016 14:02:10 +0000 (16:02 +0200)]
Acquire SchemaContext from ShardDataTree

Instead of passing the SchemaContext explicitly, make
ShardRecoveryCoordinator understand that it can obtain
the SchemaContext from ShardDataTree.

Change-Id: Id5ed521a96e8a741ad7da6199ba117b99a8f78e4
Signed-off-by: Robert Varga <rovarga@cisco.com>
7 years ago Fix incorrect readResolve signatures 31/42731/2
Tom Pantelis [Thu, 28 Jul 2016 16:46:28 +0000 (12:46 -0400)]
Fix incorrect readResolve signatures

In several Serializable classes the return type of the readResolve
method is the class. However this is incorrect - the return type must
be Object or it is not recognized by the serialization framework.
Eclipse actually flags the incorrect signature with the "unused" warning
but the warning was suppressed in the code. Using the correct Object
return type, Eclipse doesn't issue the warning.
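
The effect of the return type can be demonstrated with a small self-contained
example. The Good and Bad classes are hypothetical and exist only to contrast
the two signatures; they are not classes from the controller.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demo classes contrasting the two readResolve signatures.
final class ReadResolveDemo {
    // Return type Object: deserialization invokes this method.
    static final class Good implements Serializable {
        private static final long serialVersionUID = 1L;
        static final Good INSTANCE = new Good();

        private Object readResolve() {
            return INSTANCE;
        }
    }

    // Return type is the class itself: the method is never invoked.
    static final class Bad implements Serializable {
        private static final long serialVersionUID = 1L;
        static final Bad INSTANCE = new Bad();

        private Bad readResolve() {
            return INSTANCE;
        }
    }

    // Serializes and deserializes the object through an in-memory buffer.
    static Object roundTrip(Object obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(Good.INSTANCE) == Good.INSTANCE); // resolved to INSTANCE
        System.out.println(roundTrip(Bad.INSTANCE) == Bad.INSTANCE);   // a fresh copy instead
    }
}
```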

Change-Id: Id53182925fa48879f1f754c3f25361fb846b23ca
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
7 years ago CDS Frontend client actor should delete prior snapshots 72/42272/4
Tom Pantelis [Thu, 21 Jul 2016 19:57:35 +0000 (15:57 -0400)]
CDS Frontend client actor should delete prior snapshots

The RecoveringClientActorBehavior increments the last generation id and
saves a new snapshot. However the prior snapshots remain in akka
persistence - every time the controller is restarted a new snapshot file
is created. We should delete the prior snapshots.

The snapshot file names were very long with escape chars b/c
FrontendIdentifier.toString is used for the frontend client actor's
persistence ID. We should use a shorter, more readable ID, so I changed
it to the form:

  member-1-frontend-datastore-config
  member-1-frontend-datastore-operational

Change-Id: I1c77c826729ca1a36497a1236ac99f7cc77efb72
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
7 years ago Use ActorSystem.terminate() 68/42668/2
Robert Varga [Thu, 28 Jul 2016 01:53:44 +0000 (03:53 +0200)]
Use ActorSystem.terminate()

ActorSystem.shutdown() has been deprecated, move on
to the replacement call.

Change-Id: I21cee3100c84003585afd9c95706c26f686d0eec
Signed-off-by: Robert Varga <rovarga@cisco.com>
7 years ago Remove src/main/yang as source folder 86/42586/2
Michael Vorburger [Tue, 26 Jul 2016 16:33:14 +0000 (18:33 +0200)]
Remove src/main/yang as source folder

The binding-parent now adds it as a resource folder.

Bug: 6252
Change-Id: I4b4d9b15b95b021f78e70dfb33c6f5287a0f44fe
Signed-off-by: Michael Vorburger <vorburger@redhat.com>
7 years ago Convert AppendEntries and reply to Externalizable proxy 79/42479/3
Tom Pantelis [Sun, 24 Jul 2016 20:59:35 +0000 (16:59 -0400)]
Convert AppendEntries and reply to Externalizable proxy

Converted the AppendEntries and AppendEntriesReply messages to use the
Externalizable proxy pattern. The classes remain Serializable but use an
internal Externalizable Proxy class with writeReplace and readResolve.
This reduces the serialized size to less than half.
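
For readers unfamiliar with the pattern, a minimal sketch is given here. The
VoteReply message and its fields are hypothetical stand-ins, not the real
AppendEntries classes; only the writeReplace/readResolve/Externalizable
structure mirrors what is described above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical message class; it stays Serializable with no public no-arg constructor.
final class VoteReply implements Serializable {
    private static final long serialVersionUID = 1L;

    private final long term;
    private final boolean granted;

    VoteReply(long term, boolean granted) {
        this.term = term;
        this.granted = granted;
    }

    long getTerm() {
        return term;
    }

    boolean isGranted() {
        return granted;
    }

    // Serialization writes the compact proxy instead of this object.
    private Object writeReplace() {
        return new Proxy(this);
    }

    // Only the proxy needs the public no-arg constructor that Externalizable requires.
    public static final class Proxy implements Externalizable {
        private static final long serialVersionUID = 1L;

        private VoteReply reply;

        public Proxy() {
            // Used only during deserialization.
        }

        Proxy(VoteReply reply) {
            this.reply = reply;
        }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeLong(reply.term);
            out.writeBoolean(reply.granted);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            reply = new VoteReply(in.readLong(), in.readBoolean());
        }

        // Swaps the proxy back for the real message; the return type must be Object.
        private Object readResolve() {
            return reply;
        }
    }

    // Serializes and deserializes the message through an in-memory buffer.
    static VoteReply roundTrip(VoteReply msg) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(msg);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (VoteReply) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        VoteReply copy = roundTrip(new VoteReply(7L, true));
        System.out.println(copy.getTerm() + " " + copy.isGranted());
    }
}
```

Because the proxy writes the fields by hand, the stream carries no per-field
metadata for the message class, which is one plausible reason for the smaller
serialized size mentioned above.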

Change-Id: Ica1a8ce09458b49b2993d3304ee2d80e38d4fc59
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
7 years ago Fix test failure in ShardTest 70/42570/2
Tom Pantelis [Tue, 26 Jul 2016 05:22:31 +0000 (01:22 -0400)]
Fix test failure in ShardTest

ShardTest#createSnapshotTest failed on jenkins due to using the
CallingThreadDispatcher, which is the default for test actors, instead
of the default dispatcher.

Change-Id: Id0d8186cb8bda356b81a056f4a0fd2bbecd3c7b4
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
7 years ago Remove getModuleName()/getInstanceName() in IT 02/42502/2
Alexis de Talhouët [Mon, 25 Jul 2016 20:30:14 +0000 (16:30 -0400)]
Remove getModuleName()/getInstanceName() in IT

They were deprecated in https://git.opendaylight.org/gerrit/#/c/39891/

Change-Id: Iaaa6f4b89f4220c2fff01d216a97d785ab4ac3b5
Signed-off-by: Alexis de Talhouët <adetalhouet@inocybe.com>
7 years ago Apply SchemaContext to dataTree first 10/42410/3
Robert Varga [Sun, 24 Jul 2016 19:05:30 +0000 (21:05 +0200)]
Apply SchemaContext to dataTree first

DataTree.setSchemaContext() can fail, hence do not update the schema
context before we propagate it to data tree.

Change-Id: I0170133177ac74280da2ccc367b3c447f9d4cdc9
Signed-off-by: Robert Varga <rovarga@cisco.com>
7 years ago BUG-5280: implement transaction dispatch 46/39946/63
Robert Varga [Tue, 7 Jun 2016 12:59:14 +0000 (14:59 +0200)]
BUG-5280: implement transaction dispatch

This patch adds the DOMStore interface in DistributedDataStoreClient
and defines the missing messages.

Change-Id: I6b0905fb97e3269c12a5cd8f2c681e4caeb14e3e
Signed-off-by: Robert Varga <rovarga@cisco.com>
7 years ago Restart BP container after dependency wait time out 20/42320/3
Tom Pantelis [Fri, 22 Jul 2016 04:11:44 +0000 (00:11 -0400)]
Restart BP container after dependency wait time out

The blueprint container first waits for all dependencies (ie OSGi
services, clustered app config etc). By default it waits 5 min after
which it fails the container. For a missing OSGi service this is
probably OK but it could take longer for a clustered-app-config if the
data store isn't available. Ideally we would use an infinite timeout but
unfortunately the timeout can't be configured globally - it can only be set
at the bundle level in the manifest and we don't want to have to
configure it in every bundle (although it may be possible with some maven
magic in odlparent). Therefore I added code in BlueprintBundleTracker to
listen for container FAILURE events and restart the container if it's
due to missing dependencies.

Change-Id: Ib8ebb1a02dfd601e48722f9fc3011df7391432cb
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
7 years ago Bug 6027 - Can't start karaf using symbolic link 82/39982/12
Alexis de Talhouët [Tue, 7 Jun 2016 23:04:15 +0000 (19:04 -0400)]
Bug 6027 - Can't start karaf using symbolic link

When executing the karaf script, it gets the DIRNAME based on $0
which is the path used to start the script. This DIRNAME is then
used to set the KARAF_HOME and multiple other KARAF_* env variables.

Using a symbolic link, you would have, for instance, /usr/bin/karaf
redirecting to /opt/opendaylight/bin/karaf.
So the DIRNAME derived from $0 would be /usr/bin and not /opt/opendaylight/bin,
and the locateHome function isn't setting the right path for the KARAF_HOME.

This ends up failing to start ODL with following ERROR:
Error: Could not find or load main class org.apache.karaf.main.Main

see:
https://github.com/opendaylight/controller/blob/master/karaf/opendaylight-karaf-resources/src/main/resources/bin/karaf#l114l126

Change-Id: I36eff657972768de7d7b90f6563addfc3dd96c0f
Signed-off-by: Alexis de Talhouët <adetalhouet@inocybe.com>
Co-Authored-By: Michael Vorburger <vorburger@redhat.com>
7 years ago Remove useless .gitignore 34/40934/5
Alexis de Talhouët [Tue, 28 Jun 2016 12:51:08 +0000 (08:51 -0400)]
Remove useless .gitignore

src/main/yang doesn't contain any models or anything else,
so it is safe to remove this .gitignore.

Change-Id: Ib97cb5b1443a02399d8ce3c0c8b56c7a44ee614b
Signed-off-by: Alexis de Talhouët <adetalhouet@inocybe.com>
7 years ago Add again mdsal-singleton but remove prefix DOM 17/42317/4
Vaclav Demcak [Fri, 22 Jul 2016 10:54:16 +0000 (12:54 +0200)]
Add again mdsal-singleton but remove prefix DOM

We have only one implementation of ClusterSingletonServiceProvider
(DOM implementation) and we'd like to present it without DOM prefix.

* remove DOM prefix from ConfigSubsystem yang
* again add the odl-mdsal-singleton-dom feature
* again add CSSProvider in 06-clustered-entity-ownership.xml file

depends on: https://git.opendaylight.org/gerrit/#/c/42294/

Change-Id: Ieae0d462fe9fa523b2b1b18528759e0614b0225f
Signed-off-by: Vaclav Demcak <vdemcak@cisco.com>
7 years ago Fix delete snapshots criteria 03/42303/2
Tom Pantelis [Thu, 21 Jul 2016 23:09:47 +0000 (19:09 -0400)]
Fix delete snapshots criteria

When a snapshot is saved, we attempt to delete prior snapshots as only
the last one is recovered from persistence. For the maxSequenceNr in the
criteria, we use snapshot sequenceNr - snapshot batch count. However
this assumes every snapshot is based on snapshot batch count. We may
snapshot for other reasons, eg install snapshot on follower. This can
leave multiple prior snapshots behind and use up significant disk space.

We should only retain the last snapshot saved so I changed the criteria
to use: maxSequenceNr = the saved snapshot sequenceNr (it's possible the
sequenceNr hasn't changed from the last snapshot) and maxTimeStamp =
saved snapshot timestamp - 1.
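
The effect of the new criteria can be sketched in plain Java. The Meta class
and selectForDeletion method are hypothetical, and the sketch assumes both
bounds are inclusive upper limits; it does not use the actual akka persistence
API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of snapshot metadata and the deletion criteria.
final class SnapshotCleanupSketch {
    static final class Meta {
        final long sequenceNr;
        final long timestamp;

        Meta(long sequenceNr, long timestamp) {
            this.sequenceNr = sequenceNr;
            this.timestamp = timestamp;
        }
    }

    // Selects every stored snapshot older than the one that was just saved.
    static List<Meta> selectForDeletion(List<Meta> stored, Meta saved) {
        long maxSequenceNr = saved.sequenceNr;
        long maxTimeStamp = saved.timestamp - 1; // excludes the saved snapshot itself
        List<Meta> result = new ArrayList<>();
        for (Meta meta : stored) {
            if (meta.sequenceNr <= maxSequenceNr && meta.timestamp <= maxTimeStamp) {
                result.add(meta);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Meta saved = new Meta(10, 3000);
        List<Meta> stored = Arrays.asList(new Meta(5, 1000), new Meta(10, 2000), saved);
        System.out.println(selectForDeletion(stored, saved).size() + " snapshot(s) to delete");
    }
}
```

The timestamp bound is what keeps the saved snapshot when an earlier snapshot
shares its sequenceNr, as in the second stored entry of the example.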

Change-Id: I35b1d71ed433d52ecff79ca07a81616e393a7b7f
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
7 years ago Add mdsal-eos-binding-adapter feature 66/42166/2
Tom Pantelis [Wed, 20 Jul 2016 06:54:59 +0000 (02:54 -0400)]
Add mdsal-eos-binding-adapter feature

Added the feature that contains the binding adapter implementation of
the EntityOwnershipService.

Change-Id: Iccdd99833938b827e69603412385392b98b62abf
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
7 years ago Comment out mdsal-singleton-service in controller 11/42311/1
Vaclav Demcak [Fri, 22 Jul 2016 12:11:44 +0000 (14:11 +0200)]
Comment out mdsal-singleton-service in controller

We'd like to have only one (DOM) ClusterSingletonServiceProvider in ODL.
So we have to comment out all current references to the CSS DOM API in
controller; then we'll be able to clean up the MD-SAL CSS projects for the
DOM API and stay with the common API only.

So this patch has to be applied before
https://git.opendaylight.org/gerrit/#/c/42294/

Change-Id: I110b4554e4713802b9c261de8b1fd793eabb012a
Signed-off-by: Vaclav Demcak <vdemcak@cisco.com>
7 years ago Change default value of parameter "auto-down-unreachable-after" 94/42094/6
Sai MarapaReddy [Tue, 19 Jul 2016 20:49:21 +0000 (13:49 -0700)]
Change default value of parameter "auto-down-unreachable-after"

Akka documentation suggests not using the auto-down feature
in production.
Link - http://doc.akka.io/docs/akka/snapshot/java/cluster-usage.html

Change-Id: I24205a34e13c711791186b1e00d5203f623a0478
Signed-off-by: Sai MarapaReddy <sai.marapareddy@gmail.com>
Author: Sai MarapaReddy <sai.marapareddy@gmail.com>

7 years ago Remove global BindingToNormalizedNodeCodec instance 43/40743/7
Tom Pantelis [Wed, 22 Jun 2016 20:01:49 +0000 (16:01 -0400)]
Remove global BindingToNormalizedNodeCodec instance

The BindingToNormalizedNodeCodec was made a global static instance for
backwards compatibility for CSS users that inject the binding-dom-mapping-service
identity which defines the provided service as the concrete
BindingToNormalizedNodeCodec class instead of an interface. Therefore
the global static instance was created via blueprint and advertised via
its interfaces and was obtained via the static reference by the
RuntimeMappingModule for use by CSS users. The RuntimeMappingModule must
return an instance of BindingToNormalizedNodeCodec in order to provide
the binding-dom-mapping-service so obtaining the blueprint advertised OSGi
service via its interfaces and casting to BindingToNormalizedNodeCodec
failed b/c Aries creates a service proxy which loses the fact that it's a
BindingToNormalizedNodeCodec instance.

However the global BindingToNormalizedNodeCodec instance is not clean and
is problematic for supporting blueprint container restarts. Aries supports
concrete class proxies so I added an additional service export for the
BindingToNormalizedNodeCodec class in the binding-broker blueprint XML.
In addition, the blueprint XML now calls a new method, "newInstance", on the
BindingToNormalizedNodeCodecFactory to create a new instance and calls a
new method, "registerInstance", to register it with the SchemaService. The
returned ListenerRegistration instance is put into a bean so it can be
closed on destroy. The RuntimeMappingModule now obtains the
BindingToNormalizedNodeCodec instance from the OSGi registry as the other
blueprint-bridged CSS modules do. This eliminates the need for the global
instance.

Change-Id: I969ad5470967a81b37078393701c69d1898086cd
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
7 years ago Ensure CSS modules are closed before blueprint containers on shutdown 01/40801/4
Tom Pantelis [Thu, 23 Jun 2016 14:35:07 +0000 (10:35 -0400)]
Ensure CSS modules are closed before blueprint containers on shutdown

Change-Id: I9be36a819423e904030540b161437b6f2ffd091d
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
7 years agoExtend clustered-app-config to read default data from XML file 02/41902/4
Tom Pantelis [Fri, 15 Jul 2016 15:28:48 +0000 (11:28 -0400)]
Extend clustered-app-config to read default data from XML file

The default data can be specified in the clustered-app-config element
but it's also useful for scripting/automation or convenience to be able
to specify the default data in an external XML file. The
clustered-app-config will now look for a file of the form

  <yang module name>_<container name>.xml

in a well-known location, etc/opendaylight/datastore/initial/config.
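
The default naming convention above can be sketched roughly as follows.
This is only an illustration: the class and method names are
hypothetical and the real lookup is done by the clustered-app-config
blueprint extension.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the controller code base.
final class InitialConfigFileSketch {
    // Well-known directory named in the commit message (relative to the karaf root).
    static final String DEFAULT_DIR = "etc/opendaylight/datastore/initial/config";

    private InitialConfigFileSketch() {
    }

    // Builds <yang module name>_<container name>.xml under the well-known directory.
    static Path defaultFile(String yangModuleName, String containerName) {
        return Paths.get(DEFAULT_DIR, yangModuleName + "_" + containerName + ".xml");
    }
}
```

For a module "my-module" with container "my-container" this would yield a
file named my-module_my-container.xml.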

The XML file name can also be explicitly specified via the
"default-config-file-name" attribute.

Change-Id: Id310ef5ae121b8b9444a2102b93c3e382e421687
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
7 years agoFix compiler error due to removal of InMemoryDataTreeFactory.create 60/42160/1
Tom Pantelis [Wed, 20 Jul 2016 05:38:12 +0000 (01:38 -0400)]
Fix compiler error due to removal of InMemoryDataTreeFactory.create

Change-Id: I016c1beeb55a438ba56b8076c7e792c79ac51294
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
7 years agoAdd odl-mdsal-singleton-dom feature and CSS yang 93/42093/2
Tom Pantelis [Tue, 19 Jul 2016 01:30:39 +0000 (21:30 -0400)]
Add odl-mdsal-singleton-dom feature and CSS yang

Added the odl-mdsal-singleton-dom feature to the odl-mdsal-local-broker
feature.

Also added CSS yang and Module class for the
DOMClusterSingletonServiceProvider and added the XML to the existing
06-clustered-entity-ownership.xml file.

Change-Id: I69c7224fd7aa12742778670e7aec53118bf98332
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
7 years agoConvert distributed EOS impl to use new DOM EOS interfaces 75/35475/14
Tom Pantelis [Fri, 26 Feb 2016 11:01:10 +0000 (06:01 -0500)]
Convert distributed EOS impl to use new DOM EOS interfaces

Change-Id: I5b2a6098a0c15f74ec2f16cb5451f3831ed913bf
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
7 years agoAdd legacy pre-Boron EntityOwnershipService adapter 42/35442/15
Tom Pantelis [Fri, 26 Feb 2016 04:54:38 +0000 (23:54 -0500)]
Add legacy pre-Boron EntityOwnershipService adapter

Added a class that bridges between the legacy pre-Boron EntityOwnershipService
and DOMEntityOwnershipService interfaces. Also added the config yang and
Module class.

Change-Id: I77d02cb98a7dd5a713269907af7f269171c93fa8
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
7 years agoAdded config model for mdsal.binding.codec. 15/41815/6
Tony Tkacik [Thu, 14 Jul 2016 09:56:44 +0000 (11:56 +0200)]
Added config model for mdsal.binding.codec.

Change-Id: I3ee74461e79c3332a0e8e41afe1d56af4b942a74
Signed-off-by: Tony Tkacik <tony.tkacik@gmail.com>
7 years agoSwitch to StandardCharsets 81/41981/2
Robert Varga [Mon, 18 Jul 2016 14:40:07 +0000 (16:40 +0200)]
Switch to StandardCharsets

Guava's Charsets should not be used when StandardCharsets are
available.
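
A minimal example of the replacement, assuming a typical UTF-8
encode/decode call site (the class name is made up for illustration):

```java
import java.nio.charset.StandardCharsets;

// Illustrative only: shows the JDK constant used in place of Guava's Charsets.UTF_8.
final class CharsetSketch {
    private CharsetSketch() {
    }

    static byte[] encode(String text) {
        // java.nio.charset.StandardCharsets is available since Java 7.
        return text.getBytes(StandardCharsets.UTF_8);
    }

    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```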

Change-Id: I7c52bd3070bb48857cbba82e8d4bc5993d7aea9d
Signed-off-by: Robert Varga <rovarga@cisco.com>
7 years agoUse mdsal.dom.codec instead of yangtools.data.codec 11/41811/2
Tony Tkacik [Thu, 14 Jul 2016 09:26:24 +0000 (11:26 +0200)]
Use mdsal.dom.codec instead of yangtools.data.codec

Change-Id: I90d7ccd2e9c994305931288fc39f0c990a28a866
Signed-off-by: Tony Tkacik <tony.tkacik@gmail.com>
7 years agoBUG-4167: fall back to unknown module for empty YangInstanceIdentifier 87/41887/2
Robert Varga [Fri, 15 Jul 2016 10:43:58 +0000 (12:43 +0200)]
BUG-4167: fall back to unknown module for empty YangInstanceIdentifier

When we encounter an empty YangInstanceIdentifier (for
example during listener registration), we cannot extract
a module name -- fall back to unknown, which will cause
us to talk to the default shard.
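
The fallback can be sketched as below. This is a simplified stand-in:
the path is modelled as a plain list of module names instead of a real
YangInstanceIdentifier, and all names here are hypothetical.

```java
import java.util.List;

// Hypothetical sketch; the real code works on YangInstanceIdentifier path arguments.
final class ModuleNameFallbackSketch {
    static final String UNKNOWN_MODULE_NAME = "unknown";

    private ModuleNameFallbackSketch() {
    }

    // pathModules: module name of each path argument, root first.
    static String moduleNameFor(List<String> pathModules) {
        if (pathModules.isEmpty()) {
            // Empty identifier: no first path argument to inspect, so fall back.
            return UNKNOWN_MODULE_NAME;
        }
        return pathModules.get(0);
    }
}
```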

Change-Id: I2162884c5ce0d2c2f714bb66afd82f699c52d789
Signed-off-by: Robert Varga <rovarga@cisco.com>
7 years agoBug 6186 - fix testCandidateSerialization() 91/41691/5
Isaku Yamahata [Tue, 12 Jul 2016 04:15:53 +0000 (21:15 -0700)]
Bug 6186 - fix testCandidateSerialization()

Changeset 97ff7dff8e58531065833736d5788808ca9e0396 made
LocalHistoryIdentifier#write() use WritableObjects#writeLongs()
instead of WritableObjects#writeLong(). In some situations, the
header length of the object may be shorter.
As a result CommitTransactionPayloadTest#testCandidateSerialization()
fails. This patch fixes it by setting the transaction id and
history transaction id to known values when setting up the test.

Change-Id: I7fbd912564a25c92bc29f7e10bdae8ce1be52b8f
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
7 years agoReduce ConflictingVersionException log level to debug 47/41847/2
Sai MarapaReddy [Thu, 14 Jul 2016 16:50:11 +0000 (09:50 -0700)]
Reduce ConflictingVersionException log level to debug

In general, when a ConflictingVersionException occurs the push is
retried, and if it times out while retrying, the error is logged.

The ConflictingVersionException is similar to the
OptimisticLockFailedException in the data store, i.e. the current config
version is incremented and recorded at the start of a push and if a
second config is pushed before the first completes, the version changes
and it detects that and re-pushes the first config. The CSS pushes one
config at a time but this can happen during dependency resolution if it
finds a dependent module that wasn't created yet or whose config changed
and needs to be dynamically recreated. The dependent module is pushed,
which results in a conflicting version. This happens with BGP.

Signed-off-by: Sai MarapaReddy <sai.marapareddy@gmail.com>
Author: Sai MarapaReddy <sai.marapareddy@gmail.com>
Change-Id: Ic1d4639625fa54ccc3d54331a960f421ad6fa1dd

7 years agoSet karaf.delay.console=true in etc/config.properties 62/40262/2
Michael Vorburger [Mon, 13 Jun 2016 23:18:11 +0000 (01:18 +0200)]
Set karaf.delay.console=true in etc/config.properties

$ ./karaf
Apache Karaf starting up. Press Enter to open the shell now...
 78% [=========================================>          ]
(...)
100% [====================================================]

Karaf started in 23s. Bundle stats: 276 active, 276 total

Change-Id: Iad04b7d03aa5be5dc17e28c4d60a56ded8a2f774
Signed-off-by: Michael Vorburger <vorburger@redhat.com>
7 years agoDo not fix JUnit version (to 4.12), but inherit 28/41328/2
Michael Vorburger [Tue, 5 Jul 2016 12:07:46 +0000 (14:07 +0200)]
Do not fix JUnit version (to 4.12), but inherit

Found this during analysis of
https://bugs.opendaylight.org/show_bug.cgi?id=6156 while grepping code
for JUnit 4.11 VS 4.12.  (This change does NOT fix bug 6156 of course,
it has nothing to do with it, but it still seemed like a sensible thing
to do.)

Change-Id: I5fb5f84ea3eda5d8ce0c21c5426e183f56762e2a
Signed-off-by: Michael Vorburger <vorburger@redhat.com>
7 years agoDo not override jolokia version 13/41713/2
Robert Varga [Tue, 12 Jul 2016 13:33:38 +0000 (15:33 +0200)]
Do not override jolokia version

Jolokia version is defined in odlparent (as 1.3.3), do not override that.

Change-Id: Ibaa663f9c91ce60cdd809bca1fa7d6a9262e3043
Signed-off-by: Robert Varga <rovarga@cisco.com>
7 years agoBug 6102: Upgrade ietf-{inet,yang}-types to 2013-07-15 95/40795/8
Lorand Jakab [Thu, 23 Jun 2016 22:14:35 +0000 (17:14 -0500)]
Bug 6102: Upgrade ietf-{inet,yang}-types to 2013-07-15

Change-Id: Id434ae938946ff0b4a7b0798d538149f6bf6b15c
Signed-off-by: Lorand Jakab <lojakab@cisco.com>
7 years agoBUG-5280: refactor CohortEntry 28/41428/3
Robert Varga [Wed, 6 Jul 2016 17:46:39 +0000 (19:46 +0200)]
BUG-5280: refactor CohortEntry

CohortEntry can be created in two transaction states: open and ready.
Make this explicit by hiding the two constructors and exposing two
explicit factory methods.
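
A rough sketch of the pattern, with a simplified stand-in class (the
field and method names are assumptions, not the actual CohortEntry API):

```java
// Hypothetical stand-in for CohortEntry: constructor hidden, two named factories.
final class CohortEntrySketch {
    enum State { OPEN, READY }

    private final String transactionId;
    private final State state;

    // Private so callers must state which transaction state they mean.
    private CohortEntrySketch(String transactionId, State state) {
        this.transactionId = transactionId;
        this.state = state;
    }

    static CohortEntrySketch createOpen(String transactionId) {
        return new CohortEntrySketch(transactionId, State.OPEN);
    }

    static CohortEntrySketch createReady(String transactionId) {
        return new CohortEntrySketch(transactionId, State.READY);
    }

    String transactionId() {
        return transactionId;
    }

    State state() {
        return state;
    }
}
```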

Change-Id: I33cb5a272828b23b8a6a2da5fad2ac3ead83ee7b
Signed-off-by: Robert Varga <rovarga@cisco.com>
7 years agoBUG-5280: do not pass SchemaContext to ShardCommitCoordinator 03/41403/3
Robert Varga [Wed, 6 Jul 2016 15:33:20 +0000 (17:33 +0200)]
BUG-5280: do not pass SchemaContext to ShardCommitCoordinator

ShardCommitCoordinator already has a reference to the ShardDataTree,
there is no point in passing the same SchemaContext in.

Change-Id: I307a0b1b744a3a134807799effbf434a111ff54b
Signed-off-by: Robert Varga <rovarga@cisco.com>
7 years agoGen. Abstract*ModuleFactory with WORKING handleChangedClass() 50/40950/4
Michael Vorburger [Tue, 28 Jun 2016 15:15:26 +0000 (17:15 +0200)]
Gen. Abstract*ModuleFactory with WORKING handleChangedClass()

Bug: 2855
Change-Id: I243da5822265db3913f6b0afb2f9393f78b0c24c
Signed-off-by: Michael Vorburger <vorburger@redhat.com>
7 years agoMinor clean-up: rm .checkstyle (these files are on .gitignore) 59/40959/4
Michael Vorburger [Tue, 28 Jun 2016 17:53:21 +0000 (19:53 +0200)]
Minor clean-up: rm .checkstyle (these files are on .gitignore)

Change-Id: I2db9b9a67f9cb47062b9409749a386e6559628d4
Signed-off-by: Michael Vorburger <vorburger@redhat.com>
7 years agoForce install snapshot when follower log is ahead 35/41535/2
Tom Pantelis [Fri, 1 Jul 2016 04:25:17 +0000 (00:25 -0400)]
Force install snapshot when follower log is ahead

It's possible for a follower's log to actually be ahead of the leader's log.
Normally this doesn't happen in raft as a node cannot become leader if its
log is behind another's. However, the non-voting semantics deviate a bit
from raft. Only voting members participate in elections and can become
leader so it's possible for a non-voting follower to be ahead of the leader.
This can happen if persistence is disabled and all voting members are
restarted. In this case, the voting leader will start out with an empty log
however the non-voting followers still retain the previous data in memory.
On the first AppendEntries, the non-voting follower returns a successful
reply b/c the prevLogIndex sent by the leader is -1 and thus the integrity
checks pass. However the follower's returned lastLogIndex may be higher in
which case we want to reset the follower by installing a snapshot.
Therefore I added a check in AbstractLeader#handleAppendEntriesReply if
the reply lastLogIndex > leader's last index.
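
The added check boils down to a comparison like the one below. This is a
sketch only: the class, enum and parameter names are hypothetical and
the real handler does considerably more.

```java
// Hypothetical reduction of the leader-side check described above.
final class FollowerAheadCheckSketch {
    enum Action { CONTINUE_REPLICATION, INSTALL_SNAPSHOT }

    private FollowerAheadCheckSketch() {
    }

    static Action onAppendEntriesReply(long replyLastLogIndex, long leaderLastIndex) {
        if (replyLastLogIndex > leaderLastIndex) {
            // Follower claims entries the leader does not have: reset it via snapshot.
            return Action.INSTALL_SNAPSHOT;
        }
        return Action.CONTINUE_REPLICATION;
    }
}
```

For example, a restarted leader with an empty log (last index -1) and a
non-voting follower reporting last index 5 would result in
INSTALL_SNAPSHOT.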

Since the initial AppendEntries is sent immediately by the leader,
normally the follower will reply and this change works. However if a
follower happens to be disconnected and doesn't reply for some time, the
leader can still progress with new commits. If the leader has enough
commits such that its lastIndex matches or exceeds the lagging
non-voting follower, this check doesn't work. In this case, the
follower's integrity checks will fail since the leader's prevLogTerm
will differ. On reply the leader will start decrementing the follower's
nextIndex in an attempt to find where the logs match. During this
process the leader may trim its log via replicatedToAllIndex in which
case the follower's nextIndex may no longer be in the leader's log and
the leader will install a snapshot.

However if other nodes are down and prevent the log trimming then the
follower's nextIndex may be in the log until it eventually decrements to
0. The follower's integrity checks will pass in this case since the
leader's prevLogIndex will be -1. The follower will then attempt to add
the leader's log entries to its log. It first loops over the log entries
in the AppendEntries with the intent of skipping matching entries in its
log (ie index and term the same) and stopping when it finds an entry
that doesn't exist or finds one whose term doesn't match, in which case
it removes the entries beginning at this index. However I found some
issues in this code. First it was calling get on the getReplicatedLog
which doesn't take into account that the index may be part of the prior
snapshot and not actually in the log. I changed this check to
isLogEntryPresent which takes into account the snapshot. Second, if it
hits a conflicting entry it tries to remove it from the log. However,
as before, it may be in the snapshot and not in the log in which case
nothing gets removed. To alleviate this, I modified removeFromAndPersist
to return a boolean - false meaning it didn't find the index. In this
case I changed it to send back a reply to force a snapshot.
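
The two follower-side fixes can be illustrated with a toy log that
tracks a snapshot index. This is a simplified model under stated
assumptions: the class is hypothetical, entries are reduced to their
terms, and persistence is omitted.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model: indexes 0..snapshotIndex live in the snapshot, later ones in the list.
final class ReplicatedLogSketch {
    private final long snapshotIndex;
    private final List<Long> terms = new ArrayList<>();

    ReplicatedLogSketch(long snapshotIndex) {
        this.snapshotIndex = snapshotIndex;
    }

    void append(long term) {
        terms.add(term);
    }

    int size() {
        return terms.size();
    }

    // True if the index is covered by the snapshot or held in the in-memory log.
    boolean isLogEntryPresent(long index) {
        return index >= 0 && index <= snapshotIndex + terms.size();
    }

    // Removes entries from index onward; false means the index is not in the
    // in-memory log (e.g. it is in the snapshot), so nothing was removed.
    boolean removeFrom(long index) {
        long adjusted = index - snapshotIndex - 1;
        if (adjusted < 0 || adjusted >= terms.size()) {
            return false;
        }
        terms.subList((int) adjusted, terms.size()).clear();
        return true;
    }
}
```

A false return from removeFrom corresponds to the case where the
follower replies so as to force a snapshot install.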

I added several tests in a new class NonVotingFollowerIntegrationTest
that runs thru various scenarios to cover the cases described above.

While testing I ran into some orthogonal issues that I also fixed.

- if a leader has only non-voting followers, on replicate, it should
  immediately commit and apply to state as it does when there's no
  followers since it doesn't need consensus from non-voting followers.
  So I added a method anyVotingPeers to RaftActorContext to handle this
  case.

- When calculating the prevLogIndex and prevLogTerm for the
  AppendEntries message, it calls get on the getReplicatedLog
  which doesn't take into account that the index may be the snapshot
  index/term. The follower does this check for prevLogIndex/prevLogTerm
  so the leader should as well.
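
The first of the two side fixes above (anyVotingPeers) can be sketched
like this. The peer map and class name are assumptions made for the
example; only the method name comes from the commit message.

```java
import java.util.Map;

// Hypothetical model: peer id -> whether that peer is a voting member.
final class VotingPeersSketch {
    private VotingPeersSketch() {
    }

    static boolean anyVotingPeers(Map<String, Boolean> peerVotingStates) {
        for (Boolean voting : peerVotingStates.values()) {
            if (Boolean.TRUE.equals(voting)) {
                return true;
            }
        }
        return false;
    }

    // With no voting peers the leader alone forms the majority, so it can
    // commit and apply immediately on replicate.
    static boolean canCommitImmediately(Map<String, Boolean> peerVotingStates) {
        return !anyVotingPeers(peerVotingStates);
    }
}
```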

Change-Id: I3f92fc0b92ddc6d02dc6cb0e56b444a7c61035d7
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
7 years agoAdd CLI bundle to the startup archetype 49/41049/16
Rashmi Pujar [Wed, 29 Jun 2016 20:08:52 +0000 (16:08 -0400)]
Add CLI bundle to the startup archetype

Change-Id: I19ef17236a25cc84a9ff4b94a990d324386d9b19
Signed-off-by: Rashmi Pujar <rpujar@inocybe.com>
7 years agoGen. Abstract*ModuleFactory handleChangedClass() with DependencyResolver 14/40514/5
Michael Vorburger [Sat, 18 Jun 2016 18:02:19 +0000 (20:02 +0200)]
Gen. Abstract*ModuleFactory handleChangedClass() with DependencyResolver

Bug: 2855
Change-Id: Ieb010d67983a4807bd1e5b55886ba0c4c3f13385
Signed-off-by: Michael Vorburger <vorburger@redhat.com>
7 years agoAdd option to enable/disable basic DCL and/or DTCL 72/41472/1
Ryan Goulding [Fri, 24 Jun 2016 15:50:00 +0000 (11:50 -0400)]
Add option to enable/disable basic DCL and/or DTCL

The cars stress test is a very appropriate place to measure the effects
of DCL and DTCL on a very long list.  This change adds a few RPC
implementations in order to do the following:

1) enable DCL
2) disable DCL
3) enable DTCL
4) disable DTCL

This change includes very basic DCL/DTCL implementations, which just log
a message at trace level (off by default, but there to ensure the
onData*Changed(...) method is actually called).

The existing clustering-test-app behavior doesn't change at all;  these
new RPC(s) do not need to be used, and the added Listener implementations
are not registered listeners by default.

Change-Id: I6fcec6cd8c0a082e815561e88b325a55022ad2af
Signed-off-by: Ryan Goulding <ryandgoulding@gmail.com>
(cherry picked from commit 7a53dd074428ce5c4be767a51c509b1b8cf0f05e)

7 years agoBUG-5280: introduce request/response Envelope 50/41150/10
Robert Varga [Thu, 30 Jun 2016 09:57:38 +0000 (11:57 +0200)]
BUG-5280: introduce request/response Envelope

This is a follow-up patch to move sequence information from
request/response structure and making it part of an Envelope,
which is allocated by the SequencedQueue.
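
The idea can be sketched as follows. Both classes are simplified
stand-ins with made-up members; the real Envelope and SequencedQueue
carry more state than shown here.

```java
// Hypothetical envelope: the message itself no longer carries a sequence number.
final class EnvelopeSketch<T> {
    private final T message;
    private final long sequence;

    EnvelopeSketch(T message, long sequence) {
        this.message = message;
        this.sequence = sequence;
    }

    T message() {
        return message;
    }

    long sequence() {
        return sequence;
    }
}

// Hypothetical queue that allocates envelopes and assigns increasing sequences.
final class SequencedQueueSketch {
    private long nextSequence;

    <T> EnvelopeSketch<T> enqueue(T message) {
        return new EnvelopeSketch<>(message, nextSequence++);
    }
}
```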

Change-Id: I341118850d9c5835bab0b491f59b95264f31e5ef
Signed-off-by: Robert Varga <rovarga@cisco.com>
7 years agoBUG-5280: implement message queueing 61/39561/75
Robert Varga [Sat, 28 May 2016 23:27:24 +0000 (01:27 +0200)]
BUG-5280: implement message queueing

This patch implements the basic queueing and timeout retry mechanism
in ClientActorBehavior.

This implementation is not very efficient, as each send goes through
the actor's mailbox, but it gets the job done and is correct. It will
be optimized in a follow-up patch, which will refactor internal
workings so that SequencedQueue is fully thread-safe and correct with
regard to request enqueue, timeouts and retries.

Change-Id: I207a30877328dbdc08d42f76a0db55b5ae162de5
Signed-off-by: Robert Varga <rovarga@cisco.com>
7 years agoGenerate sal-binding-broker-impl-*-test-sources.jar 15/41315/2
Michael Vorburger [Tue, 5 Jul 2016 00:36:47 +0000 (02:36 +0200)]
Generate sal-binding-broker-impl-*-test-sources.jar

This is handy so that you can see the source of e.g.
AbstractDataBrokerTest in the IDE, just like other sources.  This did
not work before because of <type>test-jar.

Change-Id: I7f1e2516e326de3eb824a728146ef4287d8419f8
Signed-off-by: Michael Vorburger <vorburger@redhat.com>
7 years agoClear leaderId when election timeout occurs in non-voting follower 21/41321/2
Sai MarapaReddy [Wed, 29 Jun 2016 23:31:00 +0000 (16:31 -0700)]
Clear leaderId when election timeout occurs in non-voting follower

We need to enable election timeouts on non-voting follower and clear the
leaderId when it occurs to mimic the behavior when it goes to Candidate
on election timeout.

Signed-off-by: Sai MarapaReddy <sai.marapareddy@gmail.com>
Author: Sai MarapaReddy <sai.marapareddy@gmail.com>
Change-Id: I8b3316e14315a47e09b48af2e3ea16a391ec6c5a
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
7 years agoAdd ServerConfigPayload to InstallSnapshot message 20/41320/3
Tom Pantelis [Wed, 29 Jun 2016 06:09:49 +0000 (02:09 -0400)]
Add ServerConfigPayload to InstallSnapshot message

When the leader installs a snapshot on a follower, it needs to include the
server config info as well. Otherwise if a server config change occurred
while a follower was down, it won't get the updated server config info
and will be out of sync with the rest of the cluster which causes other
issues.

Change-Id: Ic290ed162bf9fdf6b9fe55986ea0c9c9e83b29a9
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
(cherry picked from commit b8e21016b85e98c31d866de7b6db51691596c9f4)

7 years agoEliminate dead letters message when there's no sender 41/41241/2
Tom Pantelis [Thu, 23 Jun 2016 18:15:17 +0000 (14:15 -0400)]
Eliminate dead letters message when there's no sender

When sending a message to close a DTCL or DCL registration, if the caller
isn't interested in the reply, it passes ActorRef.noSender() (ie null). Internally
akka translates this to the dead letters actor so you see log messages
of the form "Message [...CloseDataTreeChangeListenerRegistrationReply]
from Actor[...] to Actor[akka://opendaylight-cluster-data/deadLetters] was not
delivered. [1] dead letters encountered.".

To alleviate this we should check if the sender is the dead letters actor
before sending the reply.

Change-Id: Idfaf280acf720cf5727393262638a7783c1af539
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
7 years agoMove GlobalBundleScanningSchemaServiceImpl to its own bundle 58/40458/4
Tom Pantelis [Thu, 16 Jun 2016 01:04:02 +0000 (21:04 -0400)]
Move GlobalBundleScanningSchemaServiceImpl to its own bundle

Moved the GlobalBundleScanningSchemaServiceImpl and the associated
BundleActivator, SchemaServiceActivator, from sal-dom-broker to a new
bundle sal-schema-service.

A couple reasons for this. One is to break the circular service imports
between sal-dom-broker and sal-distributed-datastore, where
sal-distributed-datastore imports the SchemaService from sal-dom-broker
and sal-dom-broker imports the DOMDataBroker from
sal-distributed-datastore. The result of this was that if the
sal-dom-broker blueprint container was restarted, it would also cause
the sal-distributed-datastore container to restart, which isn't
necessary/desirable.

The other reason is that apps can register a SchemaContextListener as an
OSGi service which is picked up by the
GlobalBundleScanningSchemaServiceImpl. In terms of service usage this
makes sal-dom-broker a dependency of the app bundle so if the app
container restarts, it also restarts sal-dom-broker,
sal-distributed-datastore etc which isn't desirable.

So moving the GlobalBundleScanningSchemaServiceImpl to its own bundle
alleviates both issues.

Change-Id: I75d1009f6bfc1d80a19a61050703a1ca7e049575
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
7 years agoFix intermittent failure in ClusterAdminRpcServiceTest 42/41242/3
Tom Pantelis [Sat, 2 Jul 2016 06:09:34 +0000 (02:09 -0400)]
Fix intermittent failure in ClusterAdminRpcServiceTest

testRemoveShardLeaderReplica(org.opendaylight.controller.cluster.datastore.admin.ClusterAdminRpcServiceTest)
Time elapsed: 8.187 sec  <<< FAILURE!
java.lang.AssertionError: Leader Id
Expected: (a string containing "member-2" or a string containing
"member-3")
     but: was "member-1-shard-cars-config_testRemoveShardLeaderReplica"
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
at org.junit.Assert.assertThat(Assert.java:865)
at
org.opendaylight.controller.cluster.datastore.admin.ClusterAdminRpcServiceTest$2.verify(ClusterAdminRpcServiceTest.java:412)
at
org.opendaylight.controller.cluster.datastore.MemberNode.verifyRaftState(MemberNode.java:140)
at
org.opendaylight.controller.cluster.datastore.admin.ClusterAdminRpcServiceTest.testRemoveShardLeaderReplica(ClusterAdminRpcServiceTest.java:409)

member3 tried to become leader but hadn't gotten MemberUp for member2
yet so it didn't have its address when it sent out RequestVote. The
verification of new leader timed out before it could try again. The call
to waitForMembersUp on line 397 should be for replica3 and not replica2.

Change-Id: I3a714c91ba974b16b2c310027b09f9658915a639
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
7 years agoBUG-6140: control karaf zip and tar.gz creation 02/41202/2
Michal Rehak [Fri, 1 Jul 2016 10:02:16 +0000 (12:02 +0200)]
BUG-6140: control karaf zip and tar.gz creation

    - added 2 new properties in order to easily switch configuration
      via cli or child-pom

Change-Id: I617ac958e12097260264880e3c9f45fa1e1428a1
Signed-off-by: Michal Rehak <mirehak@cisco.com>
7 years agoAdd maven-metadata-local.xml to .gitignore 41/41141/2
Tomas Cere [Thu, 30 Jun 2016 12:36:54 +0000 (14:36 +0200)]
Add maven-metadata-local.xml to .gitignore

Change-Id: I1350644bf1e58462564eaa88c4ac6ab34b72cc3f
Signed-off-by: Tomas Cere <tcere@cisco.com>
7 years agoAdd blueprint wiring to opendaylight-archetype 61/41161/7
Alexis de Talhouët [Thu, 30 Jun 2016 18:29:51 +0000 (14:29 -0400)]
Add blueprint wiring to opendaylight-archetype

Change-Id: I0b219e8da4a1e58d6254c1dff993e8719aa14fd5
Signed-off-by: Alexis de Talhouët <adetalhouet@inocybe.com>
7 years agoBug 6106: Prevent flood of quarantine messages 33/41033/2
Tom Pantelis [Sat, 25 Jun 2016 02:04:02 +0000 (22:04 -0400)]
Bug 6106: Prevent flood of quarantine messages

Added a "quarantined" flag to the QuarantinedMonitorActor so it only
prints the warning and attempts to restart the karaf container once
(which is invoked indirectly via the caller's Effect callback).
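
A minimal sketch of the flag, assuming a plain callback in place of the
actor and its Effect (all names below are illustrative):

```java
// Hypothetical stand-in for the monitor actor: reacts to the first event only.
final class QuarantinedMonitorSketch {
    private final Runnable restartCallback;
    private boolean quarantined;

    QuarantinedMonitorSketch(Runnable restartCallback) {
        this.restartCallback = restartCallback;
    }

    void onQuarantinedEvent() {
        if (quarantined) {
            // Already handled: suppress the repeated warning and restart attempt.
            return;
        }
        quarantined = true;
        restartCallback.run();
    }
}
```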

Change-Id: I0a57af729280abded93d1b1a575df1672e52032e
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
(cherry picked from commit 3066f54d6d2c6206fa5fabc69a795993c68d2d77)

7 years agoFix intermittent test failures in CDS 28/41028/3
Tom Pantelis [Wed, 29 Jun 2016 07:04:47 +0000 (03:04 -0400)]
Fix intermittent test failures in CDS

Seeing intermittent failures on jenkins, eg

Failed tests:
  PartitionedLeadersElectionScenarioTest.runTest1:37->setupInitialMemberBehaviors:313->AbstractLeaderElectionScenarioTest.initializeLeaderBehavior:207
Missing messages of type class
org.opendaylight.controller.cluster.raft.messages.AppendEntriesReply

Sometimes the initial AppendEntries messages go to dead letters,
probably b/c the follower actors haven't been fully created/initialized by akka.
So added retries as a workaround.

Failed tests:
  ClusterAdminRpcServiceTest.testChangeMemberVotingStatesForShard:555->verifySuccessfulRpcResult:296
Rpc failed with error: RpcError [message=Failed to change member voting
states for shard cars: Shard
member-3-shard-cars-config_testChangeMemberVotingStatusForShard
currently has no leader. Try again later., severity=ERROR,
errorType=RPC, tag=operation-failed, applicationTag=null, info=null,
cause=null]

The test needs to ensure node3's datastore shards are ready with leaders.

Change-Id: I5031c2a7b3e6eeddbf80b8eb346492acd11d664c
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
7 years agoFix karaf regression introduced in Gerrit 40775 89/40889/2
Lorand Jakab [Mon, 27 Jun 2016 19:16:33 +0000 (14:16 -0500)]
Fix karaf regression introduced in Gerrit 40775

This should still allow for a Java installation in a folder containing
spaces.

Change-Id: I8d4d51e39bde6d2b237a755ff3c82a045d5e2629
Signed-off-by: Lorand Jakab <lojakab@cisco.com>
7 years agoFix serialVersionUID 35/40735/4
Robert Varga [Wed, 22 Jun 2016 15:04:53 +0000 (17:04 +0200)]
Fix serialVersionUID

This fixes serialVersionUID not being final. Since it was not final,
as per Serializable contract, it had no effect. To retain compatibility
we must use a generated value.
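
The effect can be observed with ObjectStreamClass, which only honours a
serialVersionUID field that is both static and final. The two nested
classes below are made up for the demonstration:

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

final class SerialVersionUidSketch {
    private SerialVersionUidSketch() {
    }

    // static final: the declared value is used.
    static final class Declared implements Serializable {
        private static final long serialVersionUID = 42L;
    }

    // static but not final: ignored, the JVM computes a default from the class shape.
    static final class NotFinal implements Serializable {
        private static long serialVersionUID = 42L;
    }

    static long effectiveUid(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }
}
```

Because the non-final field was ignored, previously serialized data
carries the computed default, which is why the fixed declaration has to
use that generated value rather than an arbitrary one.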

Also remove use of a deprecated method.

Change-Id: I720dcd2613481eb474072ef29e7190cb0f5a28b6
Signed-off-by: Robert Varga <rovarga@cisco.com>
7 years agoDisable the version of Xalan bundled in Karaf 3.0.7 64/40864/2
Stephen Kitt [Mon, 27 Jun 2016 11:39:34 +0000 (13:39 +0200)]
Disable the version of Xalan bundled in Karaf 3.0.7

This patch extends the Xalan clean-up to disable the 2.7.2_3
ServiceMix bundle included in Karaf 3.0.7. It works with both 3.0.6
and 3.0.7.

Change-Id: I555266efb5c3437830024303083ba1dc982fbcb7
Signed-off-by: Stephen Kitt <skitt@redhat.com>
7 years agoDeprecate TransactionStatus 35/39735/2
Michael Vorburger [Wed, 1 Jun 2016 21:00:36 +0000 (23:00 +0200)]
Deprecate TransactionStatus

Deprecated because all other APIs where this enum is used are already
marked @Deprecated, so it would appear that this one was simply
forgotten.

Change-Id: Id8448d60a63d4a72a75ae0d25ebe7ff51db865c8
Signed-off-by: Michael Vorburger <vorburger@redhat.com>
7 years agoChange count type in the cars model 36/40836/4
Ryan Goulding [Fri, 24 Jun 2016 15:57:00 +0000 (11:57 -0400)]
Change count type in the cars model

The count type is changed from uint16 to uint32.  For some performance/stress
tests, it is desirable to issue 1E7 transactions to provide an adequate sample
size.  Prior to this change, it was impossible to issue a million transactions
without either invoking the RPC several times or using count=0 and stopping
based on log messages.  This makes perf testing easier.

Change-Id: Icf125e45bd85e14df6ed5ad91ddad92a8dd2151b
Signed-off-by: Ryan Goulding <ryandgoulding@gmail.com>
7 years agoAdd a description to "rate" in the cars model 35/40835/3
Ryan Goulding [Fri, 24 Jun 2016 15:54:20 +0000 (11:54 -0400)]
Add a description to "rate" in the cars model

While using this model for some performance testing, I realized I had no idea
what rate meant initially.  This change adds an appropriate description to the
rate leaf.

Change-Id: Idfd613f91e00de912784da55076ec7b13812fdd2
Signed-off-by: Ryan Goulding <ryandgoulding@gmail.com>
7 years agoEnable Java installation in folder with spaces 75/40775/3
danipeon [Thu, 23 Jun 2016 15:04:02 +0000 (17:04 +0200)]
Enable Java installation in folder with spaces

The $JAVA variable in the karaf script is now quoted so that Java can
be installed in a folder whose name contains spaces.

Change-Id: I6305204c872552c4e52ec7000720c67340cf0b88
Signed-off-by: danipeon <daniel.peon.quiros@ericsson.com>
7 years agoAdd "static-reference" blueprint extension 21/40421/5
Tom Pantelis [Wed, 22 Jun 2016 17:11:50 +0000 (13:11 -0400)]
Add "static-reference" blueprint extension

Added a blueprint extension, "static-reference", that obtains an OSGi
service and returns the actual instance. This differs from the standard
"reference" element that returns a dynamic proxy whose underlying
service instance can come and go. This is useful especially in cases
where the service exists for the life of the karaf container and you
don't need/want the overhead of the proxy.

Change-Id: I4cbcc7e2b5a85b0a22e50e12f3946d29bfb36c7d
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
7 years agoInject BindingAwareBroker as it is now provided by OSGi 78/40778/1
Alexis de Talhouët [Thu, 23 Jun 2016 15:24:52 +0000 (11:24 -0400)]
Inject BindingAwareBroker as it is now provided by OSGi

Since the wiring is done via blueprint, BindingAwareBroker is now
registered using blueprint, thus it is no longer provided by CSS.

Change-Id: I87c4d21d51b243b9d9dcd4178556a98390ebc6ec
Signed-off-by: Alexis de Talhouët <adetalhouet@inocybe.com>
7 years agoBUG-5280: move AbstractDataTreeModificationCursor 40/39840/38
Robert Varga [Fri, 3 Jun 2016 12:28:42 +0000 (14:28 +0200)]
BUG-5280: move AbstractDataTreeModificationCursor

AbstractDataTreeModificationCursor functionality is useful for wide
range of users, move it to sal-clustering-commons.

Also eliminate the explicit stack, because YangInstanceIdentifier
already has O(1) methods to maintain a logical stack.

Change-Id: Ia0f8d24f32afd67c059e72cc967949f4c609fd7c
Signed-off-by: Robert Varga <rovarga@cisco.com>
7 years agoBUG-5280: add BackendInfo/BackendInfoResolver 58/39758/44
Robert Varga [Thu, 2 Jun 2016 09:20:51 +0000 (11:20 +0200)]
BUG-5280: add BackendInfo/BackendInfoResolver

Client actor needs to be able to resolve a particular backend
so it can implement retry logic with request adaptation. Add
the baseline class and an implementation for current sharding.

Change-Id: Ic7b679b1cadaff130b3a266606fe48cad5c20614
Signed-off-by: Robert Varga <rovarga@cisco.com>
7 years agoBUG-5280: introduce cookie in LocalHistoryIdentifier 07/39607/45
Robert Varga [Mon, 30 May 2016 14:19:12 +0000 (16:19 +0200)]
BUG-5280: introduce cookie in LocalHistoryIdentifier

Frontend transactions can map onto multiple backend shards,
hence the current form is not sufficient to identify responses.

Introduce an opaque cookie, which will be assigned to frontend
subtransactions and hence provide identification.

Change-Id: I442dcfa1a6f04330c608f3328a7e10c6aeb90bb0
Signed-off-by: Robert Varga <rovarga@cisco.com>
7 years agoBUG-5280: introduce base Transaction request/success 06/39506/59
Robert Varga [Thu, 26 May 2016 23:13:56 +0000 (01:13 +0200)]
BUG-5280: introduce base Transaction request/success

Change-Id: I23b83c3912975497f6ab2fac73451f51e613bc2e
Signed-off-by: Robert Varga <rovarga@cisco.com>
7 years agoBUG-865: remove reference to URLSchemaContextResolver 98/40698/2
Robert Varga [Wed, 22 Jun 2016 12:44:33 +0000 (14:44 +0200)]
BUG-865: remove reference to URLSchemaContextResolver

This has been superseded by YangTextSchemaContextResolver.

Change-Id: I40559fbd79ff7aff59585b60d65fd6e53da695c6
Signed-off-by: Robert Varga <rovarga@cisco.com>
7 years agoAdd "specific-reference-list" blueprint extension 67/40267/4
Tom Pantelis [Mon, 13 Jun 2016 22:37:58 +0000 (18:37 -0400)]
Add "specific-reference-list" blueprint extension

Added a blueprint extension, "specific-reference-list", that obtains a specific
list of service instances from the OSGi registry for a given interface. The
specific list is learned by first extracting the list of expected service types
by inspecting RESOLVED bundles for a resource file under META-INF/services with
the same name as the given interface. The type(s) listed in the resource file
must match the "type" property of the advertised service(s). In this manner, an
app bundle announces the service type(s) that it will advertise so that the
extension knows which services to expect up front. Once all the expected services
are obtained, the container is notified that all dependencies are satisfied.
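
The bookkeeping described above might look roughly like this. It is a
sketch under assumptions: OSGi is left out entirely, the resource file
is passed in as text, and skipping '#' comment lines is borrowed from
the usual META-INF/services convention rather than taken from the
extension itself.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical tracker for expected vs. advertised service types.
final class SpecificReferenceListSketch {
    private final Set<String> expectedTypes = new LinkedHashSet<>();
    private final Set<String> advertisedTypes = new LinkedHashSet<>();

    // resourceContent: text of a META-INF/services/<interface name> file, one type per line.
    void addExpectedTypes(String resourceContent) {
        for (String line : resourceContent.split("\n")) {
            String type = line.trim();
            if (!type.isEmpty() && !type.startsWith("#")) {
                expectedTypes.add(type);
            }
        }
    }

    // Called when a service with the given "type" property is registered.
    void serviceAdvertised(String type) {
        if (expectedTypes.contains(type)) {
            advertisedTypes.add(type);
        }
    }

    // True once every expected type has been seen; the container could then be notified.
    boolean allSatisfied() {
        return advertisedTypes.containsAll(expectedTypes);
    }
}
```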

This new extension will initially be used by the bgpcep project.

Change-Id: I3bc6a72134b33c744fbb48fd645dd3a0ca54673d
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>
7 years agoBUG-5903: do not rely on primary info on failure 27/40627/2
Robert Varga [Tue, 21 Jun 2016 16:09:03 +0000 (18:09 +0200)]
BUG-5903: do not rely on primary info on failure

This makes sure we check for failure before touching the result,
which is null if a failure occurs.
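
The ordering matters in any (failure, result) style callback. A reduced
sketch, with a hypothetical method standing in for the actual
completion handler:

```java
// Hypothetical completion handler: inspect the failure before using the result.
final class FailureFirstSketch {
    private FailureFirstSketch() {
    }

    static String onComplete(Throwable failure, String primaryInfo, Class<?> messageClass) {
        if (failure != null) {
            // primaryInfo is null here, so it must not be dereferenced.
            return "Broadcast of " + messageClass.getSimpleName() + " failed: " + failure.getMessage();
        }
        return "Sending " + messageClass.getSimpleName() + " to " + primaryInfo.trim();
    }
}
```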

In order to keep diagnostic information we add a reference
to the message class being broadcast.

Change-Id: I26ab31a45916d11b61b990020bed89ae87233b14
Signed-off-by: Robert Varga <rovarga@cisco.com>
7 years agoBUG-5280: use a lambda for createLocalHistory()/close() 74/39574/38
Robert Varga [Sun, 29 May 2016 21:34:14 +0000 (23:34 +0200)]
BUG-5280: use a lambda for createLocalHistory()/close()

These are internal commands, which can be efficiently implemented
using a simple delayed execution primitive.

Introduce ClientActorContext#executeInActor(), which will wrap
a specialized subclass of Runnable and send it to the actor.

This can be used to dispatch lambdas to methods, reducing the need
for specialized messages and instanceof checks.
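
The dispatch idea can be modelled without Akka as below. Everything
here is a stand-in: the mailbox is a plain queue drained on the calling
thread, and the class and member names are invented for the example.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical single-threaded model of an actor that accepts lambdas as messages.
final class ClientActorSketch {
    // Specialized Runnable subtype so the actor can tell commands from other messages.
    @FunctionalInterface
    interface InternalCommand extends Runnable {
    }

    private final Queue<Object> mailbox = new ArrayDeque<>();
    private int historiesCreated;

    // Wraps the lambda as a message; it runs later, inside the actor.
    void executeInActor(InternalCommand command) {
        mailbox.add(command);
    }

    // One instanceof check covers every internal command.
    void processMailbox() {
        Object message;
        while ((message = mailbox.poll()) != null) {
            if (message instanceof InternalCommand) {
                ((InternalCommand) message).run();
            }
        }
    }

    void createLocalHistory() {
        historiesCreated++;
    }

    int historiesCreated() {
        return historiesCreated;
    }
}
```

A caller would write actor.executeInActor(actor::createLocalHistory)
instead of defining a dedicated message class for the operation.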

Change-Id: Id5cd388657a274d551892a6c943b062d70c7bea7
Signed-off-by: Robert Varga <rovarga@cisco.com>