git.opendaylight Code Review - controller.git/log

Provide the relativePath for benchmark modules

Change-Id: I7f5a0c74ba8698762155c7e1d48945fafcc2e0e1
Signed-off-by: Stephen Kitt <skitt@redhat.com>

BUG-1014: expose a proper ShardDataTree constructor

This patch exposes the proper constructor, deprecating the previous one
(which defaults to TreeType.OPERATIONAL). Furthermore convert all tests
to explicitly use OPERATIONAL tree.

The final bit which remains to be figured out is instantiation inside a
Shard instance, which is marked with a FIXME.

Change-Id: Ic8941c8fa5782b162e6faed7bc2d34920debc46e
Signed-off-by: Robert Varga <rovarga@cisco.com>

Make ModuleConfig immutable

The ModuleConfig class has mutator methods for
ModuleShardConfigProviders to initially construct instances. However
once supplied to the ConfigurationImpl they are intended to be immutable
yet the mutator methods expose loopholes around it. Therefore I added a
Builder to ModuleConfig and made ModuleConfig truly immutable.
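
A minimal sketch of the builder approach described above. The class
name, the fields (name, shardNames) and the method names are
illustrative assumptions, not the actual ModuleConfig API:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for ModuleConfig; fields are illustrative only.
final class ModuleConfigSketch {
    private final String name;
    private final List<String> shardNames;

    private ModuleConfigSketch(String name, List<String> shardNames) {
        this.name = name;
        // Defensive copy, wrapped as unmodifiable: no mutation after build().
        this.shardNames = Collections.unmodifiableList(new ArrayList<>(shardNames));
    }

    String getName() {
        return name;
    }

    List<String> getShardNames() {
        return shardNames;
    }

    static Builder builder(String name) {
        return new Builder(name);
    }

    // All mutation happens here, before the immutable instance exists.
    static final class Builder {
        private final String name;
        private final List<String> shardNames = new ArrayList<>();

        private Builder(String name) {
            this.name = name;
        }

        Builder shardName(String shardName) {
            shardNames.add(shardName);
            return this;
        }

        ModuleConfigSketch build() {
            return new ModuleConfigSketch(name, shardNames);
        }
    }
}
```

With this shape, a provider assembles the instance through the Builder
and consumers only ever see the finished, read-only object.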

Change-Id: I0b8070ff3db1563427a6a70ff174053b2a66feca
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>

Refactor MessageCollectorActor and DoNothingActor

Refactor/consolidate the duplicate copies of
MessageCollectorActor and DoNothingActor used in
org/opendaylight/controller/cluster/datastore and
org/opendaylight/controller/cluster/raft/utils to
use just the one in raft.

Also moved the EchoActor into raft.

Change-Id: I72784a6799ae4331ab52d497d421b9a8bb98f34a
Signed-off-by: Gary Wu <gary.wu1@huawei.com>

BUG-4638: fix typedef types

This adds the fix for model.util.type type structure, which differs for
leaf types with default value.

Change-Id: Ibefebda88d5a6876a72b1ff1ebdf0cb639135a07
Signed-off-by: Robert Varga <rovarga@cisco.com>

BUG-3516: make PingPongTransactionChain.close() asynchronous

When the system is under critical load, Thread.yield() can cause
long-term blocking of hijacked threads such as Netty's. We should not
be blocking for a prolonged time.

Rework the shutdown logic to be asynchronous, scheduling the
potential outstanding transaction to complete as appropriate. Also fixes
the case where we would end up not reporting a transaction failure if
the transaction is readied, but was not submitted to the backend.

Change-Id: Ic7796a980d9e87242f70b7f7b9cdb30caeab9dd9
Signed-off-by: Vaclav Demcak <vdemcak@cisco.com>
Signed-off-by: Robert Varga <rovarga@cisco.com>

Changed the artifact id from 'benchmark-features' to 'features-benchmark'
Added 'odl-mdsal-features' feature definition that contains all benchmark artifacts (api, ds, ntf and rpc)
Added rpcbenchmark
Fixed tabs and white space at the end-of-lines
Fixed more white spaces

Change-Id: I1789ae09c3f316facef38e484310f6c3a4098dd7
Signed-off-by: Jan Medved <jmedved@cisco.com>

Added notification benchmark (ntfbenchmark) and rpc benchmark models
Replaced tabs with spaces

Change-Id: Ic81947d69ddc6286a9ed3be3600f77d46088b6b0
Signed-off-by: Jan Medved <jmedved@cisco.com>

Fix warnings on unparameterized generic types

Fix warnings on unparameterized generic types in
and around BucketStore and Messages.

Change-Id: I867e5f030f88b56c837780e2bae2e1de54266b26
Signed-off-by: Gary Wu <gary.wu1@huawei.com>

Bug 4651: Implement handling of ClusteredDOMDataTreeChangeListener in CDS

Implemented handling of ClusteredDOMDataTreeChangeListener similar to
what was done previously for ClusteredDOMDataChangeListener.

I also refactored the listener support classes used by Shard and
extracted generic base classes for the common functionality.

Change-Id: I694a6a4ce41284f7ecd3bf73bc6201e9d5555998
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>

Bug 4651: Add ClusteredDataTreeChangeListener interface and binding adapter

Change-Id: I1254a73570ded65925374021341f6900b9a7bdf9
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>

CDS: Fix deleteSnapshots criteria in SnapshotManager

The SnapshotManager specifies a magic number 43200000 as the timestamp
for the criteria passed to deleteSnapshots. It's unclear where this
number came from but it prevents prior snapshots from getting deleted
as stored snapshot timestamps will be greater than this value (unless
one was created back in the 70's or 80's :)). Since the SnapshotManager
passes a valid upper bounds for the criteria's maxSequenceNr, I changed
it to pass Long.MAX_VALUE for the timestamp.

The ReplicationAndSnapshotsIntegrationTest actually verifies prior
snapshots were deleted by checking for size 1 when querying the
InMemorySnapshotStore. However this only passes b/c the
InMemorySnapshotStore::doDelete is incorrect in that it doesn't compare
the stored snapshot timestamp against the criteria timestamp. So I
changed the InMemorySnapshotStore to correctly compare the timestamps as
well. I found the source for an InMemorySnapshotStore online and that's
what it does.
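
To illustrate why the old timestamp bound never matched, here is a
sketch that assumes deletion criteria select a snapshot only when both
its sequence number and its timestamp are at or below the given upper
bounds. The Snapshot type and matches() helper are simplified
stand-ins, not the Akka or controller classes:

```java
// Simplified stand-ins for snapshot metadata and selection criteria.
final class SnapshotCriteriaSketch {
    static final class Snapshot {
        final long sequenceNr;
        final long timestamp; // milliseconds since the epoch

        Snapshot(long sequenceNr, long timestamp) {
            this.sequenceNr = sequenceNr;
            this.timestamp = timestamp;
        }
    }

    // A snapshot matches only if BOTH upper bounds are satisfied.
    static boolean matches(Snapshot s, long maxSequenceNr, long maxTimestamp) {
        return s.sequenceNr <= maxSequenceNr && s.timestamp <= maxTimestamp;
    }
}
```

A snapshot stamped with the current time fails the check against
43200000 (12 hours after the epoch) but passes against Long.MAX_VALUE,
leaving maxSequenceNr as the only effective bound.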

Change-Id: Ie7d5eec14f684a469f4b6ff84732c9a9c6042360
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>

Fix bug in DatastoreContext copy constructor

Change-Id: I0ea1f79a8ab3f092a76b690f5f2089c3a2e7d6cb
Signed-off-by: Gary Wu <gary.wu1@huawei.com>

BUG 2817 - Create reusable classes for doing an action on finding primary

For AddShardReplica we are finding the primary and when it is found we send
a ForwardedAddServer* message to self. I need to do something similar for
RemoveShardReplica - so I've extracted the design pattern that was used for
AddShardReplica into a set of reusable classes. To start with this I can
use this for RemoveShardReplica but I think it can be used for other such
messages in future.

Change-Id: Ib625403f6eab5b07bc126af9db3d4e6e566e2038
Signed-off-by: Moiz Raja <moraja@cisco.com>

BUG 2817 : Handle ServerRemoved message in Shard/ShardManager

When a server is removed and the new ServerConfiguration is
replicated and consensus has been reached on it the RaftActor
sends a ServerRemoved message to the Replica which has just been
removed.

This ServerRemoved message is received by the Shard and it
forwards the message to the ShardManager. The ShardManager
then removes the replica from its persistent list.

Change-Id: I9252ab9d9768b549915d8cccf46f102127d97945
Signed-off-by: Moiz Raja <moraja@cisco.com>

Refactor MockConfiguration to extend ConfigurationImpl

MockConfiguration is now essentially a wrapper for
ModuleShardConfigProvider whose source is a shard name -> members map.
This will make it easier when adding new methods to Configuration
plus unit tests will now use the production ConfigurationImpl as this
class is simple enough where we don't really need the functionality mocked.

Change-Id: I88e520b275a658a6d718442ad31c1f1e3603c70c
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>

Bug 2187: Remove ShardManager mbean replica operations

Remove the add/remove shard replica mbean operations as it was decided to
use RPCs instead.

Change-Id: I419a1ec57dfaa9b1d8d55aae5a995d8050b43d70
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>

Prevent partial init in DatastoreSnapshotRestore

The config subsystem should only push one config
at a time, but in case it doesn't, synchronize
DatastoreSnapshotRestore.initialize() to prevent
partial initialization in the event of concurrent
calls to getAndRemove().
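
A rough sketch of the idea, assuming initialization lazily fills a map
that getAndRemove() then reads. The class, its fields and the loaded
values are hypothetical, not the actual DatastoreSnapshotRestore code:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for DatastoreSnapshotRestore.
final class SnapshotRestoreSketch {
    private final Map<String, String> snapshots = new ConcurrentHashMap<>();
    private boolean initialized; // guarded by "this"

    // synchronized: a second caller blocks until the first has finished
    // loading, so it never observes a half-filled map.
    private synchronized void initialize() {
        if (initialized) {
            return;
        }
        // Placeholder for reading and deserializing the restore file.
        snapshots.put("config", "config-snapshot");
        snapshots.put("operational", "operational-snapshot");
        initialized = true;
    }

    String getAndRemove(String datastoreType) {
        initialize();
        return snapshots.remove(datastoreType);
    }
}
```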

Change-Id: Ie614e8b2045d86ea46b55609bf5cde9e6597b086
Signed-off-by: Gary Wu <gary.wu1@huawei.com>

Bug 2187: Implement add-shard-replica RPC

The unit test creates 3 actor systems each with their own datastores.
Now that the ShardManager persists shard info and due to the static
nature of the InMemorySnapshotStore, each ShardManager needs to have a
unique persistenceId otherwise the equivalent ShardManager's persistence
Ids will clash. Therefore I added a shardManagerPersistenceId field to
the DatastoreContext so the unit test can provide a unique Id based on
member name.

Change-Id: I907cd568d64f43586ffc1ec8581e4208f46db327
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>

Clean up plugin management

A number of plugins are managed by odlparent, so remove unnecessary
entries (i.e. specified identically in odlparent).

Remove all references to ${exam.version} (the dependencies are
inherited from odlparent).

Change-Id: I43ac4a692b7911321b448e788536d58f916657d1
Signed-off-by: Stephen Kitt <skitt@redhat.com>

BUG 2817 - Basic implementation of RemoveServer in the Raft code

When a RemoveServer is received it may ask for the removal of
the current leader or one of the followers. As a first pass
we do not support removal of the current leader. To correctly
implement removal of the leader we would have to implement
leader transition which I intend to build in a future patch.

When a follower is removed the server configuration is changed
immediately on the leader and the new configuration persisted
to the journal. When other followers receive the removed
journal entry they would also remove the server from their
configuration, this is the same as what was done for the
AddServer implementation.

As soon as the new configuration is persisted we respond with
success to the caller. This is the same as for AddServer.

When the ServerConfiguration is complete we send a ServerRemoved
message to the follower which has been removed.

Change-Id: I2b85d82cbeef13cca830e3cc212aebbbcd95c818
Signed-off-by: Moiz Raja <moraja@cisco.com>

Remove unused ShardCommitCoordinator#CohortEntry constructor

Change-Id: I43b478bd6b5467cc46a65c97a5888ce0ec5ded5c
Signed-off-by: Moiz Raja <moraja@cisco.com>

Fix failure of testCloseCandidateRegistrationInQuickSuccession

Moved checking of whether the ownershipchange event occurred with
hasOwner=false to the loop so that we pass the test only when all
listeners receive that event with hasOwner=false

Change-Id: I463272822e6a39f310fef5996b541e1d06c79548
Signed-off-by: Moiz Raja <moraja@cisco.com>

Bug 2187: Don't close over internal state in ShardManager

For AddShardReplica, we use the ask pattern for the FindPrimary and
AddServer messages. However in the OnComplete callbacks we're closing
over internal state which isn't safe since the callback will be notified
outside of the actor's execution context which may result in concurrent
mutation of internal state. Therefore I added internal messages that are
sent to self in the callbacks.
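
The pattern can be sketched without Akka. This assumes a
single-threaded executor stands in for the actor's mailbox and a
CompletableFuture stands in for the ask reply; all names are
illustrative, not the ShardManager code:

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class SelfMessageSketch {
    // Stand-in for the actor mailbox: one thread, messages run in order.
    private final ExecutorService mailbox = Executors.newSingleThreadExecutor();
    // Touched only from the mailbox thread, so it needs no locking.
    private final Set<String> localShards = new HashSet<>();

    // "reply" may complete on any thread, like an ask-pattern callback.
    void onAddShardReplica(String shardName, CompletableFuture<Boolean> reply) {
        reply.whenComplete((ok, failure) ->
            // Do NOT touch localShards here. Hand the result back to the
            // mailbox as a message, as an actor would by sending to self.
            mailbox.execute(() -> onAddServerReply(shardName, failure == null && ok)));
    }

    private void onAddServerReply(String shardName, boolean ok) {
        if (ok) {
            localShards.add(shardName);
        }
    }

    // Reads state on the mailbox thread as well.
    boolean hasShard(String shardName) {
        Callable<Boolean> query = () -> localShards.contains(shardName);
        try {
            return mailbox.submit(query).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    void shutdown() {
        mailbox.shutdown();
    }
}
```

The callback captures only the immutable shardName and the reply
value; every mutation of localShards happens on the mailbox thread.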

Change-Id: I1f6662a4e473749925046f127cad868e54b761a2
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>

Bug 3231 jolokia access should be controlled by aaa

Due to unfortunate lack of support, we are going to have to just use
basic authentication from config file for now.  I have committed this
patch to upstream jolokia:
https://github.com/rhuss/jolokia/pull/225
which will unlock power for us to use AAA.  However, this won't be
available until a new release is cut on Jolokia's end.

The only options for jolokia-osgi bundle for authentication are basic
file authn (which is implemented in this proposed changeset) and JAAS.
ODL's JAAS is unencrypted and generally disregarded, so basic file
authN was chosen.  By default, the credentials are admin/admin.

Change-Id: I35770bcf13b3cb32e59685e9bbf0ef47d73d132f
Signed-off-by: Ryan Goulding <ryandgoulding@gmail.com>

Bug 2187: Bootstrap EOS shard when no local shards configured

The intended workflow to initially form a cluster dynamically is to
change the role for a second node to say member-2. Since the initial
static shard config is bootstrapped to member-1, no local shards will be
created. However, the entity-ownership shard is special in that it is
intended to exist on every node.

The EOS will be bootstrapped as follows:

For the EOS CreateShard message, all unique members for all shards are
obtained from the static shard config. It assumes the local member is
present in the config however in the above workflow it won't be. So on EOS
CreateShard, if the local member isn’t in the initial member list then
it will create the local shard with an empty peer list and
DisableElectionsRaftPolicy so it stays as follower. Also the shard will
be flagged as inactive in the ShardManager. A subsequent
AddShardReplica will be needed to make it active.

The other option is to not create EOS shard but there may be initial
candidate registrations which would be missed unless we add retry logic
in the service class. But the EOS shard already has retry logic so it
would be ideal to leverage it.

I also made changes to the AddShardReplica logic to handle an existing
local shard as will occur for the EOS shard:

- remove the failure reply if local shard already exists
- if the local shard exists and the primary shard is the local shard,
   do nothing and return AlreadyExistsException failure reply
- otherwise send AddServer to the primary
- on FindPrimary, if the local shard exists but is not active, do a
   remote find as if the local shard doesn't exist
- on AddServer, if the new server is already in the peer list, the
   ALREADY_EXISTS status is returned. Return AlreadyExistsException
   failure reply
- on AddServer failure, if the local shard was pre-existing don't
   remove it.

We still want to prevent an AddShardReplica request while one is already
in progress so I added a Set to track this.

I added an integration test for bootstrapping the EOS shard. It starts
with an inactive shard and registers a candidate, which gets queued
since there's no leader. It then issues AddShardReplica and verifies the
candidate gets registered with the leader.

To get this to work required some tweaks in the RaftActor and Follower.
When the ShardManager clears the DisableElectionsRaftPolicy, the
RaftActor creates a new Follower instance however it loses the previous
leader Id. If the new server config hasn't been replicated yet then it
has no peers and immediately tries to start an election. Since it has no
peers it goes to Leader with no followers, creating a 2 leader situation.
To alleviate this I transferred the previous leader Id to the new
Follower instance to prevent the immediate election.

Even after that the test still didn't work b/c the leader was still not
in the EOS shard peer list so lookup of the leader address returned
null. So I changed getPeerAddress in the RaftActorContext to lookup in
the resolver if no peer info exists.

I also added more unit tests for AddShardReplica to increase code coverage.

Change-Id: Id2a12ae226af69611d5ca5155f5f018cef82dff4
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>

Bug-4636: NotificationSubscriber's exception prevents notifications to other listeners

Catch the exception and log it with enough context.

Change-Id: I23c248c59753008e6d09155513b2dba108fbccbf
Signed-off-by: Kamal Rameshan <kramesha@cisco.com>
Signed-off-by: Robert Varga <rovarga@cisco.com>

Specify dsbenchmark's parent POM relativePath

This is required to build controller with no pre-existing controller
artifacts.

Change-Id: I7fe9f6ae015a75ddaa5d53dcdd770a214b4322bb
Signed-off-by: Stephen Kitt <skitt@redhat.com>

BUG 2187 - Persisting shard list in ShardManager

In ShardManager, the local shard list is persisted as a snapshot.
On recovery, persisted shard list is used to create the shards.
During recovery, obtained persisted information is updated to the
configuration so that it is uniformly available to the DatastoreContext.

Incorporated the comments

Also, as localShards are now created after RecoveryCompletion, the
shardManager mbean is associated with the shardManager immediately
after creation. On creating the localShards, the shards addition
is notified to the mbean object.
In the shardManagerTests involving verification of the syncStatus
and CountDownLatch objects, the testcases are made to wait for
localShard creation by waiting for recoveryCompletion message.

Change-Id: I523ed9b14af4b1b6e272f05faac1cf37abfef336
Signed-off-by: kalaiselvik <Kalaiselvi_K@Dell.com>

Remove unused ShardCommitCoordinator constructor parameter

Change-Id: I1c25a18e6f4ed700547f7cc9931d5a44d31c7b93
Signed-off-by: Moiz Raja <moraja@cisco.com>

Bug 4564: Implement datastore restore from backup file

Added a singleton DatastoreSnapshotRestore class that looks for and
reads a restore file in a specific directory and deserializes the datastore
snapshots. The restore file is then deleted.

The DatastoreSnapshotRestore instance needs to be injected into both
DistributedDatastore instances which are created via separate config
system Module instances. However the only way to inject the
DatastoreSnapshotRestore instance would be to define a yang module
and service. I didn't want to go thru the overhead of all that and I
didn't want the DatastoreSnapshotRestore advertised as a service. So I made
it a static singleton that is created via a new bundle Activator class.

The DatastoreSnapshot instance is passed to the ShardManager which
passes each ShardSnapshot to the corresponding Shard actor. On
recovery complete, the RaftActor takes care of applying the restored
snapshot.

Change-Id: Ied3db4e49b98320abb34e2acf73b27b29232f8d6
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>

BUG-865: specify DataTreeType explicitly

This removes the use of the compatibility create() method and specifies
the requested OPERATIONAL data tree explicitly.

Change-Id: Ib0f84202357cd413b43035450af1ecef0898a0ad
Signed-off-by: Robert Varga <rovarga@cisco.com>

Fix resource leaks in test cases

Close AutoCloseable objects created in
test cases that were not being closed.
Add mock calls for close() methods that
now need to be stubbed.
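
One way to guarantee the close, sketched with a hypothetical Resource
type rather than the actual test fixtures, is try-with-resources:

```java
final class CloseSketch {
    // Hypothetical AutoCloseable standing in for a test fixture.
    static final class Resource implements AutoCloseable {
        boolean closed;

        @Override
        public void close() {
            closed = true;
        }
    }

    // Returns the resource after the try block so callers can inspect it.
    static Resource useAndClose() {
        Resource used;
        try (Resource r = new Resource()) {
            used = r;
            // ... test body would go here ...
        }
        // close() has run by this point, even if the body had thrown.
        return used;
    }
}
```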

Change-Id: Iab057a3a1850d024f02656eb1ae82c6fb1486030
Signed-off-by: Gary Wu <gary.wu1@huawei.com>

Bug 2187: Persisting Actor peerIds' in snapshot

Persisting Raft Actor's peer information in a snapshot and recovering the same
from the snapshot.
Incorporated the comments.

Change-Id: I12831f129b2bdeb1c64f473e94be617f8d6ee487
Signed-off-by: kalaiselvik <Kalaiselvi_K@Dell.com>

Make methods static

Private methods which do not touch object state can be made static.

Change-Id: I4f5a7e6215c7570660ee797f4e694745844f72e7
Signed-off-by: Robert Varga <rovarga@cisco.com>

Bug 4564: Implement restore from snapshot in RaftActor

The restore snapshot is supplied by the derived actor's
RaftActorRecoveryCohort. If one exists, the RaftActorRecoverySupport
deserializes and applies the snapshot.

I also add a Builder to MockRaftActor to make it easier to pass
additional params.

Change-Id: Ib52b24331038ed48221cc27086fa3cceafe39fcf
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>

BUG 4554 : Ownership is not cleared when all candidates are removed

When all candidates for an entity get unregistered at approximately
the same time it can create a situation where the owner for the
entity is not cleared. Consequently no entity ownership change is
raised where hasOwner is false even when there are no owners for
the entity.

This could be a problem for applications which do
some action when there are no candidates for an entity. The
openflow application for example relies on the disappearance of
all owners to actually remove a switch from inventory. Without
this event we have the situation that nodes hang around in inventory.

Problem Sequence
----------------

The sequence of events which leads to this problem is as follows.

Let's say member-1 owned entity-1 and there are 3 candidates for
entity-1 - member-1, member-2 and member-3. Now let's say due to
some event all candidates have to unregister. The data
transformations will go like this.

delete member-1
delete member-2
delete member-3
delete member-1 succeeds so choose new owner - in this case member-2
make-owner member-2
delete member-2 succeeds - member-2 is not the current owner so do nothing
delete member-3 succeeds - member-3 is not the current owner so do nothing
make-owner member-2 succeeds. Now we have an owner for entity-1 even though we have no candidates

Solution
--------

The solution proposed in this patch is to set member to empty when
there are no remaining candidates. This changes the above sequence as follows.

delete member-1
delete member-2
delete member-3
delete member-1 succeeds so choose new owner - in this case member-2
make-owner member-2
delete member-2 succeeds - member-2 is not the current owner so do nothing
delete member-3 succeeds - member-3 is the last candidate so set member to ""
make-owner ""
make-owner member-2 succeeds. Now we have an owner for entity-1 even though we have no candidates
make-owner "" succeeds. Now we have owner for entity-1 set to no one as it should be
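
Both sequences above can be replayed with a small simulation. This is
a sketch that assumes operations are applied strictly in submission
order and that follow-up make-owner operations are queued behind the
deletes already pending. It models the sequences only, not the
EntityOwnershipShard code:

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.Set;

final class OwnershipSequenceSketch {
    // Replays the unregistration sequence and returns the final owner.
    static String run(boolean clearOwnerWhenNoCandidates) {
        Set<String> candidates = new LinkedHashSet<>(
            Arrays.asList("member-1", "member-2", "member-3"));
        String owner = "member-1";

        // Pending operations as {kind, member}, applied in submission order.
        Deque<String[]> pending = new ArrayDeque<>();
        for (String member : Arrays.asList("member-1", "member-2", "member-3")) {
            pending.add(new String[] {"delete", member});
        }

        while (!pending.isEmpty()) {
            String[] op = pending.poll();
            String member = op[1];
            if (op[0].equals("delete")) {
                boolean wasOwner = member.equals(owner);
                candidates.remove(member);
                if (candidates.isEmpty()) {
                    // The fix: last candidate gone, so queue make-owner "".
                    if (clearOwnerWhenNoCandidates) {
                        pending.add(new String[] {"make-owner", ""});
                    }
                } else if (wasOwner) {
                    // Owner left while candidates remain: pick the next one.
                    pending.add(new String[] {"make-owner", candidates.iterator().next()});
                }
            } else {
                owner = member; // make-owner succeeds
            }
        }
        return owner;
    }
}
```

run(false) ends with member-2 as a stale owner, matching the problem
sequence; run(true) ends with the empty owner, matching the solution.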

Change-Id: I583e8c6991742ada5846e87da35db255eeed144e
Signed-off-by: Moiz Raja <moraja@cisco.com>

BUG 4615 : Add method on EOS to check if a candidate is registered locally

Change-Id: Iedb2e4cf92553910cf5e1bd85978f88e10bf3c25
Signed-off-by: Moiz Raja <moraja@cisco.com>

Implement LeastLoadedCandidateSelectionStrategy

Change-Id: I09035505bcfa0ef5b2ac357217186ad98db7974c
Signed-off-by: Moiz Raja <moraja@cisco.com>

Maintain EntityOwnershipStatistics

Implementing a LoadBalancing entity owner selection
strategy depends on our ability to find the load on
specific candidates. The EntityOwnershipStatistics collects
this information and provides query methods to access
ownership counts for candidates.
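
A sketch of how ownership counts could drive a least-loaded choice.
The class and method names are hypothetical and are not the
EntityOwnershipStatistics API:

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

final class LeastLoadedSketch {
    // candidate name -> number of entities it currently owns
    private final Map<String, Integer> ownedCount = new HashMap<>();

    // Records an ownership change; null or "" means "no owner".
    void onOwnerChanged(String oldOwner, String newOwner) {
        if (oldOwner != null && !oldOwner.isEmpty()) {
            ownedCount.computeIfPresent(oldOwner, (k, v) -> v - 1);
        }
        if (newOwner != null && !newOwner.isEmpty()) {
            ownedCount.merge(newOwner, 1, Integer::sum);
        }
    }

    int ownedBy(String candidate) {
        return ownedCount.getOrDefault(candidate, 0);
    }

    // Picks the candidate owning the fewest entities; ties go to the
    // first candidate in iteration order. Returns null if there are none.
    String selectLeastLoaded(Collection<String> candidates) {
        String best = null;
        for (String candidate : candidates) {
            if (best == null || ownedBy(candidate) < ownedBy(best)) {
                best = candidate;
            }
        }
        return best;
    }
}
```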

Change-Id: I7e812b15e8fb21e3be1aed10384600b9acb8bf20
Signed-off-by: Moiz Raja <moraja@cisco.com>

Add a mechanism to read the entity owner selection strategies from a config file

Change-Id: Ie951e4f83aaf38f00e959f4243820a88cb988788
Signed-off-by: Moiz Raja <moraja@cisco.com>
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>

Pass in EntityOwnerSelectionStrategyConfig when constructing DistributedEntityOwnershipService

Change-Id: Iad1014db726a06de9a89a9987216ca4c96981122
Signed-off-by: Moiz Raja <moraja@cisco.com>

Pass in EntityOwnerSelectionStrategyConfig when constructing EntityOwnershipShard

Change-Id: I56c2f4f87c61e81b662cd0b30c60775389e9b9a3
Signed-off-by: Moiz Raja <moraja@cisco.com>

Allow passing of delay to the EntityOwnerElectionStrategy

Change-Id: If745443585e68a26c10622a7888ec52dbee0059c
Signed-off-by: Moiz Raja <moraja@cisco.com>

Add Delayed Owner selection based on strategy

Change-Id: I04fc216ffc7e5c3fd35b34b6d03a5030c359d77f
Signed-off-by: Moiz Raja <moraja@cisco.com>

Bug 2187: AddServer: check if already exists

On AddServer, if the new server already exists as a peer return
ALREADY_EXISTS status reply.

Change-Id: I3b324850e1f05fce72eced3b2ced52f1510973fe
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>

Bug 2187: Increases test coverage in RaftActorRecoverySupport

This is a follow-up patch to
https://git.opendaylight.org/gerrit/#/c/29112/ to add more unit test
coverage.

Change-Id: I1dcd87c9bed55b75eed03e7736b0165f656f661f
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>

Bug 2187: Return OK reply after AddServer persist

The AddServer processing was changed to return OK reply as soon as the
new ServerConfigurationPayload is persisted without waiting for
consensus. Prior, since the new server config is applied immediately in
the leader, if consensus wasn't reached, this would cause the
ShardManager on the calling side to delete new follower actor, resulting
in a "zombie" peer in the leader. Even if consensus isn't reached, the
new server config would've at least most likely been replicated to the
new follower and other down followers would eventually be replicated
when they come back up.

Change-Id: I425fa78d5dd023feda7913ed8d1b5b6c285ccae4
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>

BUG 4589 : Handle writing and reading large strings

Change-Id: If81926757aef3c1275ba43a7cf8c7adf94d86e08
Signed-off-by: Moiz Raja <moraja@cisco.com>
(cherry picked from commit 28484d59aa626dd4b32cdeb2d10dbc2c47cc051a)

Bug 4564: Add Shard Builder class

Added a Builder class to Shard to replace the props and Creator
classes to make it easier to pass new params to Shard w/o having
to change a lot of code and unit tests. An upcoming patch will add
a new param.

Change-Id: I122747d0cc6c14f090026efe81425e1e1e4edc37
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>

Remove duplicate junit dependency

odlparent is already declaring the scope as test, no need to repeat
that. Fixes warnings in autorelease.

Change-Id: Ia0b6550d2ecbce80eefa168d78c8b50e29100698
Signed-off-by: Robert Varga <rovarga@cisco.com>

Introduce EntityOwnerSelectionStrategy

Currently the EntityOwnershipService does not do any load
balancing, in that it allows the first candidate that registers
to become an owner. There is a need to do that so that applications
which choose to do some *work* based on if it owns an entity can
scale better.

This patch introduces the concept of an EntityOwnerSelectionStrategy
with the intent to provide custom strategies later to choose an owner.

Since custom strategies require intimate knowledge of how the
EntityOwnershipShard chooses a leader at this time I do not think
a strategy can be passed to the EntityOwnershipService via API. The
intent therefore is to choose a strategy based on configuration
wherein a custom strategy can be chosen for each entity type. If
the Strategy needs any custom configuration then it can have configuration
files of its own.

Change-Id: Ia53b8edb59fb1d06a426d9d9a95c07ef4ae65cd1
Signed-off-by: Moiz Raja <moraja@cisco.com>

Bug 2187: Recover Peer Id's and Update peer map during Journal recovery

Recover ServerConfigurationPayload ReplicatedLogEntry's and immediately apply to the peer map in RaftActorContext.
Review comments incorporated.

Change-Id: I1b1b3c21e83eb5ea799dd040a4da8f78f1155082
Signed-off-by: Rajesh_Sindagi <Rajesh_Sindagi@dell.com>

Clean up duplicate/unused dependencies and properties

Remove dependencies and properties provided in odlparent (with the
same versions).

org.json.version in features/mdsal/pom.xml is unused.

A few properties are only used once, in controller, so replace them
with the version in-place.

(All this will allow a number of properties to be removed from
odlparent.)

Change-Id: I07e9f2298ebd008d82b22b156dc2ddce50151641
Signed-off-by: Stephen Kitt <skitt@redhat.com>

Cache config QNameModules

Use pre-instantiated and cached QNames, so we do not end up wasting
space unnecessarily.

Change-Id: I7ff7b9a098fbf182770d07ccbd0b9bb60334fb82
Signed-off-by: Robert Varga <rovarga@cisco.com>

BUG-4556: lazy computation of MXBean maps

Further analysis of our feature:install CPU usage shows that we spend
an inordinate amount of time constructing MXBean maps. Make the
construction more asynchronous.

Change-Id: I69450bfe8debb65160c40aed6a75ff3d3bef831d
Signed-off-by: Robert Varga <rovarga@cisco.com>

Set odlparent-lite as parent for benchmark/pom.xml

Change-Id: I80e8c621a909fd4dde0a7d25d887ea4523451ce6
Signed-off-by: Vratko Polak <vrpolak@cisco.com>

Bug 4560: Improve config system logging for debuggability

Manually cherry-picked from
https://git.opendaylight.org/gerrit/#/c/28985 as the files have moved in
master.

Also the code has changed slightly in master, specifically the
ConfigPusherImplTest no longer uses a Thread uncaught exception handler
for verification. However it does rely on exceptions thrown from the
ConfigPusherImpl so, to keep the same behavior, I added a
propagateExceptions flag to ConfigPusherImpl#process. The
ConfigPersisterActivator production code passes false so unchecked
exceptions aren't handled as uncaught exceptions.

Change-Id: Iabc22030abc22cf11a1476986ba3d3366021b4fb
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>

Set odlparent-lite as artifacts parent

Change-Id: I4ae4994db55739460ca5d326865d7e704a2b8e26
Signed-off-by: Thanh Ha <thanh.ha@linuxfoundation.org>

Bug 4564: Implement clustering backup-datastore RPC

Added a new RPC backup-datastore to send the GetSnapshot message to the
ShardManager's and persist the list of DatastoreSnapshots to a file.

I also renamed the cluster-config yang module to cluster-admin to make
it more general as the backup RPC isn't related to configuration.

Change-Id: I18e5d47f7052b890c3547066145e4d5d0fbe1277
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>

Bug 4564: Implement GetSnapshot message in ShardManager

Added a serializable DatastoreSnapshot class that stores the serialized
snapshot for each shard.

On GetSnapshot, the ShardManager sends a GetSnapshot message to each
shard and creates a ShardManagerGetSnapshotReplyActor to compile the
replies and return a DatastoreSnapshot instance to the caller.

Change-Id: I11f872aa701f1e51de9cbccdc1a372a76bc45cff
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>

Bug 4564: Implement GetSnapshot message in RaftActor

Added a new client message, GetSnapshot, to return a serialized Snapshot
instance. The implementation just captures the snapshot for return and does
not persist it. If data persistence isn't enabled, it does not initiate a
capture and returns a serialized Snapshot instance containing just the
persistable state, eg election term info.

Change-Id: I9ea7fc8e0e60c4d6874f5eb0188543e1d9b51243
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>

Bug 4149: Implement per-shard DatastoreContext settings

Added the ability to specify shard-specific settings in the .cfg file by
prefixing the shard name to the property name, similar to what we allow
at the datastore level.

I added a DatastoreContextFactory that has methods to get the base
DatastoreContext and a per-shard DatastoreContext. The
DatastoreContextFactory is now passed to the ShardManager instead of the
DatastoreContext. The DatastoreContextFactory uses the
DatastoreContextIntrospector to overlay per-shard settings onto the
base DatastoreContext.
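
The overlay can be sketched as follows. This assumes a "." separates
the shard-name prefix from the property name and that base properties
contain no "."; both are assumptions, as are the class name and the
property key used in the usage note:

```java
import java.util.HashMap;
import java.util.Map;

final class PerShardSettingsSketch {
    // Returns the base settings with "<shardName>.<key>" entries overlaid.
    static Map<String, String> forShard(Map<String, String> cfg, String shardName) {
        Map<String, String> result = new HashMap<>();
        String prefix = shardName + ".";

        // First pass: base settings (keys without a shard prefix).
        for (Map.Entry<String, String> e : cfg.entrySet()) {
            if (e.getKey().indexOf('.') < 0) {
                result.put(e.getKey(), e.getValue());
            }
        }
        // Second pass: this shard's overrides replace the base values.
        for (Map.Entry<String, String> e : cfg.entrySet()) {
            if (e.getKey().startsWith(prefix)) {
                result.put(e.getKey().substring(prefix.length()), e.getValue());
            }
        }
        return result;
    }
}
```

For example, with election-timeout-factor=2 and
topology.election-timeout-factor=10 in the input, the topology shard
resolves the value 10 while every other shard keeps 2.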

Change-Id: I329c98c1577a74ebe665052f76e28da3867e2e86
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>

Added the data store benchmark (dsbenchmark, Bug 4519, https://bugs.opendaylight.org/show_bug.cgi?id=4519)

Change-Id: Ibc6d214b43b6353adbc49ba7b5b4a302ae1fbd95
Signed-off-by: Jan Medved <jmedved@cisco.com>

Speed up YangStoreService

Change-Id: Ibaf972650045b5d85be155f653f7eef36aae6c6e
Signed-off-by: Robert Varga <rovarga@cisco.com>

Bug 4563: Increase akka seed-node-timeout

Change-Id: I8f17872ef30a96d58a666e3499cf42ab59f0491d
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>

Fix precondition string

The string has been corrupted, fix it up.

Change-Id: I36312ca4e5ca6365b3003a2ad57ca2734d156578
Signed-off-by: Robert Varga <rovarga@cisco.com>

Improve YangStoreService performance

Simple changes to eliminate synthetic methods and unneeded duplication
of collections.

Change-Id: I370d4ed85720e2b7eb811204afa9f532b716b16d
Signed-off-by: Robert Varga <rovarga@cisco.com>

Do not subclass Hashtable

Rather than subclassing, instantiate a Hashtable and fill it.
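
For illustration, one common way such a subclass arises is
double-brace initialization. Whether the original code used that idiom
is an assumption; the sketch only contrasts the two forms:

```java
import java.util.Hashtable;

final class HashtableSketch {
    // Double-brace initialization: creates an anonymous Hashtable subclass.
    static Hashtable<String, String> subclassed() {
        return new Hashtable<String, String>() {{
            put("type", "example");
        }};
    }

    // Plain instantiation: same contents, no extra class.
    static Hashtable<String, String> plain() {
        Hashtable<String, String> props = new Hashtable<>();
        props.put("type", "example");
        return props;
    }
}
```

Both methods return equal tables, but only plain() returns an object
whose runtime class is java.util.Hashtable itself.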

Change-Id: Icfd4e812759874a702a2506e9090cd20535bdc50
Signed-off-by: Robert Varga <rovarga@cisco.com>

BUG 3973: Add config option for Java-only leveldb

Add comment in akka.conf on how to use the Java-only
version of leveldb for platforms where native leveldb
is unavailable.

Change-Id: I5693522597152ef7f86bb89d4be32e20f0582806
Signed-off-by: Gary Wu <gary.wu1@huawei.com>

Add leader unit test for non-voting consensus

Added a test case to LeaderTest to verify a non-voting follower
does not influence replication consensus.

Also I saw intermittent test failures (in jenkins as well during first
verify build) due to a message going to dead letters shortly after
actor creation (also reported in Bug 4223). Specifically it was occurring
when the leader sent the initial AppendEntries heartbeat to a follower. This
seems like a timing issue/bug in akka when using an ActorSelection. I
added code in the TestActorFactory to use an actorSelection and call
resolveOne in a retry loop. This seems to alleviate the issue as I ran
LeaderTest over 1000 times successfully.

Change-Id: I65cb87f419c280befe2d82300a981bd8e6f88742
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>

Bug 2187: Address comments in https://git.opendaylight.org/gerrit/#/c/28596/

Addressed minor comments in https://git.opendaylight.org/gerrit/#/c/28596/.

Unified the response messages and debug messages.

Added persistenceId() format param to the debug messages for additional
context.

Change-Id: Ic1a4e852126425cf7ae67ee5b9ea301b06a3f9a8
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>

Always persist ServerConfigurationPayload log entries

We need to always persist ServerConfigurationPayload log entries
regardless of whether or not persistence is enabled for the derived
RaftActor's data.

I added a new tagging interface PersistentPayload, implemented by
ServerConfigurationPayload, to indicate a Payload
needs to always be persisted. Since log entries are persisted by
both the RaftActor and Follower behavior via the ReplicatedLog, the
logic to determine persistence based on PersistentPayload needs to be
available to both. The ReplicatedLog uses the persistence provider
contained in the RaftActorContext which is the
DelegatingPersistentDataProvider set by the RaftActor. So to keep
the rest of the code the same and keep it simple, I derived a
RaftActorDelegatingPersistentDataProvider which overrides persist to
handle the PersistentPayload logic utilizing the RaftActor's
existing PersistentDataProvider.
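
A simplified sketch of the tagging-interface approach, under stated
assumptions: the types below are self-contained stand-ins, not the real
RaftActor classes, and the DataProvider interface with its single
persist(Object) method is hypothetical. It only illustrates routing
tagged payloads to a provider that always persists.

```java
import java.util.ArrayList;
import java.util.List;

public class PersistentPayloadSketch {
    /** Stand-in for the Raft Payload base type. */
    public interface Payload { }

    /** Tagging interface: a Payload that must always be persisted. */
    public interface PersistentPayload { }

    public static final class ServerConfigPayload implements Payload, PersistentPayload { }

    public static final class DataPayload implements Payload { }

    /** Hypothetical, minimal persistence abstraction. */
    public interface DataProvider {
        void persist(Object entry);
    }

    /** Records everything it is asked to persist. */
    public static final class RecordingProvider implements DataProvider {
        public final List<Object> persisted = new ArrayList<>();

        @Override
        public void persist(Object entry) {
            persisted.add(entry);
        }
    }

    /** Drops everything, as when persistence is disabled. */
    public static final class NoopProvider implements DataProvider {
        @Override
        public void persist(Object entry) {
            // intentionally empty
        }
    }

    /** Sends tagged payloads to the persistent provider, all others to the delegate. */
    public static final class RoutingProvider implements DataProvider {
        private final DataProvider delegate;
        private final DataProvider persistent;

        public RoutingProvider(DataProvider delegate, DataProvider persistent) {
            this.delegate = delegate;
            this.persistent = persistent;
        }

        @Override
        public void persist(Object entry) {
            if (entry instanceof PersistentPayload) {
                persistent.persist(entry);
            } else {
                delegate.persist(entry);
            }
        }
    }
}
```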

Change-Id: I243026b28ed57461ad92324b6947091ae74a7127
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>

Derive MockRaftActorContext from RaftActorContextImpl

I changed MockRaftActorContext to derive from RaftActorContextImpl since
it duplicates most of the functionality in RaftActorContextImpl and,
with the addition of PeerInfo, MockRaftActorContext can now provide the
same functionality as RaftActorContextImpl w/o having to duplicate it in
MockRaftActorContext. Also this will make it easier when the RaftActorContext
interface is changed.

Change-Id: Ief90232fc992a50b3f0fea5ece323a14916760f2
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>

BUG-3381: Capture Snapshot on recovery if journal is not empty

Change-Id: Ib1068cb6d4848d151039887b51458399ff421178
Signed-off-by: evvy <dhiraviam.natarajan@gmail.com>

Add wait state for AddServer if snapshot in progress

It is possible a snapshot capture could be in progress when we
attempt to initiate snapshot capture on AddServer. I added a wait
state to the FSM and a new message, SnapshotComplete, that is sent
by the SnapshotManager.

Added more unit test cases.

Change-Id: I119a264e03686ea70f7834e551c2fb45dd39f903
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>

BUG 2187 - Creating ShardReplica

Creating local shard replica with a custom RaftPolicy. Informs Shard leader of the local shard.
Processes AddServerReply from shard leader.
On successful replication, makes local shard voting capable.
On replication failure, local shard is removed.

Incorporated the review comments.

Change-Id: Id2b90039c39211b20322bc2d141520723d44c391
Signed-off-by: kalaiselvik <Kalaiselvi_K@Dell.com>

BUG-2187: Non voting and Uninitialized followers are not to be counted towards consensus

Change-Id: I1ba86cf2e2f904847ea8f819e84a3dc54fcc31d2
Signed-off-by: Rajesh_Sindagi <Rajesh_Sindagi@dell.com>

Add voting state to ServerConfigurationPayload

Changed the internal state to a list of ServerInfo instances which
contain the server id and voting state.

Also removed the oldServerConfig field as it won't be needed.

Change-Id: I10b3ca8dc2ffed9b5db0a7d0f6ca74d73a837b8e
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>

Fix small bug in startup archetype

Change-Id: I83913ed9f16b38f6e6fd461b76dece1a09f4c8ca
Signed-off-by: Ed Warnicke <hagbard@gmail.com>

BUG-2399: fixup tests

The test model specifies the top-level container as structural, yet the
tests expect it to exist when empty. Mark the container as presence,
restoring behavior expected by tests.

Change-Id: Ided99720468a8bee14d5c66342e524450f5a9050
Signed-off-by: Robert Varga <rovarga@cisco.com>

Introduce PeerInfo and VotingState

We need to store the voting state for each peer so I created a
PeerInfo class to include id, address and voting state (represented by a
VotingState enum). The RaftActorContext now stores PeerInfo instances
in its peer map and added methods to access PeerInfo. As a consequence,
RaftActorContext#getPeerAddresses was no longer needed and was removed.

AbstractLeader and Candidate were modified to utilize the PeerInfo to
calculate the majority vote/min replication count, i.e. ignore non-voting peers.
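
A rough sketch of a majority calculation that ignores non-voting peers.
The VotingState constants, the PeerInfo shape and the majorityVoteCount
method are assumptions for illustration and may not match the actual
classes. The sketch also assumes the local member itself is voting.

```java
import java.util.Collection;

public class VotingMajoritySketch {
    /** Assumed constants; the real enum may differ. */
    public enum VotingState { VOTING, NON_VOTING, VOTING_NOT_INITIALIZED }

    /** Simplified peer record: id, address and voting state. */
    public static final class PeerInfo {
        private final String id;
        private final String address;
        private final VotingState votingState;

        public PeerInfo(String id, String address, VotingState votingState) {
            this.id = id;
            this.address = address;
            this.votingState = votingState;
        }

        public String getId() {
            return id;
        }

        public String getAddress() {
            return address;
        }

        public boolean isVoting() {
            return votingState == VotingState.VOTING;
        }
    }

    /**
     * Number of votes (or replicas) needed for a majority, counting the
     * local member plus all peers that are currently voting.
     */
    public static int majorityVoteCount(Collection<PeerInfo> peers) {
        int votingMembers = 1; // the local member
        for (PeerInfo peer : peers) {
            if (peer.isVoting()) {
                votingMembers++;
            }
        }
        return votingMembers / 2 + 1;
    }
}
```

With two voting peers and one non-voting peer there are three voting
members, so two votes form a majority; the non-voting peer does not
raise the threshold.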

Previously we had added a FollowerState enum and stored it in the
FollowerLogInformation. Since voting state is now stored in the
RaftActorContext peer info, I removed the FollowerState from
FollowerLogInformation to avoid redundancy and having to keep both
up to date.

Change-Id: I1394511a8db7f0b9df3ed7879c77c1f44f3b143d
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>

Bump Akka to 2.3.14

Change-Id: Ia6bf3f1a4c025ec1e84662c04ccdc40c04e569a2
Signed-off-by: Gary Wu <gary.wu1@huawei.com>

Remove checks for NormalizedNodeBuilderWrapper

Interface contract already guarantees returned objects are subclasses of
NormalizedNodeBuilderWrapper, so the instanceof checks guard only against
null.

Switch to explicit assertNotNull() to reduce Eclipse warnings.

Change-Id: Ibf0d73752c6e1ebeacbb10677e2f11f185098bd9
Signed-off-by: Robert Varga <rovarga@cisco.com>

Do not use MoreExecutors.sameThreadExecutor()

This method is deprecated, replace it with proper service/executor.

Change-Id: I7257a28f28784313cafc250f2c2fd1c623332dec
Signed-off-by: Robert Varga <rovarga@cisco.com>

Make REUSABLE_*_TL final

Since these are public static fields, they should be final to prevent
possible shenanigans.

Change-Id: I4a360e060ddde57a73118bcf3d053ce397204136
Signed-off-by: Robert Varga <rovarga@cisco.com>

Reduce ShardDataTree#getDataTree() callsites

A lot of these callsites perform a specific function, expose those
functions without leaking the DataTree. This is needed to handle
asynchronous persistence and optimistic transaction commit.

Change-Id: I330cb4172349e0d1d8daacc3aafce7dad64cd8b2
Signed-off-by: Robert Varga <rovarga@cisco.com>

Do not declare unneeded Exception throw

Fixes sonar warnings

Change-Id: I31ab95c75cf30b33c9025d6f6e4662ccc5df7a47
Signed-off-by: Robert Varga <rovarga@cisco.com>

Make private methods static

These methods do not reference object state and therefore can be made
static.

Change-Id: I416e415b90647b4f700b7893fe4f64f479271fab
Signed-off-by: Robert Varga <rovarga@cisco.com>

Add getPeerIds to RaftActorContext

For upcoming work to add voting status to the peer info in
RaftActorContext, I added a getPeerIds method to replace calls to
getPeerAddresses as virtually all callers really just want the IDs or want
to check the size. getPeerAddresses will (likely) be removed altogether -
this is a preliminary patch.

Change-Id: I2b6f2c36dfec14ccd4bbfef35e67ed86cf3e3e45
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>

Fix resource leaks in TransactionChainProxyTest

Close TransactionChainProxy objects (AutoCloseable)
that were not being closed in the test cases.

Change-Id: I85b1f951545b764007bdb2e808a2438c9bd4b2b2
Signed-off-by: Gary Wu <gary.wu1@huawei.com>

Update leveldbjni version to support Solaris

Change-Id: I46de5b3cc9c220a70a408194fb3ff709cdff1937
Signed-off-by: rshoaib <rao.shoaib@oracle.com>

Bug 2187: Code cleanup and refactoring

I addressed remaining comments from a prior patch.

I also refactored RaftActorServerConfigurationSupport to use an FSM
similar to the SnapshotManager with some generic classes. This will
make it easier to implement RemoveServer and reuse code.

Change-Id: Id3cdcede3f9c393c878abd3e9a9d3a5e12c5fb8a
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>

Remove unused Jersey dependencies from the controller

The code utilizing Jersey was moved to the netconf project. This change
removes some of the deprecated dependencies.

Change-Id: I62b944497c976b1251412d8d047ef833e69dfb0a
Signed-off-by: Ryan Goulding <ryandgoulding@gmail.com>

Remove unnecessary @SuppressWarnings

Change-Id: I2b59e7f29a15298c1135c12b6bd9699205706600
Signed-off-by: Gary Wu <Gary.Wu1@huawei.com>

Fix resource leaks in exception handling

Fix resource leaks when exceptions are
encountered during ConfigManagerActivator.start().

Change-Id: Ic12c756aa5a768add0bc62e71eed94e5b2fa5fea
Signed-off-by: Gary Wu <Gary.Wu1@huawei.com>

Bug 4037: Allow auto-downed node to rejoin cluster

This patch will detect when a node has been
auto-downed/quarantined by another node. When this
happens, the ActorSystem of the datastore will be
restarted to allow the node to rejoin the cluster.

Change-Id: I0913bf455d426b6a0fccb17eac61b74f0911fa5d
Signed-off-by: Gary Wu <Gary.Wu1@huawei.com>

Bug 2187: AddServer unit test and bug fixes

Follow-up patch to https://git.opendaylight.org/gerrit/#/c/28018/.

Got the unit tests working and added more unit tests to cover more code.

Also fixed several bugs in the code that were failing the tests. One bug
was caused by replicating data quickly after install snapshot was
complete. On the final install snapshot chunk the follower sends an
ApplySnapshot message to persist and apply the snapshot. On the reply,
the leader assumes the follower is up-to-date and sets its next index.
However, applying the snapshot, i.e. updating the log and commit index, is
actually done after the async callback from the snapshot persist. In between
that time, if the leader sends the server config AppendEntries, the follower's
log is still empty and it deems itself out-of-sync and reports back failure.
This will cause the leader to eventually send a new install snapshot,
which is not desirable. Also it may delay consensus for the
server config entry.

To fix this, I delayed the final InstallSnapshotReply until after the
ApplySnapshot is complete. I did this by adding a Callback to the
ApplySnapshot message which the SnapshotManager invokes.
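
The ordering this fix enforces can be sketched as below. The classes
are simplified stand-ins, not the real ApplySnapshot or SnapshotManager:
the Callback interface shape, SnapshotManagerStub and completePersist()
are hypothetical, and the async persist is simulated with a queue.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class ApplySnapshotCallbackSketch {
    /** Invoked once the snapshot has actually been applied. */
    public interface Callback {
        void onSuccess();
    }

    /** Message asking for a snapshot to be persisted and applied. */
    public static final class ApplySnapshot {
        final String snapshot;
        final Callback callback;

        public ApplySnapshot(String snapshot, Callback callback) {
            this.snapshot = snapshot;
            this.callback = callback;
        }
    }

    /** Stand-in manager: persist completes later, via completePersist(). */
    public static final class SnapshotManagerStub {
        private final Deque<Runnable> pending = new ArrayDeque<>();
        public final List<String> events = new ArrayList<>();

        public void apply(ApplySnapshot msg) {
            events.add("persist requested");
            // Defer applying the snapshot until the persist has completed.
            pending.add(() -> {
                events.add("snapshot applied");
                msg.callback.onSuccess();
            });
        }

        /** Simulates the async persist callback firing. */
        public void completePersist() {
            Runnable next = pending.poll();
            if (next != null) {
                next.run();
            }
        }
    }
}
```

In this sketch a follower would send the final InstallSnapshotReply from
onSuccess(), so the reply cannot be observed before the log and commit
index have been updated.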

Also the new server config was constructed without the leader's ID - it
needs to contain all members.

Also the ServerConfigurationPayload wasn't being applied in the
followers.

Another issue was that, if the leader had no peers initially, the
heartbeat wasn't scheduled so, when the new server was added, heartbeats
weren't occurring. So I changed addFollower to schedule the heartbeat.

I added a test for adding a non-voting server which caused an endless
loop in AbstractLeader#handleAppendEntriesReply where it updates the
commitIndex based on the replicated count. To fix this, I added a break
if the replicatedLogEntry is null.

Change-Id: I5dff351140c611d58357cd58900bed401606038c
Signed-off-by: Tom Pantelis <tpanteli@brocade.com>

BUG 2187 - JMX API for create/delete shard replica

Change-Id: I48a4dcb7983f5f231e9ddc04e851950abf7c2d8a
Signed-off-by: kalaiselvik <Kalaiselvi_K@Dell.com>

BUG-2187: Add Server - Leader Implementation

Processes the addServer request from the follower; if the receiver is
not the leader, the request is forwarded to the shard leader.

The follower shard replica data is brought in sync with the leader by
installing the snapshot from the shard leader. On successful application
of the snapshot data, this voting but not initialized member is
transitioned to a voting member. The new server configuration is
persisted and replicated to a majority of the followers, and an OK
message is sent back to the shard follower.

If the leader is unable to sync data to the follower within a configured
time period, a TIMEOUT message is sent back to the shard follower
without adding/persisting the new server configuration.

Change-Id: I9a3870d14bb6ad532ff64f315b2e2000d8b803e2
Signed-off-by: Rajesh_Sindagi <Rajesh_Sindagi@dell.com>