Robert Varga [Wed, 2 Apr 2025 19:41:33 +0000 (21:41 +0200)]
Fix CompressionSupport.NONE
We should be handing out a PlainSnapshotSource, not an LZ4-compressed
one.
JIRA: CONTROLLER-2134
Change-Id: I41fc06d7a28a5d35e084645612c0deb6f3145c13
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
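To illustrate the kind of mix-up the CompressionSupport.NONE fix above addresses, here is a minimal sketch. Only CompressionSupport, NONE and PlainSnapshotSource are names from the commit message; SnapshotSource as a plain interface, Lz4SnapshotSource, sourceFor() and rawBytes() are assumptions made for illustration and may not match the real classes.

```java
// Hypothetical sketch: sourceFor(), rawBytes() and Lz4SnapshotSource are invented here.
interface SnapshotSource {
    byte[] rawBytes();
}

record PlainSnapshotSource(byte[] rawBytes) implements SnapshotSource {}

record Lz4SnapshotSource(byte[] rawBytes) implements SnapshotSource {}

enum CompressionSupport {
    NONE {
        @Override
        SnapshotSource sourceFor(byte[] rawBytes) {
            // Handing out the LZ4 flavour here would make readers try to
            // decompress bytes that were never compressed.
            return new PlainSnapshotSource(rawBytes);
        }
    },
    LZ4 {
        @Override
        SnapshotSource sourceFor(byte[] rawBytes) {
            return new Lz4SnapshotSource(rawBytes);
        }
    };

    // Wraps stored bytes in the source type matching this compression setting.
    abstract SnapshotSource sourceFor(byte[] rawBytes);
}
```

With this shape, a caller that picks the constant from configuration gets the matching source type without further branching.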
Robert Varga [Wed, 2 Apr 2025 02:58:25 +0000 (04:58 +0200)]
Rename SnapshotFileFormat to CompressionSupport
The idea of having a single compressed stream will not work quite well.
Take a first step by separating what we have into CompressionSupport.
JIRA: CONTROLLER-2134
Change-Id: I2140b1fa84ebdd9f473b07f14bb673049614a941
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sun, 23 Mar 2025 15:03:18 +0000 (16:03 +0100)]
Propagate stateDir to LocalAccess
We will need the top-level directory for persistent RaftStorage outside
of Pekko persistence. Make sure we make it available from LocalAccess.
JIRA: CONTROLLER-2134
Change-Id: I7c8130cc89a024f8d276a988787d32e9b0c93b53
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Tue, 1 Apr 2025 15:53:09 +0000 (17:53 +0200)]
Move base.messages.CaptureSnapshot
This class is no longer being sent to actor, but rather is a plain
holder used mostly by SnapshotManager.
Move the class into SnapshotManager, hiding its constructor and dropping
the ControlMessage part.
Change-Id: I8bd223bf4b6c16f2a9afc5c52bdf30c096ecbeb8
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Tue, 1 Apr 2025 15:40:28 +0000 (17:40 +0200)]
Clean up Leader
Use local variables to squash nullness warnings and make
LeadershipTransferContext properly constant.
Change-Id: I50e48092d4f3d36b7ce8fee6df79cfa6c67317e2
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Tue, 1 Apr 2025 15:28:44 +0000 (17:28 +0200)]
Improve sal-akka-raft assertions
Use assertInstanceOf()/assertSame() instead of referencing the
behavior's role.
Change-Id: I58d65fea5676b95f76ca00b40d5a9e44a33d8dfe
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Tue, 25 Mar 2025 12:55:11 +0000 (13:55 +0100)]
Rewire SnapshotManager.captureToInstall()
Do not make a roundtrip through actor messages, but rather use the
facilities provided by RaftStorage to perform off-loaded serialization.
This eliminates the offload actor from sal-distributed-datastore, as
well as the need for CaptureSnapshotReply, as we now signal completion via
executeInSelf() and callbacks.
JIRA: CONTROLLER-2134
Change-Id: I0518e7eb80559382ede5b4b4e7db41c66fd6a564
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Tue, 25 Mar 2025 07:31:16 +0000 (08:31 +0100)]
Add DataPersistenceProvider.streamToInstall()
This is the first step in having asynchronous access to a snapshot
bytestream: RaftStorage now exposes streamToInstall() method, which
writes out a snapshot in the background and invokes a callback once
that is completed.
JIRA: CONTROLLER-2134
Change-Id: Ic650dbf4b31c62135dbc7b28fa9682f1ef5bf824
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
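The background-write-then-callback flow described for streamToInstall() above could look roughly like the following. This is a sketch under stated assumptions, not the actual RaftStorage contract: SnapshotWriter, BackgroundSnapshotStreamer, the BiConsumer callback shape and streamAndWait() are all hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.BiConsumer;

// Hypothetical writer abstraction; not the real RaftStorage contract.
interface SnapshotWriter {
    void writeTo(OutputStream out) throws IOException;
}

final class BackgroundSnapshotStreamer implements AutoCloseable {
    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    // Serializes in the background, then reports either the bytes or the failure.
    void streamToInstall(SnapshotWriter writer, BiConsumer<byte[], Throwable> callback) {
        executor.execute(() -> {
            byte[] bytes;
            try (var out = new ByteArrayOutputStream()) {
                writer.writeTo(out);
                bytes = out.toByteArray();
            } catch (IOException | RuntimeException e) {
                callback.accept(null, e);
                return;
            }
            // Invoked outside the try block so a failing callback is not reported twice.
            callback.accept(bytes, null);
        });
    }

    @Override
    public void close() {
        executor.shutdown();
    }

    // Convenience for demos: block the caller until the callback has fired.
    static byte[] streamAndWait(SnapshotWriter writer) {
        try (var streamer = new BackgroundSnapshotStreamer()) {
            var future = new CompletableFuture<byte[]>();
            streamer.streamToInstall(writer, (bytes, failure) -> {
                if (failure != null) {
                    future.completeExceptionally(failure);
                } else {
                    future.complete(bytes);
                }
            });
            return future.join();
        }
    }
}
```

In an actor setting the callback would presumably hop back onto the actor's own thread rather than complete a future; the blocking helper exists only to keep the sketch self-contained.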
Robert Varga [Tue, 1 Apr 2025 06:44:49 +0000 (08:44 +0200)]
Fix RaftStorage startup
PersistenceControl fails to start enabledStorage, leading to all sorts
of mayhem. Fix that up.
JIRA: CONTROLLER-2134
Change-Id: I54023b01f83c7657846253da8e1f505fc52584f6
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Tue, 1 Apr 2025 06:11:03 +0000 (08:11 +0200)]
Silence PropertiesTermInfoStore
Log NoSuchFileException at trace(), reducing clutter in tests.
JIRA: CONTROLLER-2133
Change-Id: I40c1855c957ad82e84f3a02fcedb1167845dc28d
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Mon, 31 Mar 2025 19:45:29 +0000 (21:45 +0200)]
Allow TestDataProvider's execution to be adjusted
We have a use case where we would like to delay execution of
DataProvider callback. Ditch the shared instance and allow the executor
to be set.
JIRA: CONTROLLER-2134
Change-Id: I0e99bb51eb549f6db3e27f5fdc600f6c3c97f097
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Mon, 31 Mar 2025 17:28:05 +0000 (19:28 +0200)]
Centralize MockRaftActorSnapshotCohort methods
We have 4 implementations doing the same thing. Use default methods to
reduce duplication.
JIRA: CONTROLLER-2134
Change-Id: I3969590e94b2d138ac665221d7d31b0f0ac3a064
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Mon, 31 Mar 2025 16:40:32 +0000 (18:40 +0200)]
Use ByteStateSnapshotCohort in SnapshotManagerTest
SnapshotManagerTest is dealing with ByteState mostly and
ByteStateSnapshotCohort can widely implement the deserializeSnapshot()
method.
Use ByteStateSnapshotCohort there, which will make further changes
easier.
JIRA: CONTROLLER-2134
Change-Id: I672f934ddaf31c8a5e2cfd5d17664fc8985c4858
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Fri, 28 Mar 2025 20:37:20 +0000 (21:37 +0100)]
Propagate start/stop to RaftStorage
RaftStorage needs to be an active component to deal with async
persistence, synchronizing persistent content, etc. This patch adds two
lifecycle hooks in RaftActor to control the lifecycle.
JIRA: CONTROLLER-2134
Change-Id: I4e225d68628c696b4bcc80b9fbb04fd7d157ee2f
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Fri, 28 Mar 2025 19:47:59 +0000 (20:47 +0100)]
Add FileBackedOutputStream.Configuration
Encapsulate the two options and propagate them to RaftStorage for later
use.
JIRA: CONTROLLER-2134
Change-Id: I341b8ef3891f355c59167ba9235a960b7672bca6
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Fri, 28 Mar 2025 16:36:52 +0000 (17:36 +0100)]
Use Path instead of String for temp directory
Let's be type-safe, so that things do not get mixed up.
JIRA: CONTROLLER-2134
Change-Id: I8f5400213fc3cf424d61d83379d90fabb192d58e
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Fri, 28 Mar 2025 16:59:54 +0000 (17:59 +0100)]
Clean up sal-clustering-commons pom.xml
sal-clustering-commons no longer needs checker-qual, remove that
dependency. We can also enforce modernizer issues.
Change-Id: I587ae6df660f122c63c983d663cce0b6c1b3bab9
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Fri, 28 Mar 2025 10:10:14 +0000 (11:10 +0100)]
Introduce raft.spi.ByteArray
This patch replaces uses of ByteSource with InputStreamProvider, of
which ByteArray is a convenient implementation. This eliminates the
need to use yangtools.concepts.Either, as we are expressing the two
options via class hierarchy.
JIRA: CONTROLLER-2134
Change-Id: I249f572fe8be0e64de796fdf7d48cbf2ba9b95c3
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
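As a rough illustration of expressing the two options via a class hierarchy instead of an Either, consider the sketch below. InputStreamProvider and ByteArray are names from the commit message, but the openStream() method, the FileStreamProvider class and the readAll() helper are assumptions and may differ from the real raft.spi types.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Assumed shape: a single method that opens a fresh stream on every call.
interface InputStreamProvider {
    InputStream openStream() throws IOException;
}

// In-memory flavour, standing in for raft.spi.ByteArray.
final class ByteArray implements InputStreamProvider {
    private final byte[] bytes;

    ByteArray(byte[] bytes) {
        this.bytes = bytes.clone();
    }

    @Override
    public InputStream openStream() {
        return new ByteArrayInputStream(bytes);
    }
}

// File-backed flavour (hypothetical name).
final class FileStreamProvider implements InputStreamProvider {
    private final Path file;

    FileStreamProvider(Path file) {
        this.file = file;
    }

    @Override
    public InputStream openStream() throws IOException {
        return Files.newInputStream(file);
    }
}

final class InputStreamProviders {
    private InputStreamProviders() {
        // utility class
    }

    // Callers do not branch on the storage kind: they just ask for a stream.
    static byte[] readAll(InputStreamProvider provider) {
        try (InputStream in = provider.openStream()) {
            return in.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The consumer depends only on the interface, so adding another source kind does not change call sites.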
Robert Varga [Thu, 27 Mar 2025 23:34:46 +0000 (00:34 +0100)]
Rehost controller.cluster.io
FileBackedOutputStream is used to receive AppendEntries, which
eventually land in the RAFT journal. No ODL downstream is using this
facility, so it is fair to say that this is part of RAFT SPI.
This allows us to change cds-access-api's dependency from
sal-clustering-commons to raft-api. This is quite natural for the
datastore: at some point we want to expose a CommitInfo which contains
an EntryInfo reference to when the transaction was committed.
JIRA: CONTROLLER-2134
Change-Id: I7458e17055affd56b736f85476b004cb29a2b1d2
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Wed, 26 Mar 2025 11:35:34 +0000 (12:35 +0100)]
Refactor SnapshotSource
Reduce the number of classes by introducing InputStreamProvider.
JIRA: CONTROLLER-2134
Change-Id: I261c714da7871aa8ed9e66524ee6783faaadec1f
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Tue, 25 Mar 2025 11:28:52 +0000 (12:28 +0100)]
Propagate SnapshotFileFormat to RaftStore
Each RaftStore needs to have a preferred file format. Hook it to
use-lz4-compression, hardcoding to 256KiB block size, just as we do when
we transfer to followers.
JIRA: CONTROLLER-1423
Change-Id: I7a59f386abc250fe7f813175650ad9374f4711f4
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Tue, 25 Mar 2025 10:25:46 +0000 (11:25 +0100)]
Remove InputOutputStreamFactory.lz4(String)
We do not have to use a String, just use the corresponding constant
directly. There is only one caller who can take ownership of the
corresponding code block.
Change-Id: Ie8bc8162f1cb47744e013fed1fee03f0af10fc03
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Tue, 25 Mar 2025 07:13:51 +0000 (08:13 +0100)]
Add RaftStorage.start()/stop()
Add internal thread pool and two methods to control it.
JIRA: CONTROLLER-2134
Change-Id: I87060447b86d7358f2f3cc5f9598168bd963058c
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
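An internal thread pool controlled by start()/stop(), as described above, might be structured like this. The class name LifecycleStorage, the pool size, the five-second shutdown timeout and isRunning() are all assumptions chosen for the sketch, not details of the real RaftStorage.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a storage component owning a worker pool.
final class LifecycleStorage {
    private ExecutorService executor;

    // Allocates the pool; starting twice is treated as a programming error.
    synchronized void start() {
        if (executor != null) {
            throw new IllegalStateException("already started");
        }
        executor = Executors.newFixedThreadPool(2);
    }

    synchronized boolean isRunning() {
        return executor != null;
    }

    // Drains queued work, forcing termination if it takes too long. Idempotent.
    synchronized void stop() {
        if (executor == null) {
            return;
        }
        executor.shutdown();
        try {
            if (!executor.awaitTermination(5, TimeUnit.SECONDS)) {
                executor.shutdownNow();
            }
        } catch (InterruptedException e) {
            executor.shutdownNow();
            Thread.currentThread().interrupt();
        } finally {
            executor = null;
        }
    }
}
```

Keeping the pool private means the owner (RaftActor, per the related "Propagate start/stop to RaftStorage" change) only needs to call the two lifecycle methods.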
Robert Varga [Tue, 25 Mar 2025 05:46:20 +0000 (06:46 +0100)]
Improve ShardManagerInfo
Define TargetBehavior to offload type mapping to JMX.
Change-Id: If7d39639ae90aaca18807dfb116c795d5b047d18
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Tue, 25 Mar 2025 05:33:50 +0000 (06:33 +0100)]
Document default backup-datastore timeout
We have an implementation-specific default of 60 seconds, let's make
sure it is captured in the YANG model.
Change-Id: I66020e33c73c770cbed8bf9a6a0c592bd272c5b2
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Tue, 25 Mar 2025 06:34:23 +0000 (07:34 +0100)]
Move sal-akka-raft
We have a raft/ top-level directory, move sal-akka-raft there. Also
switch it from using mdsal-parent to using bundle-parent.
Change-Id: Idb597dafa423723443a5e6e344e9f2b06c8b1410
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Tue, 25 Mar 2025 03:00:07 +0000 (04:00 +0100)]
Rename findLatestSnapshot()
We really should be adding the interface to DataPersistenceProvider and
a 'tryLatestSnapshot()' is a better name.
JIRA: CONTROLLER-2134
Change-Id: I65af11051ff4cc473053186a7226ff6d052b98ac
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Tue, 25 Mar 2025 05:29:56 +0000 (06:29 +0100)]
Add SnapshotFileFormat
This is a useful utility, which we will use to build our
LocalRaftStorage.
Change-Id: I32027581f8eb55da435310aa35e0bbac7ab4436e
JIRA: CONTROLLER-2134
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Mon, 24 Mar 2025 15:40:53 +0000 (16:40 +0100)]
Add DataPersistenceProvider guidance
We will need to evolve this contract a bit, add more specific guidance.
JIRA: CONTROLLER-2134
Change-Id: Id1fcfb50fe96e7adfa8df089ac8dc9622240b770
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Tue, 25 Mar 2025 00:47:35 +0000 (01:47 +0100)]
Move TermInfo to o.o.raft.api
This is a natural raft-api thing. Move it there and cover it with tests.
Change-Id: If4dc97b144bedbbad81c21ba6fe795b24078e9eb
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Mon, 24 Mar 2025 23:37:59 +0000 (00:37 +0100)]
Promote (Immutable)RaftEntryMeta
These two constructs are raft.api material. Promote these two as:
- raft.api.EntryMeta as a replacement for RaftEntryMeta
- raft.api.EntryInfo as a replacement for ImmutableRaftEntryMeta
Change-Id: I93908e29f11ffad3342da7dc0f4a24678cecef35
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Mon, 24 Mar 2025 21:11:50 +0000 (22:11 +0100)]
Split out raft-spi
We have a nice set of classes that could be more widely used. Let's
split them out. This has the neat benefit of making lz4-java an
implementation detail hidden from the outside world.
We also introduce odl-lz4 to package lz4-java independently of
everything else.
Change-Id: I3434bac53dba6935c022e9ab79fc40eead01e40b
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Mon, 24 Mar 2025 18:24:58 +0000 (19:24 +0100)]
Split out raft-api
We now have RaftRole to start a low-level RAFT API artifact. Introduce
raft-api, which holds RaftRole and its corresponding ServerRole. Use it
to improve type safety of ShardStatsMXBean.
Change-Id: I99eda413a9cac4c7fbae036512432a97630ef28c
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Mon, 24 Mar 2025 13:45:00 +0000 (14:45 +0100)]
Alpha-sort sal-akka-raft dependencies
Previous reformat has missed these, fix it up.
Change-Id: I54961bf1e42415974235d8e50fd539dea633433d
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Mon, 24 Mar 2025 10:07:55 +0000 (11:07 +0100)]
Add SnapshotSource
We will need to deal with snapshots being available for reading in
multiple formats. This patch adds the concept of a SnapshotSource, with
two formats: plain and LZ4.
This ends up being a framework, but that is completely fine, as it is
transparent and provides a fixed amount of functionality.
We plug it into RaftStorage, as that is where it is going to be needed.
The two RaftStorage implementations run a no-op, with TODOs for later
implementation.
JIRA: CONTROLLER-2134
Change-Id: I99a8a967e6d08da681fa45082927b3039a523a49
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Mon, 24 Mar 2025 13:13:28 +0000 (14:13 +0100)]
Enforce sal-akka-raft dependencies
We have squeaky-clean dependencies, make sure they stay that way.
Change-Id: Icfedbdd3c037b35edbef87c70e4466b3da09d681
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Mon, 24 Mar 2025 13:10:11 +0000 (14:10 +0100)]
Remove broken Export-Package declaration
This is a day-zero bug: 'Export-Package' is mis-spelled as
'Export-package' and this is ineffective.
Remove the declaration and provide documentation for the remaining
DynamicImport-Package.
Change-Id: I807768a908747bc13bb51cea7fd7c4cb50f3fe6b
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Mon, 24 Mar 2025 13:06:48 +0000 (14:06 +0100)]
Remove slf4j-simple dependency
This dependency is always added by our parent pom, hence there is no
point in repeating it here. Also reformat pom.xml to follow style we use
in most places.
Change-Id: I13d071fd2799bfb2256e6c8831ddbe2fb4e08648
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Mon, 24 Mar 2025 10:54:07 +0000 (11:54 +0100)]
Require memberId() for RaftStorage
We need a consistent way of logging, make sure RaftStorage has it.
JIRA: CONTROLLER-2134
Change-Id: I67ca04c5c7e45358276775285ea05fd58fb7ce3e
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sun, 23 Mar 2025 18:50:08 +0000 (19:50 +0100)]
Factor out RaftStorage
We are in a place where we can start pulling the persistence apart. This
takes the first step by introducing RaftStorage and dropping a number of
FIXME for future evolution.
JIRA: CONTROLLER-2134
Change-Id: I658ce51cff971e39dc53b8f25e42123ad5d78b3e
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sun, 23 Mar 2025 15:43:31 +0000 (16:43 +0100)]
Reintroduce ForwardingDataPersistenceProvider
This is a neat utility which allows us to sit in front of persistence,
forwarding to it -- without exposing our real implementations.
While we are here, also improve testApplyStateRace by scheduling
callback invocation after the delegate is done with it.
JIRA: CONTROLLER-2134
Change-Id: I3fb4a423a1302a568bc5eecb77f26d2b5d444a4f
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
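The forwarding idea described above can be sketched as follows. The two-method DataPersistenceProvider shown here is a heavily reduced, hypothetical contract (the persist() signature in particular is invented), and InMemoryProvider and CountingProvider exist only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, heavily reduced contract; the real interface is larger.
interface DataPersistenceProvider {
    boolean isRecoveryApplicable();

    void persist(String entry, Runnable callback);
}

// Sits in front of the real implementation and forwards every call to it.
abstract class ForwardingDataPersistenceProvider implements DataPersistenceProvider {
    abstract DataPersistenceProvider delegate();

    @Override
    public boolean isRecoveryApplicable() {
        return delegate().isRecoveryApplicable();
    }

    @Override
    public void persist(String entry, Runnable callback) {
        delegate().persist(entry, callback);
    }
}

// Trivial backend used only to make the sketch runnable.
final class InMemoryProvider implements DataPersistenceProvider {
    final List<String> entries = new ArrayList<>();

    @Override
    public boolean isRecoveryApplicable() {
        return false;
    }

    @Override
    public void persist(String entry, Runnable callback) {
        entries.add(entry);
        callback.run();
    }
}

// Example interceptor: counts persist() calls without exposing the backend type.
final class CountingProvider extends ForwardingDataPersistenceProvider {
    private final DataPersistenceProvider delegate;
    int persistCalls;

    CountingProvider(DataPersistenceProvider delegate) {
        this.delegate = delegate;
    }

    @Override
    DataPersistenceProvider delegate() {
        return delegate;
    }

    @Override
    public void persist(String entry, Runnable callback) {
        persistCalls++;
        super.persist(entry, callback);
    }
}
```

A test can wrap the callback in the same way, for instance to schedule it after the delegate has finished, which is the kind of adjustment the testApplyStateRace note refers to.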
Robert Varga [Sun, 23 Mar 2025 13:24:30 +0000 (14:24 +0100)]
Clean up OnDemandRaftState
Use RaftRole and fix the builder pattern to be properly immutable.
Change-Id: I4630a5d920d545ca615e72dd7a0a8a017e8b517a
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sun, 23 Mar 2025 05:34:10 +0000 (06:34 +0100)]
Promote RaftState
Let's start a new package, opendaylight.raft.api, which holds a single
enumeration for now -- RaftRole. The change in name frees up 'RaftState'
for use by something that actually has state.
This also shows us that we have a bug in Example actor -- using an
illegal cast in case of IsolatedLeader.
JIRA: CONTROLLER-2134
Change-Id: I50b25b4d13cf426285face63ed8be2a8356ab83d
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sun, 23 Mar 2025 04:19:48 +0000 (05:19 +0100)]
Move cluster.notifications
Let's rehost these into sal-akka-raft for now, as that is where they are
used from. This allows us to tie 'role' with 'RaftState', leading to
improved type safety.
JIRA: CONTROLLER-2134
Change-Id: Icb4968774d89517486c50bff95393e8c68b8265b
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sun, 23 Mar 2025 04:06:30 +0000 (05:06 +0100)]
Modernize RoleChanged/LeaderStateChanged
These are pure DTOs, use a record for that. While we're here, also make
sure to require memberId()/newRole(). The situation is a bit
complicated, because we use subclassing to carry more data.
JIRA: CONTROLLER-2134
Change-Id: I37a486dd75c1a313161908cfb287e61e18a5a401
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
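A pure DTO with required components maps onto a Java record with a compact constructor, roughly as sketched below. The Role enum, the oldRole component and its nullability are assumptions; only memberId() and newRole() are named in the commit message. Since records are implicitly final, any message that needs to carry more data has to be modelled differently than by subclassing a record.

```java
import static java.util.Objects.requireNonNull;

// Hypothetical role enumeration used only by this sketch.
enum Role {
    CANDIDATE, FOLLOWER, LEADER
}

// Sketch of a record-based DTO: accessors, equals(), hashCode() and toString()
// are generated, while the compact constructor enforces the required components.
record RoleChanged(String memberId, Role newRole, Role oldRole) {
    RoleChanged {
        requireNonNull(memberId, "memberId");
        requireNonNull(newRole, "newRole");
        // Assumption: oldRole may be null when there is no previous role.
    }
}
```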
Robert Varga [Sun, 23 Mar 2025 02:36:40 +0000 (03:36 +0100)]
Remove RaftActor.currentTerm()
This method is not used anywhere, remove it.
JIRA: CONTROLLER-2134
Change-Id: I1ce49c91655086470f8c083629051bf5c8dbef40
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sun, 23 Mar 2025 02:26:58 +0000 (03:26 +0100)]
Introduce resetReplicatedLog()
A ton of our tests rely on replacing ReplicatedLog. This is a
huge no-no, which is unfortunately exposed via RaftActorContext.
This patch splits off resetReplicatedLog(), which does the same thing,
except it has a different name and is amenable to being implemented
as a set of operations rather than a wholesale replacement.
JIRA: CONTROLLER-2137
Change-Id: I4774564b092fadbbd4f0f4a57f3d13e1234ec376
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sun, 23 Mar 2025 01:35:51 +0000 (02:35 +0100)]
Remove RaftActorContext.setCommitIndex()
Mass-migrate tests and eliminate the legacy method.
JIRA: CONTROLLER-2137
Change-Id: I6035aa1edf546d8d5ae519e5db3edd83ca632624
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sun, 23 Mar 2025 01:22:32 +0000 (02:22 +0100)]
Remove RaftActorContext.setLastApplied()
Mass-migrate tests and eliminate the legacy method.
JIRA: CONTROLLER-2137
Change-Id: I194b70975fe1cca97f3a81163410247d7a76e3c5
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sun, 23 Mar 2025 00:38:19 +0000 (01:38 +0100)]
Remove RaftActorContext compatibility getters
Mass-migrate tests to use ReplicatedLog for commitIndex/lastApplied.
JIRA: CONTROLLER-2137
Change-Id: I9f30b3afb0b7df05ef6238944f79fcba050e1c3d
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 22:15:18 +0000 (23:15 +0100)]
Do not reset ReplicatedLog
Replacing ReplicatedLog is a rather bad thing, as we need to be mindful
when we can and cannot cache it.
This patch takes the first step towards making it an invariant:
introduce a resetToSnapshot() method which resets the state and take
advantage of that.
We also deprecate setReplicatedLog() for removal.
JIRA: CONTROLLER-2137
Change-Id: Ifc9ce0b0910214d5d0f314f26623fc4583fd1849
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 22:38:32 +0000 (23:38 +0100)]
Reduce use of ReplicatedLog.last()
We have a few call sites which are accessing the entry, whereas they
only need its metadata. Update them to not call last().
JIRA: CONTROLLER-2137
Change-Id: If13079a9a7d95cffb70a0befb877d8dcfb859fcd
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 20:34:36 +0000 (21:34 +0100)]
Clean up replicatedLog() references
This is a follow-up for the previous patch, cleaning up references to
deprecated methods and simplifying replicated log references.
JIRA: CONTROLLER-2137
Change-Id: Id95123119a11e7571f62c45a081fbb018a288f91
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 21:57:50 +0000 (22:57 +0100)]
Lock down AbstractReplicatedLog
Clean up this class and make most methods final.
Change-Id: I65eb72794131f32d3a4c524bbafd2ea1bf64726f
JIRA: CONTROLLER-2137
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 18:55:34 +0000 (19:55 +0100)]
Move commitIndex/lastApplied to ReplicatedLog
We have maintenance split between ReplicatedLog and RaftActorContext.
Move commitIndex and lastApplied to ReplicatedLog. This requires a bit
of shuffling in tests, as replacing the log would lose the changes made.
JIRA: CONTROLLER-2137
Change-Id: I6a6f4bb01abc2a8fb2da47110ae73c4d4ff1d9dc
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 18:26:50 +0000 (19:26 +0100)]
Reduce context.getCurrentBehavior() callers
RaftActor has a method to talk to RaftActorContext, use that instead of
direct calls.
JIRA: CONTROLLER-2134
Change-Id: Ib3ca80d6dacbb636f26f322ef99c5414a5a453e9
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 17:45:44 +0000 (18:45 +0100)]
Simplify AbstractReplicatedLog constructor
The constructor has a ton of arguments for a single caller, really. Move
the code into ReplicatedLogImpl.
JIRA: CONTROLLER-2134
Change-Id: Ia1184fe5d129ae72df6e777e6531371dd23b410a
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 17:17:19 +0000 (18:17 +0100)]
Shorten call to isRecoveryApplicable()
We do not need to go through RaftActorContext, as RaftActor has the
information available.
JIRA: CONTROLLER-2134
Change-Id: I0e7f71ad7819dddfe5b2e25cf2e77ded1161375d
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 17:01:52 +0000 (18:01 +0100)]
Update RaftActorContext.getCluster()
Use a nullable return instead of an optional.
Change-Id: I74f34c440e9364dee900b41fa39a96a3b31222c3
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 16:54:43 +0000 (17:54 +0100)]
Remove RaftActorContext.actorSelection()
This method is used only internally, as RaftActor can access its own
ActorContext. We also lock down a few methods and remove duplicate
implementation.
Change-Id: I08546d4f11bdb2ff62f6ab5c37957368b658c90b
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 14:20:05 +0000 (15:20 +0100)]
Move SendHeartBeat
This message is used internally by AbstractLeader, move it there for
clarity. Also clean up message dispatch.
JIRA: CONTROLLER-2134
Change-Id: Ie18c478c24136c09910e887382b683fa5ecd3105
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 14:01:48 +0000 (15:01 +0100)]
Hide SnapshotComplete
This is a purely-internal message. Hide in SnapshotManager to prevent
shenanigans.
JIRA: CONTROLLER-2134
Change-Id: Ia3fbd3d6a76cc4c4b49ea99c93fc8545a793918f
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 13:55:29 +0000 (14:55 +0100)]
Expose AbstractLeader.sendInstallSnapshot()
Do not abuse handleMessage() when dispatching from ShardManager to
AbstractLeader, but call the target method directly. This allows us to
completely hide SnapshotBytes -- and rename them back to SnapshotHolder.
JIRA: CONTROLLER-2136
Change-Id: I34962cc2060603ce1245581d2cd9727590f91e73
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 13:43:31 +0000 (14:43 +0100)]
Improve leader check
Comparing memberId() with getLeaderId() is really a check to see if we
are a leader. Replace it with an instanceof check.
JIRA: CONTROLLER-2134
Change-Id: Iabc79d7b80961588d1b449c1138063bf01f6c5a5
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 12:57:41 +0000 (13:57 +0100)]
Move SnapshotBytes propagation
We no longer need a Snapshot, hence we can move the propagation logic to
the single method which is invoking it.
JIRA: CONTROLLER-2134
Change-Id: I87544711d1a69825fa920f23c2893d58f757c753
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 12:48:24 +0000 (13:48 +0100)]
Refactor SendInstallSnapshot
SendInstallSnapshot contains a Snapshot, from which we only extract last
applied index/term to instantiate SnapshotHolder.
Merge SendInstallSnapshot and SnapshotHolder into a replacement class
called SnapshotBytes and adjust callers accordingly -- making the
contract simpler and easier to test.
JIRA: CONTROLLER-2134
Change-Id: I3d23176dda2322595679b4854cb8dabfb8d480f7
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 12:13:02 +0000 (13:13 +0100)]
Do not use Optional for snapshotHolder
We are in complete control of the lifecycle here, so Optional does not
provide any benefit. Ditch it and use a simple field.
JIRA: CONTROLLER-2134
Change-Id: I199a7d21fbec3adec9092024c6e5ca05376a5b9a
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 11:48:15 +0000 (12:48 +0100)]
Clean up SnapshotHolder
The holder does not need a full snapshot and can be a simple record.
Also refactor test methods to hide it.
JIRA: CONTROLLER-2134
Change-Id: I84de057724003f844f0cb20aa60a32228298836c
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 11:31:33 +0000 (12:31 +0100)]
Move SendInstallSnapshot
This class is used only by AbstractLeader, move it there and modernize
it.
JIRA: CONTROLLER-2134
Change-Id: I09282c69b029b3b7f52ad6024aafbf656b4dfa84
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 10:50:13 +0000 (11:50 +0100)]
Remove ReplicatedLogImpl.create()
Turn the two methods into simple constructors, updating callers.
JIRA: CONTROLLER-2134
Change-Id: I0d3a7836ae469363d7953eaffffe24d384eae3ad
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 10:44:20 +0000 (11:44 +0100)]
Annotate SnapshotManager.memberId()
Returned String cannot be null, annotate that fact.
Change-Id: Ic7148a054309df557b99267f5cff74a21f548abe
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 10:37:47 +0000 (11:37 +0100)]
Inline memberId in CandidateTest
We have a few stray callers to RaftActorContext.getId(), inline the
"candidate" memberId.
JIRA: CONTROLLER-2134
Change-Id: I33220c222e0eacb300cb5306e0a218068f57bda6
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 10:32:05 +0000 (11:32 +0100)]
Inline FollowerTest memberId checks
We are using RaftActorContext.get() to arrive at the constant "follower"
-- inline it instead.
JIRA: CONTROLLER-2134
Change-Id: If427fe33fb5b4e5ffb8032d6647223e05d16c5ee
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 10:27:13 +0000 (11:27 +0100)]
Rename RaftActorBehavior.getId()
We are using memberId() in other places, let's make sure to be
consistent. We also update callers of RaftActorContext.getId() to use
this method where possible.
JIRA: CONTROLLER-2134
Change-Id: If9afee1c546a60073e182f1af23a67a0c1e77fa9
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 10:26:11 +0000 (11:26 +0100)]
Clean up CandidateTest a bit
MessageCollectorActor.expectFirstMatching() calls can be a single-line
affair, do that.
Change-Id: I4fd6e68225d4e8d5a137212473c7df339241a5f0
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 10:01:21 +0000 (11:01 +0100)]
Reduce calls to RaftActorContext.getId()
RaftActorRecoverySupport has LocalAccess, hence we can get memberId from
there.
RaftActorServerConfigurationSupport has RaftActor, so let's get memberId
from there.
JIRA: CONTROLLER-2134
Change-Id: I78da1043ee8e52cac987fdbc66d5b3bdc1073a51
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 09:07:25 +0000 (10:07 +0100)]
Remove Snapshot.getElection{Term,VotedFor}
These methods are not used anywhere, remove them.
Change-Id: I5e0cf0d1979a7792651b6721453cdac6073b5b01
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 08:57:45 +0000 (09:57 +0100)]
Modernize CaptureSnapshot
Use List.of() and make the class final.
Change-Id: I3b23bf4d0ef600ce6117e1b0bdd51e607df95c27
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 02:41:34 +0000 (03:41 +0100)]
Remove SnapshotManager.computeLastAppliedEntry()
This method has a single caller, inline it.
JIRA: CONTROLLER-2134
Change-Id: Ife65ee3cdd3415476004d7b950801206e825f5f3
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Fri, 21 Mar 2025 19:31:41 +0000 (20:31 +0100)]
Split up computeLastAppliedEntry()
There are two distinct cases here: we either have followers or we do
not. Split the two cases into separate methods.
JIRA: CONTROLLER-2134
Change-Id: Ice6247f29f30ca7382d4c6c95a3ff40b9c486585
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Fri, 21 Mar 2025 18:48:28 +0000 (19:48 +0100)]
Switch AbstractReplicatedLogTest to JUnit5
This is trivial conversion.
Change-Id: I76a065c7a6de9d9478d9d6ddc47eb0c8b054d719
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Fri, 21 Mar 2025 18:40:36 +0000 (19:40 +0100)]
Split out MockReplicatedLog
This class deserves to be outside of the test class, simplifying its
name.
JIRA: CONTROLLER-2134
Change-Id: I39052e6f4b6cf05dddbc025b2898239a1ff2d24d
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Fri, 21 Mar 2025 18:37:53 +0000 (19:37 +0100)]
Move SnapshotManager.computeLastAppliedEntry()
This method is tightly coupled to a ReplicatedLog and independent of
SnapshotManager. Move it to AbstractReplicatedLog for further evolution.
JIRA: CONTROLLER-2134
Change-Id: Ic87e4896094c166ff53a62c0d56eb1a0fbf38b1f
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Fri, 21 Mar 2025 17:12:48 +0000 (18:12 +0100)]
Modernize FollowerInitialSyncUpStatus
This message should be just a plain record. Also clean up the
SyncStatusTracker internals to improve the logic, coupled with a
long-overdue clean up of SyncStatusTrackerTest.
Change-Id: If0cc3c838e433ba87fe614327f289ae2b0aa39ac
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Fri, 21 Mar 2025 16:45:24 +0000 (17:45 +0100)]
Hide SyncStatusTracker
This class is only used by Follower, hence it does not need to be
exposed outside of the package.
Change-Id: I9e770e2c4b9fdc78d44aef8015c5e42d43f5f0dd
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Fri, 21 Mar 2025 16:36:06 +0000 (17:36 +0100)]
Lock down Follower
Do not allow overriding methods other than those the tests need.
Change-Id: I0ab961116d1ee3be82060a821363f1905d51ff32
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Fri, 21 Mar 2025 16:29:03 +0000 (17:29 +0100)]
Improve AbstractLeader class hierarchy
AbstractLeader.handleAppendEntriesReply() always results in 'this' being
returned and is overridden in the three subclasses.
Rename it to processAppendEntriesReply(), without the ability to change
behavior and make it final. The subclasses then use it as a common
utility, doing their own thing as needed.
Since Leader is subclassed in tests, we lock down all its methods except
the single one that is being overridden.
Change-Id: I594bebedfa612f7e946d97040ea55c053c2c9f3d
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
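The hierarchy change described above (a final shared utility that cannot change behavior, with subclasses deciding the resulting behavior) might look like the sketch below. The String follower identifier, the reply counter and the PreLeader-to-Leader transition condition are invented for illustration; only the class and method names come from the commit message.

```java
// Hypothetical sketch of the shape only; the real classes carry far more state.
abstract class AbstractLeader {
    private int repliesProcessed;

    // Shared bookkeeping: final, and returns nothing, so it cannot switch behavior.
    final void processAppendEntriesReply(String followerId) {
        repliesProcessed++;
    }

    final int repliesProcessed() {
        return repliesProcessed;
    }

    // Each subclass decides which behavior results from the reply.
    abstract AbstractLeader handleAppendEntriesReply(String followerId);
}

// Not final, because the commit message notes Leader is subclassed in tests.
class Leader extends AbstractLeader {
    @Override
    AbstractLeader handleAppendEntriesReply(String followerId) {
        processAppendEntriesReply(followerId);
        return this;
    }
}

final class PreLeader extends AbstractLeader {
    @Override
    AbstractLeader handleAppendEntriesReply(String followerId) {
        processAppendEntriesReply(followerId);
        // Invented condition: promote after the first reply has been processed.
        return new Leader();
    }
}
```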
Robert Varga [Fri, 21 Mar 2025 16:08:03 +0000 (17:08 +0100)]
Improve sal-akka-raft test assertions
We have a number of assertions which just check RaftState value -- but
we really should be checking which behaviour we are servicing.
Change-Id: If2a9d898b0e237e7c1886d298c4584b94149f65e
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Fri, 21 Mar 2025 15:29:36 +0000 (16:29 +0100)]
Refactor SwitchBehavior
We only support switching to leader or to follower, whereas our messages
allow for any RaftState -- leading to us needing to place explicit
guards.
This patch makes SwitchBehavior a sealed interface with two possible
specializations: BecomeFollower and BecomeLeader. The result is more
expressive code without the need for RaftActorBehavior.createBehavior()
dispatch.
Change-Id: Ib9da61012f1814ef6786992aa2ddbe1d2c72040a
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
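A sealed SwitchBehavior with exactly two specializations can be sketched as below (Java 17 or later). The newTerm() component and the describe() helper are assumptions for illustration; SwitchBehavior, BecomeFollower and BecomeLeader are the names given in the commit message.

```java
// Only the two permitted messages can exist, so no guard against other roles is needed.
sealed interface SwitchBehavior permits BecomeFollower, BecomeLeader {
    long newTerm();
}

record BecomeFollower(long newTerm) implements SwitchBehavior {}

record BecomeLeader(long newTerm) implements SwitchBehavior {}

final class SwitchBehaviorDemo {
    private SwitchBehaviorDemo() {
        // utility class
    }

    // Dispatches on the concrete message type instead of on a role enumeration.
    static String describe(SwitchBehavior message) {
        if (message instanceof BecomeFollower follower) {
            return "follower in term " + follower.newTerm();
        }
        if (message instanceof BecomeLeader leader) {
            return "leader in term " + leader.newTerm();
        }
        // Unreachable while the permits clause lists only the two records above.
        throw new IllegalStateException("unexpected message " + message);
    }
}
```

On Java 21 or later, a pattern-matching switch over the sealed type would let the compiler verify exhaustiveness and make the trailing throw unnecessary.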
Robert Varga [Fri, 21 Mar 2025 14:41:42 +0000 (15:41 +0100)]
Reduce use of getRaftActorContext()
Internal use here is just plain clutter. Fix that.
Change-Id: I826424aac10f00f1162e5909f579238044b46985
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Fri, 21 Mar 2025 14:34:31 +0000 (15:34 +0100)]
Hide RaftActorBehavior.switchBehavior()
Eliminate indirection through RaftState, allowing us to use a single
method for switching behaviors.
Change-Id: I90ff03ed1dc6f82d8b1d917ff6c7ab97cd474ff1
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Fri, 21 Mar 2025 14:04:31 +0000 (15:04 +0100)]
Clean up Follower reinitialization
Use an instanceof pattern to talk directly to Follower to obtain a copy
-- making it a tad cleaner.
Change-Id: I34213d4d690a6bdce96febd6e2f51d77ee14effd
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Fri, 21 Mar 2025 13:49:47 +0000 (14:49 +0100)]
Lock down behaviors
A number of improvements:
- hide constructors
- enforce leader transitions
- make IsolatedLeader/PreLeader final
Change-Id: I54101d56af4a7b99051d14444c01d39bf2c35c6f
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Fri, 21 Mar 2025 12:54:53 +0000 (13:54 +0100)]
Button down ShardSnapshotCohort
The output stream is now always non-null: button this down, simplifying
things a little bit.
JIRA: CONTROLLER-2134
Change-Id: I339c13742e4b7e02b353c2a5fd8d879deb28bb29
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Fri, 21 Mar 2025 10:39:21 +0000 (11:39 +0100)]
Remove SnapshotManager.convertSnapshot()
This is a useless indirection, just inline it into the single caller.
JIRA: CONTROLLER-2134
Change-Id: I87830a1e57623c630de037c8bfe345792c33b398
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Thu, 20 Mar 2025 21:35:36 +0000 (22:35 +0100)]
Require output stream
Make sure we specialize createSnapshot() to be the alternative to
takeSnapshot().
Change-Id: I75caf958c6bf2969dc2fd604508b08cbc1e670a4
JIRA: CONTROLLER-2134
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Thu, 20 Mar 2025 15:10:49 +0000 (16:10 +0100)]
Use takeSnapshot() in normal stream captures
We only need asynchronous callout during captureToInstall(), otherwise
we can use the much simpler interface, which does not need a turnaround
through CaptureSnapshotReply.
This necessitates a bit of an update to the test suites, where we switch
from capture() to captureToInstall() to catch the negative scenarios.
Integration-level tests need to be updated as well: since we now invoke
persist() directly, we need to fiddle with SaveSnapshotSuccess instead
of CaptureSnapshotReply.
JIRA: CONTROLLER-2134
Change-Id: I18ffb7c062f52474d5df3c581d997072dac04e0e
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Fri, 21 Mar 2025 10:07:57 +0000 (11:07 +0100)]
Fix ShardTest capture trigger
The test here operates on a live actor, but it invokes
ShardManager.capture() outside of actor confinement. This means that if
any processing in ShardManager decides to send a message to the actor,
the actor will process asynchronously.
Fix this by tickling ShardManager via executeInSelf().
Change-Id: I6a26d98b14ee2d4101f8b036a573baa41d5148ee
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Thu, 20 Mar 2025 16:48:12 +0000 (17:48 +0100)]
Override takeSnapshot() in mocks
We have two specialized mocks, which override default forwarding
behaviour of MockRaftActor.createSnapshot(). This adds overrides of
takeSnapshot(), so that the behaviour matches.
JIRA: CONTROLLER-2134
Change-Id: Idde23e1ba42d98b7f228546b25b03063ba3998ba
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Thu, 20 Mar 2025 16:45:39 +0000 (17:45 +0100)]
Clean up whitespace
We have a superfluous empty line and a space -- remove both.
Change-Id: Ib915c8ddab13eaee39ccc0252b26eb711aa41664
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>