Robert Varga [Tue, 25 Mar 2025 11:28:52 +0000 (12:28 +0100)]
Propagate SnapshotFileFormat to RaftStore
Each RaftStore needs to have a preferred file format. Hook it to
use-lz4-compression, hardcoding to 256KiB block size, just as we do when
we transfer to followers.
JIRA: CONTROLLER-1423
Change-Id: I7a59f386abc250fe7f813175650ad9374f4711f4
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Tue, 25 Mar 2025 10:25:46 +0000 (11:25 +0100)]
Remove InputOutputStreamFactory.lz4(String)
We do not have to use a String, just use the corresponding constant
directly. There is only one caller who can take ownership of the
corresponding code block.
Change-Id: Ie8bc8162f1cb47744e013fed1fee03f0af10fc03
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Tue, 25 Mar 2025 07:13:51 +0000 (08:13 +0100)]
Add RaftStorage.start()/stop()
Add internal thread pool and two methods to control it.
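A minimal sketch of what such a lifecycle could look like. Only the start()/stop() method names come from this message; the class name StorageLifecycleSketch, the pool size and the isRunning() helper are hypothetical and do not reflect the actual RaftStorage code.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch, not the actual RaftStorage implementation.
final class StorageLifecycleSketch {
    private ExecutorService executor; // null while stopped

    synchronized void start() {
        if (executor != null) {
            throw new IllegalStateException("already started");
        }
        // A pool size of 2 is an arbitrary assumption for illustration.
        executor = Executors.newFixedThreadPool(2);
    }

    synchronized void stop() {
        if (executor != null) {
            // Orderly shutdown: no new tasks are accepted, submitted tasks still run.
            executor.shutdown();
            executor = null;
        }
    }

    synchronized boolean isRunning() {
        return executor != null;
    }
}
```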
JIRA: CONTROLLER-2134
Change-Id: I87060447b86d7358f2f3cc5f9598168bd963058c
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Tue, 25 Mar 2025 05:46:20 +0000 (06:46 +0100)]
Improve ShardManagerInfo
Define TargetBehavior to offload type mapping to JMX.
Change-Id: If7d39639ae90aaca18807dfb116c795d5b047d18
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Tue, 25 Mar 2025 05:33:50 +0000 (06:33 +0100)]
Document default backup-datastore timeout
We have an implementation-specific default of 60 seconds, let's make
sure it is captured in the YANG model.
Change-Id: I66020e33c73c770cbed8bf9a6a0c592bd272c5b2
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Tue, 25 Mar 2025 06:34:23 +0000 (07:34 +0100)]
Move sal-akka-raft
We have a raft/ top-level directory, move sal-akka-raft there. Also
switch it from using mdsal-parent to using bundle-parent.
Change-Id: Idb597dafa423723443a5e6e344e9f2b06c8b1410
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Tue, 25 Mar 2025 03:00:07 +0000 (04:00 +0100)]
Rename findLatestSnapshot()
We really should be adding the interface to DataPersistenceProvider and
a 'tryLatestSnapshot()' is a better name.
JIRA: CONTROLLER-2134
Change-Id: I65af11051ff4cc473053186a7226ff6d052b98ac
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Tue, 25 Mar 2025 05:29:56 +0000 (06:29 +0100)]
Add SnapshotFileFormat
This is a useful utility, which we will use to build our
LocalRaftStorage.
Change-Id: I32027581f8eb55da435310aa35e0bbac7ab4436e
JIRA: CONTROLLER-2134
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Mon, 24 Mar 2025 15:40:53 +0000 (16:40 +0100)]
Add DataPersistenceProvider guidance
We will need to evolve this contract a bit, add more specific guidance.
JIRA: CONTROLLER-2134
Change-Id: Id1fcfb50fe96e7adfa8df089ac8dc9622240b770
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Tue, 25 Mar 2025 00:47:35 +0000 (01:47 +0100)]
Move TermInfo to o.o.raft.api
This is a natural raft-api thing. Move it there and cover it with tests.
Change-Id: If4dc97b144bedbbad81c21ba6fe795b24078e9eb
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Mon, 24 Mar 2025 23:37:59 +0000 (00:37 +0100)]
Promote (Immutable)RaftEntryMeta
These two constructs are raft.api material. Promote these two as:
- raft.api.EntryMeta as a replacement for RaftEntryMeta
- raft.api.EntryInfo as a replacement for ImmutableRaftEntryMeta
Change-Id: I93908e29f11ffad3342da7dc0f4a24678cecef35
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Mon, 24 Mar 2025 21:11:50 +0000 (22:11 +0100)]
Split out raft-spi
We have a nice set of classes that could be more widely used. Let's
split them out. This has the neat benefit of making lz4-java an
implementation detail hidden from the outside world.
We also introduce odl-lz4 to package lz4-java independently of
everything else.
Change-Id: I3434bac53dba6935c022e9ab79fc40eead01e40b
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Mon, 24 Mar 2025 18:24:58 +0000 (19:24 +0100)]
Split out raft-api
We now have RaftRole to start a low-level RAFT API artifact. Introduce
raft-api, which holds RaftRole and its corresponding ServerRole. Use it
to improve type safety of ShardStatsMXBean.
Change-Id: I99eda413a9cac4c7fbae036512432a97630ef28c
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Mon, 24 Mar 2025 13:45:00 +0000 (14:45 +0100)]
Alpha-sort sal-akka-raft dependencies
Previous reformat has missed these, fix it up.
Change-Id: I54961bf1e42415974235d8e50fd539dea633433d
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Mon, 24 Mar 2025 10:07:55 +0000 (11:07 +0100)]
Add SnapshotSource
We will need to deal with snapshots being available for reading in
multiple formats. This patch adds the concept of a SnapshotSource, with
two formats: plain and LZ4.
This ends up being a framework, but that is completely fine, as it is
transparent and provides a fixed amount of functionality.
We plug it into RaftStorage, as that is where it is going to be needed.
The two RaftStorage implementations run a no-op, with TODOs for later
implementation.
JIRA: CONTROLLER-2134
Change-Id: I99a8a967e6d08da681fa45082927b3039a523a49
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Mon, 24 Mar 2025 13:13:28 +0000 (14:13 +0100)]
Enforce sal-akka-raft dependencies
We have squeaky-clean dependencies, make sure they stay that way.
Change-Id: Icfedbdd3c037b35edbef87c70e4466b3da09d681
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Mon, 24 Mar 2025 13:10:11 +0000 (14:10 +0100)]
Remove broken Export-Package declaration
This is a day-zero bug: 'Export-Package' is mis-spelled as
'Export-package' and this is ineffective.
Remove the declaration and provide documentation for the remaining
DynamicImport-Package.
Change-Id: I807768a908747bc13bb51cea7fd7c4cb50f3fe6b
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Mon, 24 Mar 2025 13:06:48 +0000 (14:06 +0100)]
Remove slf4j-simple dependency
This dependency is always added by our parent pom, hence there is no
point in repeating it here. Also reformat pom.xml to follow style we use
in most places.
Change-Id: I13d071fd2799bfb2256e6c8831ddbe2fb4e08648
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Mon, 24 Mar 2025 10:54:07 +0000 (11:54 +0100)]
Require memberId() for RaftStorage
We need a consistent way of logging, make sure RaftStorage has it.
JIRA: CONTROLLER-2134
Change-Id: I67ca04c5c7e45358276775285ea05fd58fb7ce3e
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sun, 23 Mar 2025 18:50:08 +0000 (19:50 +0100)]
Factor out RaftStorage
We are in a place where we can start pulling the persistence apart. This
takes the first step by introducing RaftStorage and dropping a number of
FIXME for future evolution.
JIRA: CONTROLLER-2134
Change-Id: I658ce51cff971e39dc53b8f25e42123ad5d78b3e
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sun, 23 Mar 2025 15:43:31 +0000 (16:43 +0100)]
Reintroduce ForwardingDataPersistenceProvider
This is a neat utility which allows us to sit in front of persistence,
forwarding to it -- without exposing our real implementations.
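The forwarding idea can be sketched as follows. This is an illustration of the general delegate pattern only: PersistenceSketch, ForwardingPersistenceSketch and the persist(String) method are hypothetical stand-ins, not the DataPersistenceProvider contract.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, much reduced stand-in for a persistence interface.
interface PersistenceSketch {
    void persist(String entry);
}

// Forwards every call to delegate(); subclasses override only what they intercept.
abstract class ForwardingPersistenceSketch implements PersistenceSketch {
    protected abstract PersistenceSketch delegate();

    @Override
    public void persist(String entry) {
        delegate().persist(entry);
    }
}

final class ForwardingDemo {
    static List<String> run() {
        List<String> stored = new ArrayList<>();
        PersistenceSketch real = stored::add; // the hidden "real" implementation

        PersistenceSketch front = new ForwardingPersistenceSketch() {
            @Override
            protected PersistenceSketch delegate() {
                return real;
            }

            @Override
            public void persist(String entry) {
                // Intercept, then forward to the delegate.
                super.persist("logged:" + entry);
            }
        };

        front.persist("a");
        return stored;
    }
}
```

Callers only ever see the forwarding front, so the real implementation can be swapped or wrapped without exposing its type.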
While we are here, also improve testApplyStateRace by scheduling
callback invocation after the delegate is done with it.
JIRA: CONTROLLER-2134
Change-Id: I3fb4a423a1302a568bc5eecb77f26d2b5d444a4f
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sun, 23 Mar 2025 13:24:30 +0000 (14:24 +0100)]
Clean up OnDemandRaftState
Use RaftRole and fix the builder pattern to be properly immutable.
Change-Id: I4630a5d920d545ca615e72dd7a0a8a017e8b517a
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sun, 23 Mar 2025 05:34:10 +0000 (06:34 +0100)]
Promote RaftState
Let's start a new package, opendaylight.raft.api, which holds a single
enumeration for now -- RaftRole. The change in name frees up 'RaftState'
for use by something that actually has state.
This also shows us that we have a bug in Example actor -- using an
illegal cast in case of IsolatedLeader.
JIRA: CONTROLLER-2134
Change-Id: I50b25b4d13cf426285face63ed8be2a8356ab83d
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sun, 23 Mar 2025 04:19:48 +0000 (05:19 +0100)]
Move cluster.notifications
Let's rehost these into sal-akka-raft for now, as that is where they are
used from. This allows us to tie 'role' with 'RaftState', leading to
improved type safety.
JIRA: CONTROLLER-2134
Change-Id: Icb4968774d89517486c50bff95393e8c68b8265b
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sun, 23 Mar 2025 04:06:30 +0000 (05:06 +0100)]
Modernize RoleChanged/LeaderStateChanged
These are pure DTOs, use a record for that. While we're here, also make
sure to require memberId()/newRole(). The situation is a bit
complicated, because we use subclassing to carry more data.
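A minimal sketch of a record that requires its components, assuming Java 16 or later. The accessor names memberId() and newRole() come from this message; the record name RoleChangedSketch and the String component types are assumptions made for illustration.

```java
import java.util.Objects;

// Hypothetical DTO: component types are assumed, not taken from the real classes.
record RoleChangedSketch(String memberId, String newRole) {
    // Compact constructor: reject nulls before the fields are assigned.
    RoleChangedSketch {
        Objects.requireNonNull(memberId, "memberId");
        Objects.requireNonNull(newRole, "newRole");
    }
}
```

Records are implicitly final, which is likely why carrying extra data through subclassing complicates the conversion.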
JIRA: CONTROLLER-2134
Change-Id: I37a486dd75c1a313161908cfb287e61e18a5a401
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sun, 23 Mar 2025 02:36:40 +0000 (03:36 +0100)]
Remove RaftActor.currentTerm()
This method is not used anywhere, remove it.
JIRA: CONTROLLER-2134
Change-Id: I1ce49c91655086470f8c083629051bf5c8dbef40
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sun, 23 Mar 2025 02:26:58 +0000 (03:26 +0100)]
Introduce resetReplicatedLog()
A ton of our tests rely on replacing ReplicatedLog. This is a
huge no-no, which is unfortunately exposed via RaftActorContext.
This patch splits off resetReplicatedLog(), which does the same thing,
except it has a different name and is amenable to being implemented
as a set of operations rather than a wholesale replacement.
JIRA: CONTROLLER-2137
Change-Id: I4774564b092fadbbd4f0f4a57f3d13e1234ec376
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sun, 23 Mar 2025 01:35:51 +0000 (02:35 +0100)]
Remove RaftActorContext.setCommitIndex()
Mass-migrate tests and eliminate the legacy method.
JIRA: CONTROLLER-2137
Change-Id: I6035aa1edf546d8d5ae519e5db3edd83ca632624
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sun, 23 Mar 2025 01:22:32 +0000 (02:22 +0100)]
Remove RaftActorContext.setLastApplied()
Mass-migrate tests and eliminate the legacy method.
JIRA: CONTROLLER-2137
Change-Id: I194b70975fe1cca97f3a81163410247d7a76e3c5
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sun, 23 Mar 2025 00:38:19 +0000 (01:38 +0100)]
Remove RaftActorContext compatibility getters
Mass-migrate tests to use ReplicatedLog for commitIndex/lastApplied.
JIRA: CONTROLLER-2137
Change-Id: I9f30b3afb0b7df05ef6238944f79fcba050e1c3d
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 22:15:18 +0000 (23:15 +0100)]
Do not reset ReplicatedLog
Replacing ReplicatedLog is a rather bad thing, as we need to be mindful
when we can and cannot cache it.
This patch takes the first step towards making it an invariant:
introduce a resetToSnapshot() method which resets the state and take
advantage of that.
We also deprecate setReplicatedLog() for removal.
JIRA: CONTROLLER-2137
Change-Id: Ifc9ce0b0910214d5d0f314f26623fc4583fd1849
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 22:38:32 +0000 (23:38 +0100)]
Reduce use of ReplicatedLog.last()
We have a few call sites which are accessing the entry, whereas they
only need its metadata. Update them to not call last().
JIRA: CONTROLLER-2137
Change-Id: If13079a9a7d95cffb70a0befb877d8dcfb859fcd
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 20:34:36 +0000 (21:34 +0100)]
Clean up replicatedLog() references
This is a follow-up for the previous patch, cleaning up references to
deprecated methods and simplifying replicated log references.
JIRA: CONTROLLER-2137
Change-Id: Id95123119a11e7571f62c45a081fbb018a288f91
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 21:57:50 +0000 (22:57 +0100)]
Lock down AbstractReplicatedLog
Clean up this class and make most methods final.
Change-Id: I65eb72794131f32d3a4c524bbafd2ea1bf64726f
JIRA: CONTROLLER-2137
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 18:55:34 +0000 (19:55 +0100)]
Move commitIndex/lastApplied to ReplicatedLog
We have maintenance split between ReplicatedLog and RaftActorContext.
Move commitIndex and lastApplied to ReplicatedLog. This requires a bit
of shuffling in tests, as replacing the log would lose the changes made.
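Why replacing the log loses state can be illustrated with a small sketch. The field names commitIndex and lastApplied come from this message; the classes ReplicatedLogSketch and ContextSketch, their accessors and the initial value of -1 are hypothetical.

```java
// Hypothetical sketch: the log itself now owns both indices.
final class ReplicatedLogSketch {
    private long commitIndex = -1; // -1 as "nothing committed yet" is an assumption
    private long lastApplied = -1;

    long getCommitIndex() {
        return commitIndex;
    }

    void setCommitIndex(long commitIndex) {
        this.commitIndex = commitIndex;
    }

    long getLastApplied() {
        return lastApplied;
    }

    void setLastApplied(long lastApplied) {
        this.lastApplied = lastApplied;
    }
}

final class ContextSketch {
    private ReplicatedLogSketch log = new ReplicatedLogSketch();

    ReplicatedLogSketch replicatedLog() {
        return log;
    }

    // Swapping the log discards whatever indices the old instance tracked.
    void setReplicatedLog(ReplicatedLogSketch newLog) {
        log = newLog;
    }
}
```

A test that sets the commit index and then installs a fresh log would observe the default value again, so the order of those two steps has to be swapped.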
JIRA: CONTROLLER-2137
Change-Id: I6a6f4bb01abc2a8fb2da47110ae73c4d4ff1d9dc
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 18:26:50 +0000 (19:26 +0100)]
Reduce context.getCurrentBehavior() callers
RaftActor has a method to talk to RaftActorContext, use that instead of
direct calls.
JIRA: CONTROLLER-2134
Change-Id: Ib3ca80d6dacbb636f26f322ef99c5414a5a453e9
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 17:45:44 +0000 (18:45 +0100)]
Simplify AbstractReplicatedLog constructor
The constructor has a ton of arguments for a single caller, really. Move
the code into ReplicatedLogImpl.
JIRA: CONTROLLER-2134
Change-Id: Ia1184fe5d129ae72df6e777e6531371dd23b410a
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 17:17:19 +0000 (18:17 +0100)]
Shorten call to isRecoveryApplicable()
We do not need to go through RaftActorContext, as RaftActor has the
information available.
JIRA: CONTROLLER-2134
Change-Id: I0e7f71ad7819dddfe5b2e25cf2e77ded1161375d
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 17:01:52 +0000 (18:01 +0100)]
Update RaftActorContext.getCluster()
Use a nullable return instead of an optional.
Change-Id: I74f34c440e9364dee900b41fa39a96a3b31222c3
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 16:54:43 +0000 (17:54 +0100)]
Remove RaftActorContext.actorSelection()
This method is used only internally, as RaftActor can access its own
ActorContext. We also lock down a few methods and remove duplicate
implementation.
Change-Id: I08546d4f11bdb2ff62f6ab5c37957368b658c90b
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 14:20:05 +0000 (15:20 +0100)]
Move SendHeartBeat
This message is used internally by AbstractLeader, move it there for
clarity. Also clean up message dispatch.
JIRA: CONTROLLER-2134
Change-Id: Ie18c478c24136c09910e887382b683fa5ecd3105
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 14:01:48 +0000 (15:01 +0100)]
Hide SnapshotComplete
This is a purely-internal message. Hide in SnapshotManager to prevent
shenanigans.
JIRA: CONTROLLER-2134
Change-Id: Ia3fbd3d6a76cc4c4b49ea99c93fc8545a793918f
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 13:55:29 +0000 (14:55 +0100)]
Expose AbstractLeader.sendInstallSnapshot()
Do not abuse handleMessage() when dispatching from ShardManager to
AbstractLeader, but call the target method directly. This allows us to
completely hide SnapshotBytes -- and rename them back to SnapshotHolder.
JIRA: CONTROLLER-2136
Change-Id: I34962cc2060603ce1245581d2cd9727590f91e73
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 13:43:31 +0000 (14:43 +0100)]
Improve leader check
Comparing memberId() with getLeaderId() is really a check to see if we
are a leader. Replace it with an instanceof check.
JIRA: CONTROLLER-2134
Change-Id: Iabc79d7b80961588d1b449c1138063bf01f6c5a5
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 12:57:41 +0000 (13:57 +0100)]
Move SnapshotBytes propagation
We no longer need a Snapshot, hence we can move the propagation logic to
the single method which is invoking it.
JIRA: CONTROLLER-2134
Change-Id: I87544711d1a69825fa920f23c2893d58f757c753
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 12:48:24 +0000 (13:48 +0100)]
Refactor SendInstallSnapshot
SendInstallSnapshot contains a Snapshot, from which we only extract last
applied index/term to instantiate SnapshotHolder.
Merge SendInstallSnapshot and SnapshotHolder into a replacement class
called SnapshotBytes and adjust callers accordingly -- making the
contract simpler and easier to test.
JIRA: CONTROLLER-2134
Change-Id: I3d23176dda2322595679b4854cb8dabfb8d480f7
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 12:13:02 +0000 (13:13 +0100)]
Do not use Optional for snapshotHolder
We are in complete control of the lifecycle here, so Optional does not
provide any benefit. Ditch it and use a simple field.
JIRA: CONTROLLER-2134
Change-Id: I199a7d21fbec3adec9092024c6e5ca05376a5b9a
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 11:48:15 +0000 (12:48 +0100)]
Clean up SnapshotHolder
The holder does not need a full snapshot and can be a simple record.
Also refactor test methods to hide it.
JIRA: CONTROLLER-2134
Change-Id: I84de057724003f844f0cb20aa60a32228298836c
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 11:31:33 +0000 (12:31 +0100)]
Move SendInstallSnapshot
This class is used only by AbstractLeader, move it there and modernize
it.
JIRA: CONTROLLER-2134
Change-Id: I09282c69b029b3b7f52ad6024aafbf656b4dfa84
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 10:50:13 +0000 (11:50 +0100)]
Remove ReplicatedLogImpl.create()
Turn the two methods into simple constructors, updating callers.
JIRA: CONTROLLER-2134
Change-Id: I0d3a7836ae469363d7953eaffffe24d384eae3ad
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 10:44:20 +0000 (11:44 +0100)]
Annotate SnapshotManager.memberId()
Returned String cannot be null, annotate that fact.
Change-Id: Ic7148a054309df557b99267f5cff74a21f548abe
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 10:37:47 +0000 (11:37 +0100)]
Inline memberId in CandidateTest
We have a few stray callers to RaftActorContext.getId(), inline the
"candidate" memberId.
JIRA: CONTROLLER-2134
Change-Id: I33220c222e0eacb300cb5306e0a218068f57bda6
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 10:32:05 +0000 (11:32 +0100)]
Inline FollowerTest memberId checks
We are using RaftActorContext.getId() to arrive at the constant "follower"
-- inline it instead.
JIRA: CONTROLLER-2134
Change-Id: If427fe33fb5b4e5ffb8032d6647223e05d16c5ee
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 10:27:13 +0000 (11:27 +0100)]
Rename RaftActorBehavior.getId()
We are using memberId() in other places, let's make sure to be
consistent. We also update callers of RaftActorContext.getId() to use
this method where possible.
JIRA: CONTROLLER-2134
Change-Id: If9afee1c546a60073e182f1af23a67a0c1e77fa9
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 10:26:11 +0000 (11:26 +0100)]
Clean up CandidateTest a bit
MessageCollectorActor.expectFirstMatching() calls can be a single-line
affair, do that.
Change-Id: I4fd6e68225d4e8d5a137212473c7df339241a5f0
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 10:01:21 +0000 (11:01 +0100)]
Reduce calls to RaftActorContext.getId()
RaftActorRecoverySupport has LocalAccess, hence we can get memberId from
there.
RaftActorServerConfigurationSupport has RaftActor, so let's get memberId
from there.
JIRA: CONTROLLER-2134
Change-Id: I78da1043ee8e52cac987fdbc66d5b3bdc1073a51
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 09:07:25 +0000 (10:07 +0100)]
Remove Snapshot.getElection{Term,VotedFor}
These methods are not used anywhere, remove them.
Change-Id: I5e0cf0d1979a7792651b6721453cdac6073b5b01
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 08:57:45 +0000 (09:57 +0100)]
Modernize CaptureSnapshot
Use List.of() and make the class final.
Change-Id: I3b23bf4d0ef600ce6117e1b0bdd51e607df95c27
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 02:41:34 +0000 (03:41 +0100)]
Remove SnapshotManager.computeLastAppliedEntry()
This method has a single caller, inline it.
JIRA: CONTROLLER-2134
Change-Id: Ife65ee3cdd3415476004d7b950801206e825f5f3
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Fri, 21 Mar 2025 19:31:41 +0000 (20:31 +0100)]
Split up computeLastAppliedEntry()
There are two distinct cases here: we either have followers or we do
not. Split the two cases into separate methods.
JIRA: CONTROLLER-2134
Change-Id: Ice6247f29f30ca7382d4c6c95a3ff40b9c486585
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Fri, 21 Mar 2025 18:48:28 +0000 (19:48 +0100)]
Switch AbstractReplicatedLogTest to JUnit5
This is a trivial conversion.
Change-Id: I76a065c7a6de9d9478d9d6ddc47eb0c8b054d719
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Fri, 21 Mar 2025 18:40:36 +0000 (19:40 +0100)]
Split out MockReplicatedLog
This class deserves to be outside of the test class, simplifying its
name.
JIRA: CONTROLLER-2134
Change-Id: I39052e6f4b6cf05dddbc025b2898239a1ff2d24d
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Fri, 21 Mar 2025 18:37:53 +0000 (19:37 +0100)]
Move SnapshotManager.computeLastAppliedEntry()
This method is tightly coupled to a ReplicatedLog and independent of
SnapshotManager. Move it to AbstractReplicatedLog for further evolution.
JIRA: CONTROLLER-2134
Change-Id: Ic87e4896094c166ff53a62c0d56eb1a0fbf38b1f
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Fri, 21 Mar 2025 17:12:48 +0000 (18:12 +0100)]
Modernize FollowerInitialSyncUpStatus
This message should be just a plain record. Also clean up the
SyncStatusTracker internals to improve the logic, coupled with a
long-overdue clean up of SyncStatusTrackerTest.
Change-Id: If0cc3c838e433ba87fe614327f289ae2b0aa39ac
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Fri, 21 Mar 2025 16:45:24 +0000 (17:45 +0100)]
Hide SyncStatusTracker
This class is only used by Follower, hence it does not need to be
exposed outside of the package.
Change-Id: I9e770e2c4b9fdc78d44aef8015c5e42d43f5f0dd
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Fri, 21 Mar 2025 16:36:06 +0000 (17:36 +0100)]
Lock down Follower
Do not allow overriding methods other than those the tests need.
Change-Id: I0ab961116d1ee3be82060a821363f1905d51ff32
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Fri, 21 Mar 2025 16:29:03 +0000 (17:29 +0100)]
Improve AbstractLeader class hierarchy
AbstractLeader.handleAppendEntriesReply() always results in 'this' being
returned and is overridden in the three subclasses.
Rename it to processAppendEntriesReply(), without the ability to change
behavior and make it final. The subclasses then use it as a common
utility, doing their own thing as needed.
Since Leader is subclassed in tests, we lock down all its methods except
the single one that is being overridden.
Change-Id: I594bebedfa612f7e946d97040ea55c053c2c9f3d
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Fri, 21 Mar 2025 16:08:03 +0000 (17:08 +0100)]
Improve sal-akka-raft test assertions
We have a number of assertions which just check RaftState value -- but
we really should be checking which behaviour we are servicing.
Change-Id: If2a9d898b0e237e7c1886d298c4584b94149f65e
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Fri, 21 Mar 2025 15:29:36 +0000 (16:29 +0100)]
Refactor SwitchBehavior
We only support switching to leader or to follower, whereas our messages
allow for any RaftState -- leading to us needing to place explicit
guards.
This patch makes SwitchBehavior a sealed interface with two possible
specializations: BecomeFollower and BecomeLeader. The result is more
expressive code without the need for RaftActorBehavior.createBehavior()
dispatch.
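The shape described above could look like the following sketch, assuming Java 21 or later for the pattern-matching switch. Only the names SwitchBehavior, BecomeFollower and BecomeLeader come from this message; the newTerm component and the describe() helper are hypothetical.

```java
// Hypothetical sketch: only the three type names come from the commit message.
sealed interface SwitchBehavior permits BecomeFollower, BecomeLeader {
    long newTerm(); // assumed payload, for illustration only
}

record BecomeFollower(long newTerm) implements SwitchBehavior {}

record BecomeLeader(long newTerm) implements SwitchBehavior {}

final class SwitchBehaviorDemo {
    // The switch is exhaustive over the sealed hierarchy, so there is no
    // default branch and no guard against unsupported target roles.
    static String describe(SwitchBehavior message) {
        return switch (message) {
            case BecomeFollower follower -> "follower@" + follower.newTerm();
            case BecomeLeader leader -> "leader@" + leader.newTerm();
        };
    }

    public static void main(String[] args) {
        System.out.println(describe(new BecomeLeader(3)));
    }
}
```

Adding a third specialization later would make every such switch fail to compile until it handles the new case, which is the type-safety gain over passing an arbitrary state value.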
Change-Id: Ib9da61012f1814ef6786992aa2ddbe1d2c72040a
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Fri, 21 Mar 2025 14:41:42 +0000 (15:41 +0100)]
Reduce use of getRaftActorContext()
Internal use here is just plain clutter. Fix that.
Change-Id: I826424aac10f00f1162e5909f579238044b46985
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Fri, 21 Mar 2025 14:34:31 +0000 (15:34 +0100)]
Hide RaftActorBehavior.switchBehavior()
Eliminate indirection through RaftState, allowing us to use a single
method for switching behaviors.
Change-Id: I90ff03ed1dc6f82d8b1d917ff6c7ab97cd474ff1
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Fri, 21 Mar 2025 14:04:31 +0000 (15:04 +0100)]
Clean up Follower reinitialization
Use an instanceof pattern to talk directly to Follower to obtain a copy
-- making it a tad cleaner.
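A sketch of the instanceof pattern in question, assuming Java 16 or later. All types here (BehaviorSketch, FollowerSketch, CandidateSketch) and the copy() and leaderId() methods are hypothetical stand-ins for the real behavior classes.

```java
// Hypothetical stand-ins for the behavior hierarchy.
interface BehaviorSketch {}

final class FollowerSketch implements BehaviorSketch {
    private final String leaderId;

    FollowerSketch(String leaderId) {
        this.leaderId = leaderId;
    }

    String leaderId() {
        return leaderId;
    }

    FollowerSketch copy() {
        return new FollowerSketch(leaderId);
    }
}

final class CandidateSketch implements BehaviorSketch {}

final class ReinitSketch {
    static BehaviorSketch reinitialize(BehaviorSketch current) {
        // The pattern variable binds the narrowed type: no separate cast and
        // no dispatch through a generic interface method.
        if (current instanceof FollowerSketch follower) {
            return follower.copy();
        }
        return current;
    }
}
```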
Change-Id: I34213d4d690a6bdce96febd6e2f51d77ee14effd
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Fri, 21 Mar 2025 13:49:47 +0000 (14:49 +0100)]
Lock down behaviors
A number of improvements:
- hide constructors
- enforce leader transitions
- make IsolatedLeader/PreLeader final
Change-Id: I54101d56af4a7b99051d14444c01d39bf2c35c6f
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Fri, 21 Mar 2025 12:54:53 +0000 (13:54 +0100)]
Button down ShardSnapshotCohort
The output stream is now always non-null: button this down, simplifying
things a little bit.
JIRA: CONTROLLER-2134
Change-Id: I339c13742e4b7e02b353c2a5fd8d879deb28bb29
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Fri, 21 Mar 2025 10:39:21 +0000 (11:39 +0100)]
Remove SnapshotManager.convertSnapshot()
This is a useless indirection, just inline it into the single caller.
JIRA: CONTROLLER-2134
Change-Id: I87830a1e57623c630de037c8bfe345792c33b398
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Thu, 20 Mar 2025 21:35:36 +0000 (22:35 +0100)]
Require output stream
Make sure we specialize createSnapshot() to be the alternative to
takeSnapshot().
Change-Id: I75caf958c6bf2969dc2fd604508b08cbc1e670a4
JIRA: CONTROLLER-2134
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Thu, 20 Mar 2025 15:10:49 +0000 (16:10 +0100)]
Use takeSnapshot() in normal stream captures
We only need asynchronous callout during captureToInstall(), otherwise
we can use the much simpler interface, which does not need a turnaround
through CaptureSnapshotReply.
This necessitates a bit of an update to the test suites, where we switch
from capture() to captureToInstall() to catch the negative scenarios.
Integration-level tests need to be updated as well: since we now invoke
persist() directly, we need to fiddle with SaveSnapshotSuccess instead
of CaptureSnapshotReply.
JIRA: CONTROLLER-2134
Change-Id: I18ffb7c062f52474d5df3c581d997072dac04e0e
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Fri, 21 Mar 2025 10:07:57 +0000 (11:07 +0100)]
Fix ShardTest capture trigger
The test here operates on a live actor, but it invokes
ShardManager.capture() outside of actor confinement. This means that if
any processing in ShardManager decides to send a message to the actor,
the actor will process asynchronously.
Fix this by tickling ShardManager via executeInSelf().
Change-Id: I6a26d98b14ee2d4101f8b036a573baa41d5148ee
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Thu, 20 Mar 2025 16:48:12 +0000 (17:48 +0100)]
Override takeSnapshot() in mocks
We have two specialized mocks, which override default forwarding
behaviour of MockRaftActor.createSnapshot(). This adds overrides of
takeSnapshot(), so that the behaviour matches.
JIRA: CONTROLLER-2134
Change-Id: Idde23e1ba42d98b7f228546b25b03063ba3998ba
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Thu, 20 Mar 2025 16:45:39 +0000 (17:45 +0100)]
Clean up whitespace
We have a superfluous empty line and a space -- remove both.
Change-Id: Ib915c8ddab13eaee39ccc0252b26eb711aa41664
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Thu, 20 Mar 2025 16:06:54 +0000 (17:06 +0100)]
Enforce Snapshot consistency
Make sure we do not allow nulls where inappropriate.
Change-Id: Ibb3f664f4702b6f8e87ac67b9450f63c7d3b21de
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Thu, 20 Mar 2025 16:05:34 +0000 (17:05 +0100)]
Clean up ternary operator use
The colon should be on a new line, fix that.
Change-Id: I4d4a51664495bc6d949429c7b4f1d60c66994d63
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Thu, 20 Mar 2025 16:05:09 +0000 (17:05 +0100)]
Remove unused logger
We are not using this logger anymore, remove it.
Change-Id: I38a6749dcc02cd7cc19c86c81f0b2c1c98a435bc
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Thu, 20 Mar 2025 14:27:13 +0000 (15:27 +0100)]
Remove GetSnapshot.timeout()
We are using takeSnapshot(), which means there is absolutely no point in
propagating the timeout.
What we do instead is we use the specified timeout for Patterns.ask(),
which actually leads to a more correct behaviour.
JIRA: CONTROLLER-2134
Change-Id: I2b3fdd9af695df31562a2c620735544721c0e078
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Thu, 20 Mar 2025 14:07:28 +0000 (15:07 +0100)]
Use takeSnapshot() to service GetSnapshot
The code to service GetSnapshot requests does not serialize the snapshot
itself: use takeSnapshot() to service it.
This renders GetSnapshotReplyActor superfluous, hence we remove it as
well as timeout tests.
JIRA: CONTROLLER-2134
Change-Id: Iabeb1a042fa3410d31d4c1f0298b3dedb8cafc72
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Thu, 20 Mar 2025 13:54:12 +0000 (14:54 +0100)]
Add RaftActorSnapshotCohort.takeSnapshot()
We only have a complicated way of acquiring snapshots, but there are use
cases where we actually want to take a snapshot without serializing
it -- which can be a convenient synchronous operation.
This patch adds that takeSnapshot(), which is much easier to use.
JIRA: CONTROLLER-2134
Change-Id: If0a0af0b21e8a90a06eff14f3a23772d3af07450
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Thu, 20 Mar 2025 12:34:38 +0000 (13:34 +0100)]
Reduce use of context.getId()
We have memberId() available, use that instead of getId().
Change-Id: I2416ae2a3d9feaff3a99a46ba7835c0931429c06
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Thu, 20 Mar 2025 12:18:35 +0000 (13:18 +0100)]
Reduce use of Optional
While Optional looks cool, its use in arguments is not. Ditch it in
favor of @Nullable in SnapshotManager.commit().
JIRA: CONTROLLER-2134
Change-Id: Id03af74b8a0cff87ebb96b15a7a1ac5cb3744e96
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Thu, 20 Mar 2025 10:46:12 +0000 (11:46 +0100)]
Make RaftActorSnapshotCohort type-safe
We are dancing around with State, where not every case is really
possible. We are about to make State handoff more dicey, so this patch
ensures that we have the cohorts advertise their supported class.
JIRA: CONTROLLER-2134
Change-Id: Iaea1e580ea8c806ff147a428330a6a9b407e730b
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Thu, 20 Mar 2025 10:31:56 +0000 (11:31 +0100)]
Split out MockSnapshotState
This is a simplistic class, use a standalone record for it.
JIRA: CONTROLLER-2134
Change-Id: Ida1f5abf59df589972eefddccc07b9f87f7ec33b
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Thu, 20 Mar 2025 09:37:53 +0000 (10:37 +0100)]
Split off request allocation
Move request allocation to the three callers, improving modularity of
the methods. Also ensure that public methods are clustered together.
JIRA: CONTROLLER-2134
Change-Id: I879594f328d96028294c599c1b6845bd34e91a37
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Thu, 20 Mar 2025 09:17:12 +0000 (10:17 +0100)]
Invert state checking
We have an over-generalized capture() method, which we will need to break
down. This takes the first step to perform basic valid-or-bail first.
JIRA: CONTROLLER-2134
Change-Id: I590bea5642e8779edd3f65927821bc8ecb63fd4c
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Thu, 20 Mar 2025 08:06:13 +0000 (09:06 +0100)]
Close executor via try-with-resources
ExecutorService is now AutoCloseable, take advantage of that.
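For reference, a minimal sketch of the idiom, assuming Java 19 or later, where ExecutorService gained a default close() method. The class ExecutorCloseDemo and the trivial task are made up for illustration.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class ExecutorCloseDemo {
    static int runTask() {
        // close() is invoked on exit from the block; it shuts the executor
        // down and waits for previously submitted tasks to finish.
        try (ExecutorService executor = Executors.newSingleThreadExecutor()) {
            return executor.submit(() -> 6 * 7).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("task failed", e);
        }
    }
}
```

This replaces the explicit shutdown-in-finally boilerplate that older code needed.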
Change-Id: Idcc244d1945d157976b0e402f7d9e2eba733167f
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Thu, 20 Mar 2025 02:54:19 +0000 (03:54 +0100)]
Improve RaftActor(Context) interactions
RaftActor.getSnapshot() can be simplified by:
- using memberId()
- capturing common bits from RaftActorContext before capturing the
snapshot
- using ActorContext.actorOf() directly
This eliminates the need for RaftActorContext.actorOf(), which is now
removed.
A further change is to use ActorRef.noSender(), as the sender information
is completely unused by GetSnapshotReply recipients.
JIRA: CONTROLLER-2134
Change-Id: Ieb6eaac87984fdb5434d242d4b97887451a8b6c4
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Wed, 19 Mar 2025 20:20:48 +0000 (21:20 +0100)]
Clean up SnapshotManager.capture() tests
Use ImmutableRaftEntryMeta instead of SimpleReplicatedLogEntry, reducing
the amount of clutter.
JIRA: CONTROLLER-2134
Change-Id: Ibe67bfad07fd510f05c123bbcfc6a04341888648
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Wed, 19 Mar 2025 15:12:22 +0000 (16:12 +0100)]
Remove RaftActorContextImpl.close()
We are indirecting to current behavior, just inline this logic in
RaftActor.
Change-Id: I2ed629cd67e032058ecf922c9b51a31cea7e8d15
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Wed, 19 Mar 2025 14:43:39 +0000 (15:43 +0100)]
Move CommitSnapshot
CommitSnapshot needs to be visible for testing purposes, move it to
SnapshotManager, simplifying RaftActorSnapshotMessageSupport a bit more.
JIRA: CONTROLLER-2134
Change-Id: I15d0d841dc8bc0396c37e77d26e43ff7a17d232f
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Wed, 19 Mar 2025 14:04:59 +0000 (15:04 +0100)]
Simplify RaftActorSnapshotMessageSupport
We really only need SnapshotManager here, let's move things around to
simplify things a bit.
JIRA: CONTROLLER-2134
Change-Id: I5d50d9f75a567c2eed5f5296308d5e0fb3758e9c
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Wed, 19 Mar 2025 13:43:13 +0000 (14:43 +0100)]
Move GetSnapshot support
GetSnapshot has nothing to do with actual snapshotting, it just happens
to use the same mechanism. Move the implementation to RaftActor, making
RaftActorSnapshotMessageSupport independent of RaftActorSnapshotCohort
et al.
JIRA: CONTROLLER-2134
Change-Id: I574e02dc1bf268d8d78077fc4f15b939a983ee8d
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Wed, 19 Mar 2025 14:53:05 +0000 (15:53 +0100)]
Clean up ExampleActor a bit
Improve mapping to/from state: make sure we know it is Serializable and
verify incoming snapshot.
Change-Id: I1813973102b98ddf6153bc0f0a97f267335b329e
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>