Robert Varga [Thu, 10 Apr 2025 09:23:01 +0000 (11:23 +0200)]
Do not mask IOException in JournalSegment constructor
Make sure callers are aware of an error being possible here, leading
to improved error handling in SegmentedByteBufJournal.
JIRA: CONTROLLER-2137
Change-Id: I03c6080d9628082a039e6885b8dbcebf5a206345
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Thu, 10 Apr 2025 09:15:19 +0000 (11:15 +0200)]
Report IOException from SegmentedByteBufJournal.size()
Do not use UncheckedIOException to mask errors here.
JIRA: CONTROLLER-2137
Change-Id: I460de64f4a93f12777fcda96aa235f062db0cf38
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Thu, 10 Apr 2025 07:54:45 +0000 (09:54 +0200)]
Report IOException from segment operations
We are reporting an UncheckedIOException on IO failures. Push this
masking one level up, improving SegmentedJournalActor error handling.
JIRA: CONTROLLER-2137
Change-Id: I135ecdbe44ed9daf684a635e87d6e7f65b7b6779
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
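A minimal sketch of the pattern this change describes, not the actual controller code: the segment-level operation declares IOException, and any masking into UncheckedIOException happens one level up, in the caller. SegmentSketch, JournalSketch and their methods are hypothetical names used only for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a journal segment: reports I/O failures as checked exceptions.
final class SegmentSketch {
    private final boolean failing;

    SegmentSketch(boolean failing) {
        this.failing = failing;
    }

    // The low-level operation no longer hides the failure behind an unchecked exception.
    int read() throws IOException {
        if (failing) {
            throw new IOException("simulated disk failure");
        }
        return 42;
    }
}

// Hypothetical caller one level up: the only place where masking still happens.
final class JournalSketch {
    static int readOrMask(SegmentSketch segment) {
        try {
            return segment.read();
        } catch (IOException e) {
            // The caller decides how to react; here it wraps, but it could also recover or report.
            throw new UncheckedIOException(e);
        }
    }
}
```

Because read() declares the checked exception, the compiler forces every caller to make an explicit choice about the failure.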
Robert Varga [Thu, 10 Apr 2025 09:05:31 +0000 (11:05 +0200)]
Close deleteJournal reader
We have a dangling reader here, make sure to close it when we are done.
Change-Id: I7b669bca3cd3cfaafb7811c6b83c941bdd04d897
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
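One common way to avoid a dangling reader is try-with-resources. The sketch below assumes the reader is AutoCloseable; ReaderSketch and DeleteJournalSketch are hypothetical names, and the log does not say which mechanism the actual fix uses.

```java
// Hypothetical reader which tracks whether it has been closed.
final class ReaderSketch implements AutoCloseable {
    private boolean closed;

    long lastIndex() {
        if (closed) {
            throw new IllegalStateException("reader is closed");
        }
        return 7;
    }

    boolean isClosed() {
        return closed;
    }

    @Override
    public void close() {
        closed = true;
    }
}

final class DeleteJournalSketch {
    // try-with-resources closes the reader on both normal and exceptional exit.
    static long lastIndexOf(ReaderSketch reader) {
        try (reader) {
            return reader.lastIndex();
        }
    }
}
```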
Robert Varga [Thu, 10 Apr 2025 07:27:45 +0000 (09:27 +0200)]
Remove StorageException
Use UncheckedIOException, as all remaining users are just masking an
existing IOException.
JIRA: CONTROLLER-2137
Change-Id: I4886ee6e13a07759da0b4e4df698ab5ebabeb9bd
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Thu, 10 Apr 2025 07:25:25 +0000 (09:25 +0200)]
Factor out StorageExhaustedException
Use an IOException subclass to report storage exhausted.
JIRA: CONTROLLER-2137
Change-Id: I3a3b886a0d1359c5130fdfa4731948aa86f51dc4
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
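A rough illustration of why an IOException subclass is convenient here: callers can single out exhaustion where they care and otherwise handle it like any other I/O failure. The constructor, message format and the AppendSketch helper are assumptions made for this sketch; only the exception's name comes from the commit.

```java
import java.io.IOException;

// Hypothetical re-creation: the real class's constructor and package are not shown in this log.
final class StorageExhaustedException extends IOException {
    private static final long serialVersionUID = 1L;

    StorageExhaustedException(String message) {
        super(message);
    }
}

final class AppendSketch {
    // Hypothetical append: fails with the dedicated subclass when the entry does not fit.
    static void append(int remainingBytes, int entrySize) throws IOException {
        if (entrySize > remainingBytes) {
            throw new StorageExhaustedException("need " + entrySize + " bytes, have " + remainingBytes);
        }
    }

    // Exhaustion can be handled specifically, ahead of the generic IOException handler.
    static String describe(int remainingBytes, int entrySize) {
        try {
            append(remainingBytes, entrySize);
            return "appended";
        } catch (StorageExhaustedException e) {
            return "exhausted";
        } catch (IOException e) {
            return "failed";
        }
    }
}
```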
Robert Varga [Wed, 9 Apr 2025 12:46:15 +0000 (14:46 +0200)]
Report IOException from EntryWriter.append()
Eliminate a silent exception in favor of being explicit.
JIRA: CONTROLLER-2137
Change-Id: I8640de39fc18c4755919788a945d64a50abc0531
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Wed, 9 Apr 2025 13:50:49 +0000 (15:50 +0200)]
Simplify SharedFileBackedOutputStream callback
We only need a Runnable, not two separate objects.
Change-Id: I8d6c59ef1114ffc2795222349bfcd5618ff262a5
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Wed, 9 Apr 2025 11:18:12 +0000 (13:18 +0200)]
Fixup applyIdentifierFor()
Simplify the declaration, so that SpotBugs does not get confused.
Change-Id: Ib79d9ea9fd4cb90d7b31a3743fecdee3e91f0b3e
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Wed, 9 Apr 2025 09:36:33 +0000 (11:36 +0200)]
Refactor base.messages.ApplyState
This is a pure DTO, reduce its proliferation, opting for methods with
explicit arguments instead.
JIRA: CONTROLLER-2137
Change-Id: I17f87dd6600767996e957ca6111e7691f652951b
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Wed, 9 Apr 2025 08:34:05 +0000 (10:34 +0200)]
Refactor getApplyStateFor()
Reduce ApplyState proliferation, as we only need to pick up the
Identifier.
JIRA: CONTROLLER-2137
Change-Id: I64c3c84c76dbbb77d51a11f2efc1e3b766d728c8
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Wed, 9 Apr 2025 07:34:39 +0000 (09:34 +0200)]
Refactor MockPayload
Split it out into its own file, with MockCommand as the new name.
Change-Id: Ibee1897a4385a7da756b81c0efd180a2726a2b8b
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Tue, 8 Apr 2025 17:35:01 +0000 (19:35 +0200)]
Pull down Shard.persistPayload()
The decision whether or not to go through persistence/replication for a
particular submitCommand() is internal to RaftActor.
Pull down the logic from Shard down to RaftActor, allowing us to hide
RaftActor.hasFollowers().
JIRA: CONTROLLER-2137
Change-Id: Ic8fe37e19a172e561c41c370b7a7d4a17d267ea3
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Tue, 8 Apr 2025 21:46:38 +0000 (23:46 +0200)]
Fix raft-journal module info
ByteBuf implements ReferenceCounted, but is not a JPMS module, so we
need to mirror this dependency ourselves.
Change-Id: Ieedf559a5de734f249f208d330dcf39df118b29a
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Tue, 8 Apr 2025 16:57:47 +0000 (18:57 +0200)]
Drop ActorRef from ApplyState
ApplyState is carrying an ActorRef, which goes unused: the core driver
is Identifier here.
Drop ActorRef from ApplyState/Replicate and rename methods to clarify
lifecycle:
- persistData() becomes submitCommand()
- applyState() becomes applyCommand()
JIRA: CONTROLLER-2137
Change-Id: Ie6a23daadf17fcf5af2a947a8ab09aa199bdd8ab
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Tue, 8 Apr 2025 13:01:53 +0000 (15:01 +0200)]
Remove a superfluous constructor
JournalSegmentWriter is only ever instantiated from Inactive state,
but let's lower the tangle index by eliminating the Inactive-taking
constructor.
Change-Id: I5205167f9c21d8d639babfd338abdc5bbad9108a
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Tue, 8 Apr 2025 12:03:24 +0000 (14:03 +0200)]
Rename CompressionSupport to CompressionType
This is a simple name adjustment.
JIRA: CONTROLLER-2134
Change-Id: I0918e6d3791c55190b65e28bfd8dc85f03793da9
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Tue, 8 Apr 2025 08:46:04 +0000 (10:46 +0200)]
Move InputOutputStreamFactory
This class is only used by LocalSnapshotStore, move it to the same
package.
Change-Id: I3caceae1b54672c0a09655c6d1d7b86c8c5b7151
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Mon, 7 Apr 2025 20:27:14 +0000 (22:27 +0200)]
Switch StateSnapshot.Reader to InputStream
This isolates the guesswork around InstallSnapshot content, making it
something internal to sal-akka-raft.
JIRA: CONTROLLER-2134
Change-Id: I48b8b2089f59d4ba21601f43187ad4159bec4ee1
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Tue, 8 Apr 2025 05:27:16 +0000 (07:27 +0200)]
Remove ReplicatedLogEntry.getData()
Replace all users with a call to command().
JIRA: CONTROLLER-2137
Change-Id: Ie21c2cd0de59058ce3bb7f3c42ae133dc9fbac2d
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Mon, 7 Apr 2025 15:42:52 +0000 (17:42 +0200)]
Add EntryStore and LogEntry
Split out the methods supporting entries and define a baseline for what a
LogEntry looks like.
Naming is adjusted to match the RAFT paper, i.e. a LogEntry carries a
'command', not 'data'. This paves the way for a potential EntrySource
interface.
This clarifies RaftActorSnapshotCohort's relationship with
StateSnapshot, introducing a StateSnapshot.Support to hold the
indirection to reader/writer.
JIRA: CONTROLLER-2137
Change-Id: I1513ab5ba8533bc5816294b15a4c5fcc19b974a2
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Mon, 24 Mar 2025 10:56:26 +0000 (11:56 +0100)]
Expand PekkoRaftStorage
Teach RaftStorage about its root directory and consult it in
PekkoRaftStorage.
JIRA: CONTROLLER-2134
Change-Id: I8f6265d697f1f1a237109b0c875050824827995c
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Mon, 7 Apr 2025 08:55:22 +0000 (10:55 +0200)]
Do not pass ActorContext to ShardSnapshotCohort
We are no longer creating an actor, hence we do not need ActorContext.
JIRA: CONTROLLER-2134
Change-Id: Ic46b3a6abc367eeb7cd314161c15a0a32d6bf19e
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Mon, 7 Apr 2025 08:50:32 +0000 (10:50 +0200)]
Fix a raw type
Guarantee a ShardSnapshotCohort return, fixing a warning.
JIRA: CONTROLLER-2134
Change-Id: I4606cac4487e34a89f3cdbeb667f4094bb704e4c
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Fri, 4 Apr 2025 19:41:41 +0000 (21:41 +0200)]
Split out SnapshotStore methods
Move more methods from DataPersistenceProvider to SnapshotStore, in
order to isolate the scope of changes.
JIRA: CONTROLLER-2134
Change-Id: I39aff5675b0515ee5f35e15ff130ed883f81027d
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Thu, 3 Apr 2025 14:25:45 +0000 (16:25 +0200)]
Clean up streamToInstall()
Rather than having a clunky BiConsumer, use a sealed class hierarchy to
communicate success/error.
Change-Id: I0f8305f109d95c8ed9efb21d495a34889613d99c
JIRA: CONTROLLER-2134
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
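To illustrate the idea, here is a small sketch of reporting success or error through a sealed hierarchy rather than a two-argument callback. It uses a sealed interface with records for brevity; InstallResult, its two variants and StreamToInstallSketch are hypothetical names, not the types introduced by this change.

```java
// Hypothetical result type: a value is exactly one of the two permitted shapes.
sealed interface InstallResult permits InstallResult.Success, InstallResult.Failure {
    record Success(byte[] bytes) implements InstallResult {}

    record Failure(Exception cause) implements InstallResult {}
}

final class StreamToInstallSketch {
    // With a two-argument callback, "both set" and "neither set" are representable;
    // with a sealed result, the receiver only ever sees one well-formed case.
    static String describe(InstallResult result) {
        if (result instanceof InstallResult.Success success) {
            return "installed " + success.bytes().length + " bytes";
        }
        if (result instanceof InstallResult.Failure failure) {
            return "failed: " + failure.cause().getMessage();
        }
        throw new IllegalStateException("unreachable for a sealed hierarchy");
    }
}
```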
Robert Varga [Thu, 3 Apr 2025 18:36:14 +0000 (20:36 +0200)]
Refactor FileBackedOutputStream
The lifecycle here is a bit hairy. Improve it by keeping track of what
state we are in.
While we are in the area, split out TransientFile(StreamSource), which
is backed by a temporary file.
Major improvement here is TransientFile, which acts as the GC anchor to
a file -- eliminating the need to hold on to FileBackedOutputStream.
JIRA: CONTROLLER-2134
Change-Id: I3fbdcca3e927551572a38dd4d0268b0d56b68aa9
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Thu, 3 Apr 2025 11:43:01 +0000 (13:43 +0200)]
Remove superfluous openBufferedStream()
StreamSource is already delegating to openStream(), hence there is no
need to do the same in other implementations.
JIRA: CONTROLLER-2134
Change-Id: I8b733e342f7bbb5fac8de755c6ad973509eb27db
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Thu, 3 Apr 2025 11:33:46 +0000 (13:33 +0200)]
Fix RaftStorage actor confinement violation
We are invoking a callback from a background thread. This should be
routed via executeInSelf() to ensure we execute in proper context.
JIRA: CONTROLLER-2134
Change-Id: I92a40ebd7ed170ae7a77d7923eedefe285978692
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
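The confinement rule can be sketched with plain executors, assuming the actor is modelled as a single-threaded executor: background work must not touch actor state directly, so its completion callback is handed back to the actor's own thread. ConfinedActorSketch and saveInBackground() are hypothetical, and executeInSelf() here is only a stand-in for the real method of that name.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical single-threaded "actor": all callbacks must run on its own thread.
final class ConfinedActorSketch implements AutoCloseable {
    private final ExecutorService self =
        Executors.newSingleThreadExecutor(r -> new Thread(r, "actor-thread"));
    private final ExecutorService background =
        Executors.newSingleThreadExecutor(r -> new Thread(r, "io-thread"));

    // Stand-in for executeInSelf(): hands the task back to the actor's own thread.
    void executeInSelf(Runnable task) {
        self.execute(task);
    }

    // Runs work in the background, but routes the completion callback via executeInSelf().
    // The returned future completes with the name of the thread that ran the callback.
    CompletableFuture<String> saveInBackground() {
        CompletableFuture<String> callbackThread = new CompletableFuture<>();
        background.execute(() -> {
            // Slow I/O would happen here, off the actor thread.
            executeInSelf(() -> callbackThread.complete(Thread.currentThread().getName()));
        });
        return callbackThread;
    }

    @Override
    public void close() {
        background.shutdown();
        self.shutdown();
    }
}
```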
Robert Varga [Thu, 3 Apr 2025 05:13:12 +0000 (07:13 +0200)]
Ditch use of ByteSource
Rename InputStreamProvider to StreamSource with two specializations, based
on whether or not we know the size.
Allow conversion from StreamSource to SizedStreamSource, for the cost of
consuming the input stream once.
Also disconnect SnapshotSource from StreamSource: the former indicates
the format of the underlying InputStreams, while StreamSource provides
the streams themselves.
JIRA: CONTROLLER-2134
Change-Id: I214b990c9b9d573661de824cbb5d371bcd66ccb7
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
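A minimal sketch of the two specializations and the conversion between them, under the assumption that the unsized variant can only learn its size by reading the stream once. All type and method names below (StreamSourceSketch, SizedStreamSourceSketch, ByteArraySketch, toSized()) are hypothetical and need not match the real interfaces.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical shape: a source of streams whose size is not known up front.
interface StreamSourceSketch {
    InputStream openStream() throws IOException;

    // Conversion costs one full pass over the stream, after which the size is known.
    default SizedStreamSourceSketch toSized() throws IOException {
        try (InputStream in = openStream()) {
            return new ByteArraySketch(in.readAllBytes());
        }
    }
}

// Hypothetical specialization: same streams, plus a known size.
interface SizedStreamSourceSketch extends StreamSourceSketch {
    long size();

    // Already sized, so no stream needs to be consumed.
    @Override
    default SizedStreamSourceSketch toSized() {
        return this;
    }
}

// Hypothetical in-memory implementation backing the converted source.
record ByteArraySketch(byte[] bytes) implements SizedStreamSourceSketch {
    @Override
    public InputStream openStream() {
        return new ByteArrayInputStream(bytes);
    }

    @Override
    public long size() {
        return bytes.length;
    }
}
```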
Robert Varga [Wed, 2 Apr 2025 23:16:50 +0000 (01:16 +0200)]
Add cluster.raft.spi.EntryData et al.
We need to rein in ReplicatedLogEntry serialization, so that we can
operate without Serializable being in the picture.
Baseline semantics of a ReplicatedLogEntry is that it represents a step
in RAFT maintenance, affecting either the contents of StateSnapshot
or a RAFT server state transition.
Introduce cluster.raft.spi.EntryData as the unifying concept, with
RaftDelta representing RAFT server transitions and StateDelta
representing user state transitions.
This provides a natural place for Reader/Writer interfaces, which allow
us to co-locate serialization support within the delta contract.
JIRA: CONTROLLER-2137
Change-Id: Icbf3b32a9ca68c6b60709ea79b509b3078793ea9
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Thu, 3 Apr 2025 00:52:44 +0000 (02:52 +0200)]
Convert sal-akka-raft into a JPMS module
We need to use sealed classes across packages, for which we need
sal-akka-raft to be a full module.
This is a bit painful, as we need to:
- move test utils into o.o.c.cluster.raft
- open that package to Pekko
- add a ton of package-info.java files to cover all the exports
- retain the explicit DynamicImport-Package=* declaration
JIRA: CONTROLLER-2137
Change-Id: Ia013cac658c054815ba3d06764f13d61b2b953d7
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Wed, 2 Apr 2025 21:30:57 +0000 (23:30 +0200)]
Split out cluster.raft.spi.StateSnapshot
This extracts the trait of being a snapshot from being Serializable --
which is replaced via explicit StateSnapshot.{Reader,Writer} interfaces.
The newly-enabled SnapshotFile.readSnapshot() is a testament to our
ability to read/write state snapshots without involving Java
Serialization. Unfortunately only testing Payloads are taking advantage
of this right now, as ShardSnapshotCohort still requires
Object{Input,Output}Stream.
JIRA: CONTROLLER-2134
Change-Id: Ia3cfd570f6176d629749f779bb86caf01d656929
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Wed, 2 Apr 2025 11:43:08 +0000 (13:43 +0200)]
Add cluster.raft.spi.SnapshotFileFormat
This is a first cut at having a sensible file format for storing
snapshots. Unlike Pekko-based persistence, which stored the complete
contents of a Snapshot, we just store data that cannot be inferred
from the way things are laid out.
The following o.o.raft.spi constructs are introduced:
- InstallableSnapshot, a handle on byte state that can be transferred via
InstallSnapshot RPCs
- SizedStreamSource, the moral equivalent of Guava's ByteSource
- FileStreamSource, a SizedStreamSource served from a section of a file
JIRA: CONTROLLER-2134
Change-Id: I9743ec8b701badd12857e483acefb95709d9b68a
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Wed, 2 Apr 2025 19:41:33 +0000 (21:41 +0200)]
Fix CompressionSupport.NONE
We should be handing out a PlainSnapshotSource, not a LZ4-compressed
one.
JIRA: CONTROLLER-2134
Change-Id: I41fc06d7a28a5d35e084645612c0deb6f3145c13
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Wed, 2 Apr 2025 02:58:25 +0000 (04:58 +0200)]
Rename SnapshotFileFormat to CompressionSupport
The idea of having a single compressed stream will not work quite well.
Take a first step by separating what we have into CompressionSupport.
JIRA: CONTROLLER-2134
Change-Id: I2140b1fa84ebdd9f473b07f14bb673049614a941
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sun, 23 Mar 2025 15:03:18 +0000 (16:03 +0100)]
Propagate stateDir to LocalAccess
We will need the top-level directory for persistent RaftStorage outside
of Pekko persistence. Make sure we make it available from LocalAccess.
JIRA: CONTROLLER-2134
Change-Id: I7c8130cc89a024f8d276a988787d32e9b0c93b53
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Tue, 1 Apr 2025 15:53:09 +0000 (17:53 +0200)]
Move base.messages.CaptureSnapshot
This class is no longer being sent to actor, but rather is a plain
holder used mostly by SnapshotManager.
Move the class into SnapshotManager, hiding its constructor and dropping
the ControlMessage part.
Change-Id: I8bd223bf4b6c16f2a9afc5c52bdf30c096ecbeb8
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Tue, 1 Apr 2025 15:40:28 +0000 (17:40 +0200)]
Clean up Leader
Use local variables to squash nullness warnings and make
LeadershipTransferContext properly constant.
Change-Id: I50e48092d4f3d36b7ce8fee6df79cfa6c67317e2
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Tue, 1 Apr 2025 15:28:44 +0000 (17:28 +0200)]
Improve sal-akka-raft assertions
Use assertInstanceOf()/assertSame() instead of referencing the
behavior's role.
Change-Id: I58d65fea5676b95f76ca00b40d5a9e44a33d8dfe
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Tue, 25 Mar 2025 12:55:11 +0000 (13:55 +0100)]
Rewire SnapshotManager.captureToInstall()
Do not make a roundtrip through actor messages, but rather use the
facilities provided by RaftStorage to perform off-loaded serialization.
This eliminates the offload actor from sal-distributed-datastore, as
well as the need for CaptureSnapshotReply, as we now message completion
via executeInSelf() and callbacks.
JIRA: CONTROLLER-2134
Change-Id: I0518e7eb80559382ede5b4b4e7db41c66fd6a564
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Tue, 25 Mar 2025 07:31:16 +0000 (08:31 +0100)]
Add DataPersistenceProvider.streamToInstall()
This is the first step in having asynchronous access to a snapshot
bytestream: RaftStorage now exposes streamToInstall() method, which
writes out a snapshot in the background and invokes a callback once
that is completed.
JIRA: CONTROLLER-2134
Change-Id: Ic650dbf4b31c62135dbc7b28fa9682f1ef5bf824
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Tue, 1 Apr 2025 06:44:49 +0000 (08:44 +0200)]
Fix RaftStorage startup
PersistenceControl fails to start enabledStorage, leading to all sorts
of mayhem. Fix that up.
JIRA: CONTROLLER-2134
Change-Id: I54023b01f83c7657846253da8e1f505fc52584f6
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Tue, 1 Apr 2025 06:11:03 +0000 (08:11 +0200)]
Silence PropertiesTermInfoStore
Log NoSuchFileException at trace(), reducing clutter in tests.
JIRA: CONTROLLER-2133
Change-Id: I40c1855c957ad82e84f3a02fcedb1167845dc28d
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Mon, 31 Mar 2025 19:45:29 +0000 (21:45 +0200)]
Allow TestDataProvider's execution to be adjusted
We have a use case where we would like to delay execution of
DataProvider callback. Ditch the shared instance and allow the executor
to be set.
JIRA: CONTROLLER-2134
Change-Id: I0e99bb51eb549f6db3e27f5fdc600f6c3c97f097
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Mon, 31 Mar 2025 17:28:05 +0000 (19:28 +0200)]
Centralize MockRaftActorSnapshotCohort methods
We have 4 implementations doing the same thing. Use default methods to
reduce duplication.
JIRA: CONTROLLER-2134
Change-Id: I3969590e94b2d138ac665221d7d31b0f0ac3a064
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Mon, 31 Mar 2025 16:40:32 +0000 (18:40 +0200)]
Use ByteStateSnapshotCohort in SnapshotManagerTest
SnapshotManagerTest is dealing with ByteState mostly and
ByteStateSnapshotCohort can widely implement the deserializeSnapshot()
method.
Use ByteStateSnapshotCohort there, which will make further changes
easier.
JIRA: CONTROLLER-2134
Change-Id: I672f934ddaf31c8a5e2cfd5d17664fc8985c4858
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Fri, 28 Mar 2025 20:37:20 +0000 (21:37 +0100)]
Propagate start/stop to RaftStorage
RaftStorage needs to be an active component to deal with async
persistence, synchronizing persistent content, etc. This patch adds two
lifecycle hooks in RaftActor to control lifecycle.
JIRA: CONTROLLER-2134
Change-Id: I4e225d68628c696b4bcc80b9fbb04fd7d157ee2f
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Fri, 28 Mar 2025 19:47:59 +0000 (20:47 +0100)]
Add FileBackedOutputStream.Configuration
Encapsulate the two options and propagate them to RaftStorage for later
use.
JIRA: CONTROLLER-2134
Change-Id: I341b8ef3891f355c59167ba9235a960b7672bca6
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Fri, 28 Mar 2025 16:36:52 +0000 (17:36 +0100)]
Use Path instead of String for temp directory
Let's be type-safe, so that things do not get mixed up.
JIRA: CONTROLLER-2134
Change-Id: I8f5400213fc3cf424d61d83379d90fabb192d58e
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Fri, 28 Mar 2025 16:59:54 +0000 (17:59 +0100)]
Clean up sal-clustering-commons pom.xml
sal-clustering-commons no longer needs checker-qual, remove that
dependency. We can also enforce modernizer issues.
Change-Id: I587ae6df660f122c63c983d663cce0b6c1b3bab9
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Fri, 28 Mar 2025 10:10:14 +0000 (11:10 +0100)]
Introduce raft.spi.ByteArray
This patch replaces uses of ByteSource with InputStreamProvider, of
which ByteArray is a convenient implementation. This eliminates the
need to use yangtools.concepts.Either, as we are expressing the two
options via class hierarchy.
JIRA: CONTROLLER-2134
Change-Id: I249f572fe8be0e64de796fdf7d48cbf2ba9b95c3
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Thu, 27 Mar 2025 23:34:46 +0000 (00:34 +0100)]
Rehost controller.cluster.io
FileBackedOutputStream is used to receive AppendEntries, which
eventually land in the RAFT journal. No ODL downstream is using this
facility, so it is fair to say that this is part of RAFT SPI.
This allows us to change cds-access-api's dependency from
sal-clustering-commons to raft-api. This is quite natural for the
datastore: at some point we want to expose a CommitInfo which contains
an EntryInfo reference to when the transaction was committed.
JIRA: CONTROLLER-2134
Change-Id: I7458e17055affd56b736f85476b004cb29a2b1d2
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Wed, 26 Mar 2025 11:35:34 +0000 (12:35 +0100)]
Refactor SnapshotSource
Reduce the number of classes by introducing InputStreamProvider.
JIRA: CONTROLLER-2134
Change-Id: I261c714da7871aa8ed9e66524ee6783faaadec1f
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Tue, 25 Mar 2025 11:28:52 +0000 (12:28 +0100)]
Propagate SnapshotFileFormat to RaftStore
Each RaftStore needs to have a preferred file format. Hook it to
use-lz4-compression, hardcoding to 256KiB block size, just as we do when
we transfer to followers.
JIRA: CONTROLLER-1423
Change-Id: I7a59f386abc250fe7f813175650ad9374f4711f4
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Tue, 25 Mar 2025 10:25:46 +0000 (11:25 +0100)]
Remove InputOutputStreamFactory.lz4(String)
We do not have to use a String, just use the corresponding constant
directly. There is only one caller who can take ownership of the
corresponding code block.
Change-Id: Ie8bc8162f1cb47744e013fed1fee03f0af10fc03
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Tue, 25 Mar 2025 07:13:51 +0000 (08:13 +0100)]
Add RaftStorage.start()/stop()
Add internal thread pool and two methods to control it.
JIRA: CONTROLLER-2134
Change-Id: I87060447b86d7358f2f3cc5f9598168bd963058c
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Tue, 25 Mar 2025 05:46:20 +0000 (06:46 +0100)]
Improve ShardManagerInfo
Define TargetBehavior to offload type mapping to JMX.
Change-Id: If7d39639ae90aaca18807dfb116c795d5b047d18
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Tue, 25 Mar 2025 05:33:50 +0000 (06:33 +0100)]
Document default backup-datastore timeout
We have an implementation-specific default of 60 seconds, let's make
sure it is captured in the YANG model.
Change-Id: I66020e33c73c770cbed8bf9a6a0c592bd272c5b2
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Tue, 25 Mar 2025 06:34:23 +0000 (07:34 +0100)]
Move sal-akka-raft
We have a raft/ top-level directory, move sal-akka-raft there. Also
switch it from using mdsal-parent to using bundle-parent.
Change-Id: Idb597dafa423723443a5e6e344e9f2b06c8b1410
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Tue, 25 Mar 2025 03:00:07 +0000 (04:00 +0100)]
Rename findLatestSnapshot()
We really should be adding the interface to DataPersistenceProvider and
a 'tryLatestSnapshot()' is a better name.
JIRA: CONTROLLER-2134
Change-Id: I65af11051ff4cc473053186a7226ff6d052b98ac
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Tue, 25 Mar 2025 05:29:56 +0000 (06:29 +0100)]
Add SnapshotFileFormat
This is a useful utility, which we will use to build our
LocalRaftStorage.
Change-Id: I32027581f8eb55da435310aa35e0bbac7ab4436e
JIRA: CONTROLLER-2134
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Mon, 24 Mar 2025 15:40:53 +0000 (16:40 +0100)]
Add DataPersistenceProvider guidance
We will need to evolve this contract a bit, add more specific guidance.
JIRA: CONTROLLER-2134
Change-Id: Id1fcfb50fe96e7adfa8df089ac8dc9622240b770
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Tue, 25 Mar 2025 00:47:35 +0000 (01:47 +0100)]
Move TermInfo to o.o.raft.api
This is a natural raft-api thing. Move it there and cover it with tests.
Change-Id: If4dc97b144bedbbad81c21ba6fe795b24078e9eb
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Mon, 24 Mar 2025 23:37:59 +0000 (00:37 +0100)]
Promote (Immutable)RaftEntryMeta
These two constructs are raft.api material. Promote these two as:
- raft.api.EntryMeta as a replacement for RaftEntryMeta
- raft.api.EntryInfo as a replacement for ImmutableRaftEntryMeta
Change-Id: I93908e29f11ffad3342da7dc0f4a24678cecef35
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Mon, 24 Mar 2025 21:11:50 +0000 (22:11 +0100)]
Split out raft-spi
We have a nice set of classes that could be more widely used. Let's
split them out. This has the neat benefit of making lz4-java an
implementation detail hidden from the outside world.
We also introduce odl-lz4 to package lz4-java independently of
everything else.
Change-Id: I3434bac53dba6935c022e9ab79fc40eead01e40b
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Mon, 24 Mar 2025 18:24:58 +0000 (19:24 +0100)]
Split out raft-api
We now have RaftRole to start a low-level RAFT API artifact. Introduce
raft-api, which holds RaftRole and its corresponding ServerRole. Use it
to improve type safety of ShardStatsMXBean.
Change-Id: I99eda413a9cac4c7fbae036512432a97630ef28c
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Mon, 24 Mar 2025 13:45:00 +0000 (14:45 +0100)]
Alpha-sort sal-akka-raft dependencies
Previous reformat has missed these, fix it up.
Change-Id: I54961bf1e42415974235d8e50fd539dea633433d
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Mon, 24 Mar 2025 10:07:55 +0000 (11:07 +0100)]
Add SnapshotSource
We will need to deal with snapshots being available for reading in
multiple formats. This patch adds the concept of a SnapshotSource, with
two formats: plain and LZ4.
This ends up being a framework, but that is completely fine, as it is
transparent and provides a fixed amount of functionality.
We plug it into RaftStorage, as that is where it is going to be needed.
The two RaftStorage implementations run a no-op, with TODOs for later
implementation.
JIRA: CONTROLLER-2134
Change-Id: I99a8a967e6d08da681fa45082927b3039a523a49
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Mon, 24 Mar 2025 13:13:28 +0000 (14:13 +0100)]
Enforce sal-akka-raft dependencies
We have squeaky-clean dependencies, make sure it stays that way.
Change-Id: Icfedbdd3c037b35edbef87c70e4466b3da09d681
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Mon, 24 Mar 2025 13:10:11 +0000 (14:10 +0100)]
Remove broken Export-Package declaration
This is a day-zero bug: 'Export-Package' is mis-spelled as
'Export-package' and this is ineffective.
Remove the declaration and provide documentation for the remaining
DynamicImport-Package.
Change-Id: I807768a908747bc13bb51cea7fd7c4cb50f3fe6b
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Mon, 24 Mar 2025 13:06:48 +0000 (14:06 +0100)]
Remove slf4j-simple dependency
This dependency is always added by our parent pom, hence there is no
point in repeating it here. Also reformat pom.xml to follow style we use
in most places.
Change-Id: I13d071fd2799bfb2256e6c8831ddbe2fb4e08648
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Mon, 24 Mar 2025 10:54:07 +0000 (11:54 +0100)]
Require memberId() for RaftStorage
We need a consistent way of logging, make sure RaftStorage has it.
JIRA: CONTROLLER-2134
Change-Id: I67ca04c5c7e45358276775285ea05fd58fb7ce3e
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sun, 23 Mar 2025 18:50:08 +0000 (19:50 +0100)]
Factor out RaftStorage
We are in a place where we can start pulling the persistence apart. This
takes the first step by introducing RaftStorage and dropping a number of
FIXME for future evolution.
JIRA: CONTROLLER-2134
Change-Id: I658ce51cff971e39dc53b8f25e42123ad5d78b3e
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sun, 23 Mar 2025 15:43:31 +0000 (16:43 +0100)]
Reintroduce ForwardingDataPersistenceProvider
This is a neat utility which allows us to sit in front of persistence,
forwarding to it -- without exposing our real implementations.
While we are here, also improve testApplyStateRace by scheduling
callback invocation after the delegate is done with it.
JIRA: CONTROLLER-2134
Change-Id: I3fb4a423a1302a568bc5eecb77f26d2b5d444a4f
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sun, 23 Mar 2025 13:24:30 +0000 (14:24 +0100)]
Clean up OnDemandRaftState
Use RaftRole and fix the builder pattern to be properly immutable.
Change-Id: I4630a5d920d545ca615e72dd7a0a8a017e8b517a
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sun, 23 Mar 2025 05:34:10 +0000 (06:34 +0100)]
Promote RaftState
Let's start a new package, opendaylight.raft.api, which holds a single
enumeration for now -- RaftRole. The change in name frees up 'RaftState'
for use by something that actually has state.
This also shows us that we have a bug in Example actor -- using an
illegal cast in case of IsolatedLeader.
JIRA: CONTROLLER-2134
Change-Id: I50b25b4d13cf426285face63ed8be2a8356ab83d
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sun, 23 Mar 2025 04:19:48 +0000 (05:19 +0100)]
Move cluster.notifications
Let's rehost these into sal-akka-raft for now, as that is where they are
used from. This allows us to tie 'role' with 'RaftState', leading to
improved type safety.
JIRA: CONTROLLER-2134
Change-Id: Icb4968774d89517486c50bff95393e8c68b8265b
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sun, 23 Mar 2025 04:06:30 +0000 (05:06 +0100)]
Modernize RoleChanged/LeaderStateChanged
These are pure DTOs, use a record for that. While we're here, also make
sure to require memberId()/newRole(). The situation is a bit
complicated, because we use subclassing to carry more data.
JIRA: CONTROLLER-2134
Change-Id: I37a486dd75c1a313161908cfb287e61e18a5a401
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
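As a hedged illustration of a record-based DTO with mandatory components, the sketch below enforces non-null memberId and newRole in a compact constructor. RoleChangedSketch and its String component types are assumptions; since records are final, the sketch deliberately ignores the subclassing complication mentioned above.

```java
import static java.util.Objects.requireNonNull;

// Hypothetical DTO: component types are assumptions, the real messages may use richer types.
record RoleChangedSketch(String memberId, String newRole) {
    // Compact constructor: both components are mandatory.
    RoleChangedSketch {
        requireNonNull(memberId, "memberId");
        requireNonNull(newRole, "newRole");
    }
}
```

The record generates memberId() and newRole() accessors, equals(), hashCode() and toString(), which is most of what a pure DTO needs.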
Robert Varga [Sun, 23 Mar 2025 02:36:40 +0000 (03:36 +0100)]
Remove RaftActor.currentTerm()
This method is not used anywhere, remove it.
JIRA: CONTROLLER-2134
Change-Id: I1ce49c91655086470f8c083629051bf5c8dbef40
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sun, 23 Mar 2025 02:26:58 +0000 (03:26 +0100)]
Introduce resetReplicatedLog()
A ton of our tests rely on replacing ReplicatedLog. This is a
huge no-no, which is unfortunately exposed via RaftActorContext.
This patch splits off resetReplicatedLog(), which does the same thing,
except it has a different name and is amenable to being implemented
as a set of operations rather than a wholesale replacement.
JIRA: CONTROLLER-2137
Change-Id: I4774564b092fadbbd4f0f4a57f3d13e1234ec376
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sun, 23 Mar 2025 01:35:51 +0000 (02:35 +0100)]
Remove RaftActorContext.setCommitIndex()
Mass-migrate tests and eliminate the legacy method.
JIRA: CONTROLLER-2137
Change-Id: I6035aa1edf546d8d5ae519e5db3edd83ca632624
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sun, 23 Mar 2025 01:22:32 +0000 (02:22 +0100)]
Remove RaftActorContext.setLastApplied()
Mass-migrate tests and eliminate the legacy method.
JIRA: CONTROLLER-2137
Change-Id: I194b70975fe1cca97f3a81163410247d7a76e3c5
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sun, 23 Mar 2025 00:38:19 +0000 (01:38 +0100)]
Remove RaftActorContext compatibility getters
Mass-migrate tests to use ReplicatedLog for commitIndex/lastApplied.
JIRA: CONTROLLER-2137
Change-Id: I9f30b3afb0b7df05ef6238944f79fcba050e1c3d
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 22:15:18 +0000 (23:15 +0100)]
Do not reset ReplicatedLog
Replacing ReplicatedLog is a rather bad thing, as we need to be mindful
when we can and cannot cache it.
This patch takes the first step towards making it an invariant:
introduce a resetToSnapshot() method which resets the state and take
advantage of that.
We also deprecate setReplicatedLog() for removal.
JIRA: CONTROLLER-2137
Change-Id: Ifc9ce0b0910214d5d0f314f26623fc4583fd1849
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
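The difference between replacing the log and resetting it in place can be sketched as follows: with an in-place reset, anyone holding a reference to the log keeps seeing current state. ReplicatedLogSketch and the (index, term) signature of resetToSnapshot() are assumptions for illustration, not the real ReplicatedLog contract.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical log: the single instance is reset in place instead of being replaced.
final class ReplicatedLogSketch {
    private final List<String> entries = new ArrayList<>();
    private long snapshotIndex = -1;
    private long snapshotTerm = -1;

    void append(String command) {
        entries.add(command);
    }

    // Discards in-memory entries and adopts the snapshot's position; cached references stay valid.
    void resetToSnapshot(long index, long term) {
        entries.clear();
        snapshotIndex = index;
        snapshotTerm = term;
    }

    long snapshotIndex() {
        return snapshotIndex;
    }

    long snapshotTerm() {
        return snapshotTerm;
    }

    int size() {
        return entries.size();
    }
}
```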
Robert Varga [Sat, 22 Mar 2025 22:38:32 +0000 (23:38 +0100)]
Reduce use of ReplicatedLog.last()
We have a few call sites which are accessing the entry, whereas they
only need its metadata. Update them to not call last().
JIRA: CONTROLLER-2137
Change-Id: If13079a9a7d95cffb70a0befb877d8dcfb859fcd
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 20:34:36 +0000 (21:34 +0100)]
Clean up replicatedLog() references
This is a follow-up for the previous patch, cleaning up references to
deprecated methods and simplifying replicated log references.
JIRA: CONTROLLER-2137
Change-Id: Id95123119a11e7571f62c45a081fbb018a288f91
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 21:57:50 +0000 (22:57 +0100)]
Lock down AbstractReplicatedLog
Clean up this class and make most methods final.
Change-Id: I65eb72794131f32d3a4c524bbafd2ea1bf64726f
JIRA: CONTROLLER-2137
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 18:55:34 +0000 (19:55 +0100)]
Move commitIndex/lastApplied to ReplicatedLog
We have maintenance split between ReplicatedLog and RaftActorContext.
Move commitIndex and lastApplied to ReplicatedLog. This requires a bit
of shuffling in tests, as replacing the log would lose the changes made.
JIRA: CONTROLLER-2137
Change-Id: I6a6f4bb01abc2a8fb2da47110ae73c4d4ff1d9dc
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 18:26:50 +0000 (19:26 +0100)]
Reduce context.getCurrentBehavior() callers
RaftActor has a method to talk to RaftActorContext; use that instead of
direct calls.
JIRA: CONTROLLER-2134
Change-Id: Ib3ca80d6dacbb636f26f322ef99c5414a5a453e9
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 17:45:44 +0000 (18:45 +0100)]
Simplify AbstractReplicatedLog constructor
The constructor has a ton of arguments for a single caller, really. Move
the code into ReplicatedLogImpl.
JIRA: CONTROLLER-2134
Change-Id: Ia1184fe5d129ae72df6e777e6531371dd23b410a
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 17:17:19 +0000 (18:17 +0100)]
Shorten call to isRecoveryApplicable()
We do not need to go through RaftActorContext, as RaftActor has the
information available.
JIRA: CONTROLLER-2134
Change-Id: I0e7f71ad7819dddfe5b2e25cf2e77ded1161375d
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 17:01:52 +0000 (18:01 +0100)]
Update RaftActorContext.getCluster()
Use a nullable return instead of an optional.
Change-Id: I74f34c440e9364dee900b41fa39a96a3b31222c3
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 16:54:43 +0000 (17:54 +0100)]
Remove RaftActorContext.actorSelection()
This method is used only internally, as RaftActor can access its own
ActorContext. We also lock down a few methods and remove duplicate
implementation.
Change-Id: I08546d4f11bdb2ff62f6ab5c37957368b658c90b
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 14:20:05 +0000 (15:20 +0100)]
Move SendHeartBeat
This message is used internally by AbstractLeader, move it there for
clarity. Also clean up message dispatch.
JIRA: CONTROLLER-2134
Change-Id: Ie18c478c24136c09910e887382b683fa5ecd3105
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 14:01:48 +0000 (15:01 +0100)]
Hide SnapshotComplete
This is a purely internal message. Hide it in SnapshotManager to prevent
shenanigans.
JIRA: CONTROLLER-2134
Change-Id: Ia3fbd3d6a76cc4c4b49ea99c93fc8545a793918f
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 13:55:29 +0000 (14:55 +0100)]
Expose AbstractLeader.sendInstallSnapshot()
Do not abuse handleMessage() when dispatching from ShardManager to
AbstractLeader, but call the target method directly. This allows us to
completely hide SnapshotBytes -- and rename it back to SnapshotHolder.
JIRA: CONTROLLER-2136
Change-Id: I34962cc2060603ce1245581d2cd9727590f91e73
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 13:43:31 +0000 (14:43 +0100)]
Improve leader check
Comparing memberId() with getLeaderId() is really a check to see if we
are a leader. Replace it with an instanceof check.
JIRA: CONTROLLER-2134
Change-Id: Iabc79d7b80961588d1b449c1138063bf01f6c5a5
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 12:57:41 +0000 (13:57 +0100)]
Move SnapshotBytes propagation
We no longer need a Snapshot, hence we can move the propagation logic to
the single method which is invoking it.
JIRA: CONTROLLER-2134
Change-Id: I87544711d1a69825fa920f23c2893d58f757c753
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>
Robert Varga [Sat, 22 Mar 2025 12:48:24 +0000 (13:48 +0100)]
Refactor SendInstallSnapshot
SendInstallSnapshot contains a Snapshot, from which we only extract last
applied index/term to instantiate SnapshotHolder.
Merge SendInstallSnapshot and SnapshotHolder into a replacement class
called SnapshotBytes and adjust callers accordingly -- making the
contract simpler and easier to test.
JIRA: CONTROLLER-2134
Change-Id: I3d23176dda2322595679b4854cb8dabfb8d480f7
Signed-off-by: Robert Varga <robert.varga@pantheon.tech>