Cleanup 15mar25 #3986
base: cep-15-accord
Conversation
Co-authored-by: Ariel Weisberg <[email protected]>
Co-authored-by: Benedict Elliott Smith <[email protected]>
Introduce a special kind of non-durable read that provides only per-key linearizable isolation; i.e. strict-serializable isolation for single partition-key reads. This read creates only a happens-before edge, by collecting dependencies for execution and ensuring that execution happens strictly after these dependencies have executed, but at no precise time otherwise. So later writes may be witnessed, and if multiple keys are read they may represent different points in time. patch by Benedict; reviewed by Ariel Weisberg for CASSANDRA-19305
…endency elision patch by Benedict; reviewed by Aleksey Yeshchenko for CASSANDRA-19310
Patch by Blake Eggleston; Reviewed by Ariel Weisberg for CASSANDRA-19016
patch by David Capwell; reviewed by Benedict Elliott Smith for CASSANDRA-19355
…d uses the sync point empty txn for reads patch by David Capwell; reviewed by Blake Eggleston for CASSANDRA-19503
…tdown causing the test to fail patch by David Capwell; reviewed by Blake Eggleston for CASSANDRA-19514
Patch by Blake Eggleston; Reviewed by Ariel Weisberg and David Capwell for CASSANDRA-19472
patch by Caleb Rackliffe; reviewed by David Capwell for CASSANDRA-18987
patch by Benedict Elliott Smith, David Capwell; reviewed by Benedict Elliott Smith for CASSANDRA-19605
…uring deserialization
…ct issues earlier before they are seen in Cassandra patch by Benedict Elliott Smith, David Capwell; reviewed by Benedict Elliott Smith, David Capwell for CASSANDRA-19618
patch by Aleksey Yeschenko; reviewed by Alex Petrov and Benedict Elliott Smith for CASSANDRA-18888
patch by David Capwell; reviewed by Blake Eggleston for CASSANDRA-19642
patch by Caleb Rackliffe; reviewed by David Capwell and Ariel Weisberg for CASSANDRA-18732
Patch by Alex Petrov; reviewed by Benedict Elliott Smith for CASSANDRA-20393.
patch by Benedict Elliott Smith; reviewed by David Capwell for CASSANDRA-20420
…a NPE patch by David Capwell; reviewed by Benedict Elliott Smith for CASSANDRA-20417
Patch by Alex Petrov; reviewed by Benedict Elliott Smith for CASSANDRA-20347.
Patch by Alex Petrov; reviewed by Benedict Elliott Smith for CASSANDRA-20424.
… required topology changes patch by David Capwell; reviewed by Benedict Elliott Smith for CASSANDRA-20426
- Decouple command serialization from TableMetadata version; introduce ColumnMetadata ids; gracefully handle missing TableId
- DataInputPlus.readLeastSignificantBytes must truncate high bits
- Fix RandomPartitioner accord serialization
- Fast path stable commits must not override recovery propose/commit decisions regarding visibility of a transaction
- RejectBefore must mergeMax, not merge, to ensure we maintain epoch and hlc increasing independently
- Bad commitInvalidate decision
- consistent filtering for touches and stillTouches
- ensure TRUNCATE_BEFORE implies SHARD_APPLIED
- TopologyManager.unsyncedOnly off-by-one error
- DurabilityQueue should not retry SyncPointErased
- handle rare case of no deps but none needed
- not updating CFK synchronously on recovery, which can lead to erroneous recovery decisions for other transactions
- Don't return partial read response when one commandStore rejects the commit
- Filter touches/stillTouches consistently
- WaitingState computeLowEpoch must use hasTouched to handle historic key with no route

Improve:
- Use format parameters to defer building the Invariants.requireArgument string (see the sketch after this list)
- streamline RedundantStatus/RedundantBefore
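A minimal sketch of the deferred-formatting idea behind that Improve item (hypothetical helper; the actual Invariants.requireArgument signature in Accord may differ): the format string and arguments are only combined when the check fails, so the common success path does no string building at all.

import java.lang.IllegalArgumentException;

// Hypothetical sketch of deferred message formatting, in the spirit of
// Invariants.requireArgument; not the actual Accord API.
public final class Preconditions
{
    private Preconditions() {}

    // The format string and args are only combined on failure,
    // so no String.format cost is paid when the invariant holds.
    public static void requireArgument(boolean condition, String fmt, Object... args)
    {
        if (!condition)
            throw new IllegalArgumentException(String.format(fmt, args));
    }

    public static void main(String[] args)
    {
        int index = 3;
        // No string building happens here unless the check fails:
        requireArgument(index >= 0, "index must be non-negative, got %d", index);
    }
}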
Patch by Alex Petrov; reviewed by Benedict Elliott Smith for CASSANDRA-20114
…e changes without impacting C* messaging patch by David Capwell; reviewed by Aleksey Yeschenko, Alex Petrov for CASSANDRA-20403
…ioner not being set up. The first example is the empty range, so no update to the static partitioner is done, which leads to this NPE
- Accord Journal purging was disabled
- remove unique_id from schema keyspace
- avoid String.format in Compactor hot path
- avoid string concatenation on hot path; improve segment compactor partition build efficiency
- Partial compaction should update records in place to ensure truncation of discontiguous compactions does not lead to an incorrect field version being used
- StoreParticipants.touches behaviour for RX was erroneously modified; should touch all non-redundant ranges including those no longer owned
- SetShardDurable should correctly set DurableBefore Majority/Universal based on the Durability parameter
- fix erroneous prunedBefore invariant
- Journal compaction should not rewrite fields shadowed by a newer record
- Don't save updates to ERASED commands
- Simplify CommandChange.getFlags
- fix handling of Durability for Invalidated
- Don't use ApplyAt for GC_BEFORE with partial input, as there might be a saveStatus >= ApplyAtKnown but with executeAt < ApplyAtKnown

patch by Benedict; reviewed by Alex Petrov for CASSANDRA-20441
* Fix short accord simulation test (seed 0x6bea128ae851724b), ConcurrentModificationException
* Increase wait time during closing to avoid Unterminated threads
* Increase timeouts, improve test stability
* More descriptive output from CQL test
* Shorten max CMS delay
* Improve future handling in config service

Patch by Alex Petrov; reviewed by Benedict Elliott Smith for CASSANDRA-20440
import static java.util.concurrent.TimeUnit.NANOSECONDS;
import static org.apache.cassandra.concurrent.ExecutorFactory.Global.executorFactory;

public class LockWithAsyncSignalTest
Wouldn't this be better in the Simulator? There is only one real "assert" in the test, and it's

FBUtilities.waitOnFutures(consumers, 2L, TimeUnit.SECONDS);

and environmental issues can cause this not to complete within 2s, leading it to be flaky. If we wish to debug why the test failed, it wouldn't be reproducible.
The simulator is insufficient for fully testing this kind of construct, as it cannot simulate the memory model. The simulator does help validate systems like this, but we also need to run it as the hardware intended.

I think, as now designed, it would be fine to increase this timeout arbitrarily high though. It would be nice to further embellish the testing of these constructs, but this is a fairly good investment:reward tradeoff (and at the moment in particular, everything is about balance of investment).
That said, this is lightweight enough I would hope 2s wasn’t unreasonable for any environment. Have you seen timeouts?
Trying to fix the repeat job to work with unit tests and on accord... that will help collect data about how stable this test is.
$ junit_report_viewer.py
All tests successful
1000 iterations with weaker resources and it was stable, so I guess we can leave this alone for now.
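For reference, a minimal self-contained sketch of the pattern under discussion (hypothetical consumer tasks; the real test's setup differs), showing the timed wait that acts as the test's only assert:

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the test's structure: start consumer tasks, then
// fail if they do not all complete within the timeout. A generous bound means
// only a genuine hang, not a slow environment, trips the test.
public class WaitOnConsumersSketch
{
    public static void main(String[] args) throws Exception
    {
        ExecutorService executor = Executors.newFixedThreadPool(4);
        List<Future<?>> consumers = new ArrayList<>();
        for (int i = 0; i < 4; i++)
            consumers.add(executor.submit(() -> { /* consume work, contend on the lock */ }));

        // Analogue of FBUtilities.waitOnFutures(consumers, 2L, TimeUnit.SECONDS),
        // with an arbitrarily higher bound as suggested above.
        for (Future<?> f : consumers)
            f.get(30L, TimeUnit.SECONDS);

        executor.shutdown();
    }
}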
import static org.apache.cassandra.service.accord.AccordExecutor.Mode.RUN_WITH_LOCK;
import static org.apache.cassandra.utils.Clock.Global.nanoTime;

-class AccordExecutorInfiniteLoops implements Shutdownable
+class AccordExecutorLoops
Can you explain why you moved away from implements Shutdownable? During shutdown we no longer directly tell this class that we are shutting down, and we hope that each thread was told to shut down; this feels like it should cause stability issues during nodetool drain and jvm-dtest shutdown, no?
Looking closer, this class is only used by AccordExecutorAsyncSubmit, AccordExecutorSemiSyncSubmit, and AccordExecutorSyncSubmit, so this class really doesn't care about being shut down directly.
private final AtomicInteger running = new AtomicInteger();
private final Condition terminated = Condition.newOneTimeCondition();

public AccordExecutorLoops(Mode mode, int threads, IntFunction<String> name, Function<Mode, Runnable> tasks)
Given that this class no longer loops, this name feels off to me. We define that there are N threads, and we are done whenever all N finish running their task. This class doesn't really have anything to do with "looping", nor does it enforce that the sub-tasks are looped (the old logic did enforce this).
This is arbitrary, and to me it still makes perfect sense to call these threads “loops” since that’s what they do - we could, if you like, rename the loop provider from tasks to loops, if it helps your mental model.
public Object shutdownNow()
{
    shutdown();
    return null;
}
Isn't the semantics that we don't drop the backlog? Yet this abstract class looks to push that to each implementation. In the locking case we only check if it's shut down when we get a null task.
I’m afraid I don’t follow. This could really be changed, as we don’t implement Shutdownable, so we don’t need to return anything.
I would remove this method to avoid confusion. shutdownNow normally clears the backlog, so if you see this method you assume it would as well.
My comment above was mistaken (done from a phone on holiday, making assumptions about code location). This place does implement Shutdownable, so removing this means not implementing Shutdownable. The description for shutdownNow in Shutdownable makes clear it is up to the implementation whether shutdownNow actually does anything to stop ongoing tasks, and this is true even for ExecutorService (though the description there is 'best effort', it makes clear there are no guarantees, which is broadly equivalent to implementation-defined).

I don't terribly mind not implementing Shutdownable though, if you're strongly of this view.
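As a hedged illustration of the implementation-defined semantics described above (hypothetical names; this is not the actual Cassandra/Accord Shutdownable interface):

import java.util.Collections;

// Hypothetical sketch of a shutdownNow that, per its contract, makes no
// promise to stop in-flight work or clear the backlog.
interface ShutdownableSketch
{
    void shutdown();
    Object shutdownNow(); // stopping ongoing tasks is implementation-defined
}

class LoopsSketch implements ShutdownableSketch
{
    private volatile boolean shutdown;

    @Override
    public void shutdown()
    {
        shutdown = true;
    }

    @Override
    public Object shutdownNow()
    {
        // Deliberately identical to shutdown(): the backlog is not cleared,
        // mirroring the "no guarantees" reading of the contract.
        shutdown();
        return Collections.emptyList(); // nothing forcibly removed
    }
}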
@Override
public void shutdown()
{
    shutdown = true;
}
This isn't volatile, so how do we make sure the visibility is correct for all versions?
Here is what I see for AccordExecutorSyncSubmit:
- shutdown is called in a random thread during shutdown
- notifyWorkExclusive does hasWork.signal();
- the with-lock case only checks this once we have no work to do, which is after we access this field

So from a JMM point of view this doesn't look like it actually forces the visibility?
Also, we don't block new work once we shut down, so if the node is very busy this will keep going?
Visibility is enforced by other volatile accesses. The writer notifies work, as you correctly surmised (which involves a volatile write), and the readers are all guarded by volatile reads on the lock that we notify of work. The volatile keyword on this field would provide no guarantees on its own, and offers no additional guarantees in concert with the lock.

It is true that we don't reject new work, but this executor should be shut down after networking, and we should not really reject any work being produced by the local machine during shutdown.
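A minimal sketch of the piggybacked-visibility argument (hypothetical code, not the actual AccordExecutor): the plain write to the shutdown flag happens-before the lock release in the signal, and the worker's subsequent lock acquisition synchronizes with that release, so the flag is visible without being volatile.

import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch of visibility piggybacking on a lock: both threads
// synchronize on the same lock, so the plain write to 'shutdown' is visible
// to the worker even though the field is not volatile.
class ShutdownVisibilitySketch
{
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition hasWork = lock.newCondition();
    private boolean shutdown; // deliberately NOT volatile

    void shutdown()
    {
        shutdown = true; // plain write, ordered before the unlock below
        lock.lock();
        try
        {
            hasWork.signal();
        }
        finally
        {
            lock.unlock(); // release: publishes the write above
        }
    }

    void runLoop() throws InterruptedException
    {
        lock.lock(); // acquire: synchronizes-with the writer's unlock
        try
        {
            while (!shutdown) // plain read now observes shutdown == true
                hasWork.await();
        }
        finally
        {
            lock.unlock();
        }
    }
}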
Is there a reason this PR depends on apache/cassandra-accord#183? As far as I can tell they are unrelated?

Triggering CI to help validate the shutdown logic and the impact on CI. The shutdown semantics are the only thing really open, so I want to collect data on this.

Ran CI, and the shutdown logic is causing tests to fail; we don't close the cluster within 3m.

They are related in the sense that they each represent the latest batch of significant fixes for either side. It just so happens this time that the two sides do not actually intersect. I will update the description and remove the dependency.
Force-pushed from 7be53d5 to de9925f.
@@ -24,7 +24,7 @@
 import java.util.function.IntFunction;

 import accord.utils.Invariants;
-import org.agrona.collections.LongHashSet;
+import io.netty.util.collection.LongObjectHashMap;
Curious why you moved off agrona to netty's. Not against it, just wondering if there was a reason.
Mistake - intended only to switch to a map to avoid the unlikely scenario of a thread id being reused for a dead thread (close to impossible given how we use it, but safer). Will switch to the Agrona map version for consistency.
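A hedged sketch of the distinction being made (hypothetical usage, not the actual executor code): a set of thread ids cannot tell a live thread from a dead one whose id was reused, whereas an id-to-thread map can verify the registered thread itself.

import io.netty.util.collection.LongObjectHashMap;

// Hypothetical illustration: tracking threads by id alone (a set) is unsafe
// if an id is ever reused by a new thread; mapping id -> Thread lets us check
// that we are looking at the thread we actually registered.
class ThreadRegistrySketch
{
    private final LongObjectHashMap<Thread> threads = new LongObjectHashMap<>();

    synchronized void register(Thread thread)
    {
        threads.put(thread.getId(), thread);
    }

    synchronized boolean isRegistered(Thread thread)
    {
        // With a plain set of ids, a dead thread's reused id would be
        // indistinguishable from the original registration.
        return threads.get(thread.getId()) == thread;
    }
}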
-notifyWorkExclusive();
+notifyWork();
 lock.lock();
 try
Think this try/finally isn't needed?
Oh yes, it looks like a leftover; not sure how that happened.
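For context, a hedged sketch of the leftover shape being discussed (hypothetical; the surrounding method body isn't shown in this diff): once notifyWork() does its own locking and signalling, a lock/try/finally wrapper with nothing left inside is redundant.

import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration: if notifyWork() already locks and signals
// internally, the wrapper below protects nothing and can be removed.
class LeftoverLockSketch
{
    private final ReentrantLock lock = new ReentrantLock();

    void notifyWork()
    {
        // acquires the lock and signals waiting workers internally
    }

    void before()
    {
        notifyWork();
        lock.lock();
        try
        {
            // empty: leftover from the old notifyWorkExclusive() call site
        }
        finally
        {
            lock.unlock();
        }
    }

    void after()
    {
        notifyWork(); // redundant lock/try/finally removed
    }
}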