CASSANDRA-20147: Semaphore permit overflow in batch commit mode #4437
+3
−0
In batch commit mode, if multiple writes arrive during a single commitlog flush, the haveWork semaphore is released more times than it is acquired. If this happens often enough without an idle period, the permit count eventually overflows (after several weeks, in our case). So every time we acquire(), the permit count should be reset to 0. This is similar to how it worked in 3.0, but there are now more places where the semaphore is acquired. In theory this leaves a potential race, but only if about 2 billion writes (enough to exceed Integer.MAX_VALUE permits) arrive within a single commitlog flush interval.
Without this change, I believe the flusher loop would also keep running without waiting for some time once sustained high load gives way to an idle period, since each surplus permit lets one more acquire() return immediately.
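To illustrate the behaviour described above, here is a minimal standalone sketch using a plain `java.util.concurrent.Semaphore`. It is not the patch itself: the class and method names are hypothetical, and using `drainPermits()` as the reset is an assumption about one way to bring the count back to 0 after an acquire.

```java
import java.util.concurrent.Semaphore;

// Hypothetical stand-in for the flusher's haveWork semaphore; not Cassandra code.
class HaveWorkSketch {

    // Simulates `writes` releases during one flush, then a single acquire
    // with no reset. Returns the surplus permits left behind.
    static int leftoverWithoutReset(int writes) {
        Semaphore haveWork = new Semaphore(0);
        for (int i = 0; i < writes; i++) {
            haveWork.release();              // one release per arriving write
        }
        haveWork.acquireUninterruptibly();   // flusher wakes up once
        return haveWork.availablePermits();
    }

    // Same scenario, but the flusher discards any surplus permits right
    // after acquiring (assumed reset strategy).
    static int leftoverWithReset(int writes) {
        Semaphore haveWork = new Semaphore(0);
        for (int i = 0; i < writes; i++) {
            haveWork.release();
        }
        haveWork.acquireUninterruptibly();
        haveWork.drainPermits();             // reset the permit count to 0
        return haveWork.availablePermits();
    }

    // Shows what happens once the count reaches Integer.MAX_VALUE:
    // the next release() throws java.lang.Error.
    static boolean overflowsAtMax() {
        Semaphore haveWork = new Semaphore(Integer.MAX_VALUE);
        try {
            haveWork.release();
            return false;
        } catch (Error e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("surplus without reset: " + leftoverWithoutReset(5));
        System.out.println("surplus with reset:    " + leftoverWithReset(5));
        System.out.println("overflow at max:       " + overflowsAtMax());
    }
}
```

Without the reset, every flush that sees more than one write leaves surplus permits behind, so under sustained load the count only grows. Those same surplus permits are what would let later acquires return immediately during an idle period.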
[junit-timeout] OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
[junit-timeout] Testsuite: org.apache.cassandra.db.commitlog.BatchCommitLogTest-_jdk11
[junit-timeout] Testsuite: org.apache.cassandra.db.commitlog.BatchCommitLogTest-_jdk11 Tests run: 198, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 5.254 sec
patch by Elliott Sims ([email protected]); reviewed by TBD for CASSANDRA-20147
https://issues.apache.org/jira/browse/CASSANDRA-20147