-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CASSSIDECAR-210: Fix PeriodicTaskExecutor double execution due to race from reschedule #192
Conversation
…e from reschedule Patch by Yifan Cai; Reviewed by TBD for CASSSIDECAR-210
LOGGER.debug("Reschedule the task. task='{}' execCount={}", key, execCount); | ||
reschedule(periodicTask); | ||
promise.tryComplete(); | ||
LOGGER.debug("Rescheduling the task. task='{}' execCount={}", key, execCount); | ||
promise.tryComplete(Result.RESCHEDULED); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the fix. Do not reschedule, but just signal the decision and update the delay when scheduling a new run.
testRescheduleDecision(0, 1); | ||
} | ||
|
||
void testRescheduleDecision(long initialDelay, long delay) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the test that reproduce the bug.
Test fails with the earlier version of PeriodicTaskExecutor
, as there are flee tasks that keep on running.
Verify with the following 2 commands
git checkout upstream/trunk -- server/src/main/java/org/apache/cassandra/sidecar/tasks/PeriodicTaskExecutor.java
./gradlew :server:test --tests org.apache.cassandra.sidecar.tasks.PeriodicTaskExecutorTest.testRescheduleDecision
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll re-review this in the morning (out of time for now) but a few comments so far. The scheduling/rescheduling logic is getting a bit complicated but not sure if there's much to do about that at this point.
* Tells the run completion handler whether task is rescheduled. | ||
* If it is rescheduled, the delay is adjusted to the initial delay of the task | ||
*/ | ||
RESCHEDULED |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we have a Result that doesn't mean RESCHEDULED, like COMPLETED? The fact that we callPromise#tryComplete
with either RESCHEDULED or nothing at all (essentially, null
) seems really odd, even if we never really check for COMPLETED.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
COMPLETED
is unnecessary. When the handler at onComplete
is invoked, we know that it is completed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can probably rename the enum. It is not really the result of execution. Looking for suggestions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah - the biggest thing to me was that this enum is used as the type of an AsyncResult and we set the result to RESCHEDULED
or nothing at all, which was just jarring when I was looking at the code.
What do you think of actually using the ScheduleDecision
enum here, and just getting rid of the Result
all together? Then each branch of the executeInternal
method would actually be able to indicate which decision was made - we end up only using RESCHEDULE
right now but at least there isn't an odd mix of promise.tryComplete()
and promise.tryComplete(Result.RESCHEDULE)
which didn't make a lot of sense to me - if the promise is a Promise<Result>
it should really be completed with something other than null... using Promise<ScheduleDecision>
makes more sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep. ScheduleDecision
makes better sense.
server/src/main/java/org/apache/cassandra/sidecar/tasks/PeriodicTaskExecutor.java
Show resolved
Hide resolved
server/src/main/java/org/apache/cassandra/sidecar/tasks/PeriodicTaskExecutor.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 - thanks for the fix!
Patch by Yifan Cai; Reviewed by TBD for CASSSIDECAR-210