Run R tasks on the R thread #109
In recent commits I made this change to crates/ark/src/r_task.rs:
@@ -25,8 +25,8 @@ type SharedOption<T> = Arc<Mutex<Option<T>>>;
pub fn r_task<'env, F, T>(f: F) -> T
where
F: FnOnce() -> T,
- F: 'env + Send,
- T: 'env + Send,
+ F: 'env,
+ T: 'env,
{
// Escape hatch for unit tests
if unsafe { R_TASK_BYPASS } {
@@ -62,7 +62,7 @@ where
};
// Move `f` to heap and erase its lifetime
- let closure: Box<dyn FnOnce() + Send + 'env> = Box::new(closure);
+ let closure: Box<dyn FnOnce() + 'env> = Box::new(closure);
let closure: Box<dyn FnOnce() + Send + 'static> = unsafe { std::mem::transmute(closure) };
// Channel to communicate completion status of the task/closure
This goes further than I did this to make it possible to use types that are not
I think it is safe to send these objects to another thread because the calling thread is blocked while the task is running, so that we have a perfect delineation of control flow. Following this change I was able to remove the remaining occurrences of |
I'm now less convinced we'll be able to make it transparent that R accesses are made on the R thread by passing an |
hmm we should be safe regarding data races but this does allow sending objects that are sensitive to the thread they are running on. So we expose ourselves to the same sort of problems we ran into on the R side with reticulate and Shiny, only here it's with Rust objects. I think we're fine for now but I'll take another look at solving the |
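The argument in the comments above (the caller stays blocked while the task runs, so the `'env` and `Send` bounds can be erased) can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code from the PR: `spawn_worker` and `blocking_task` are hypothetical names, and the worker thread merely stands in for the R thread. As noted in the thread, this only guards against data races; it does not protect objects that are sensitive to thread identity or thread-local state.

```rust
use std::sync::mpsc::{channel, Sender};
use std::sync::{Arc, Mutex};
use std::thread;

// A task whose lifetime and `Send` bounds have been erased.
pub type Task = Box<dyn FnOnce() + Send + 'static>;

/// Hypothetical stand-in for the R thread: runs queued tasks one at a time.
pub fn spawn_worker() -> Sender<Task> {
    let (tx, rx) = channel::<Task>();
    thread::spawn(move || {
        for task in rx {
            task();
        }
    });
    tx
}

/// Run `f` on the worker thread and block until it has returned.
pub fn blocking_task<'env, F, T>(worker: &Sender<Task>, f: F) -> T
where
    F: FnOnce() -> T + 'env,
    T: 'env,
{
    let (done_tx, done_rx) = channel::<()>();
    let result: Arc<Mutex<Option<T>>> = Arc::new(Mutex::new(None));
    let slot = Arc::clone(&result);

    let closure = move || {
        let value = f();
        *slot.lock().unwrap() = Some(value);
        // Release our handle on the result before waking the caller.
        drop(slot);
        let _ = done_tx.send(());
    };

    // Move the closure to the heap, then erase its lifetime and `Send` bound.
    let closure: Box<dyn FnOnce() + 'env> = Box::new(closure);
    // SAFETY (assumption of this sketch): the caller does not return before
    // the task has completed, so borrows captured by `f` outlive the task and
    // are never touched by two threads at once.
    let closure: Task = unsafe { std::mem::transmute(closure) };

    worker.send(closure).expect("worker thread is gone");
    done_rx.recv().expect("task was dropped before completing");

    let value = result.lock().unwrap().take();
    value.expect("task did not store a result")
}
```

With this shape a closure may borrow from the caller's stack, or capture a non-`Send` value such as an `Rc`, because control only returns to the caller once the closure has finished.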
Lots of minor comments for now, I think the overall approach seems nice
crates/harp/src/exec.rs
Outdated
// This could be implemented with R interrupts but would require to
// unsafely jump over the Rust stack, unless we wrapped all R API functions
// to return an Option.
pub fn safely<'env, F, T>(f: F) -> T
`without_events_or_interrupts()` or `with_no_events_or_interrupts()` if you wanted a more expressive withr-like name. `safely()` is a little generic, I think.
I'm planning more safety features though, and then these names will be misleading. Hopefully we can eventually converge towards a couple of main safety operations and then associate a strong meaning with their name.
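For readers following the naming discussion, here is a rough sketch of what a scoped "run this closure with interrupts suspended" helper can look like. It assumes the helper works by flipping a flag for the duration of the closure; the thread-local `INTERRUPTS_SUSPENDED` flag and the `without_interrupts` name are hypothetical stand-ins, not the R API or the code in `crates/harp/src/exec.rs`.

```rust
use std::cell::Cell;

thread_local! {
    // Hypothetical stand-in for the flag that suspends interrupts in R.
    static INTERRUPTS_SUSPENDED: Cell<bool> = Cell::new(false);
}

/// Restores the previous flag value when dropped, including on unwinding.
struct SuspendGuard {
    previous: bool,
}

impl SuspendGuard {
    fn new() -> Self {
        // Set the flag and remember what it was so nesting works.
        let previous = INTERRUPTS_SUSPENDED.with(|flag| flag.replace(true));
        SuspendGuard { previous }
    }
}

impl Drop for SuspendGuard {
    fn drop(&mut self) {
        INTERRUPTS_SUSPENDED.with(|flag| flag.set(self.previous));
    }
}

/// Report whether interrupts are currently suspended on this thread.
pub fn interrupts_suspended() -> bool {
    INTERRUPTS_SUSPENDED.with(|flag| flag.get())
}

/// Run `f` with the flag set, restoring the previous state afterwards.
pub fn without_interrupts<F, T>(f: F) -> T
where
    F: FnOnce() -> T,
{
    let _guard = SuspendGuard::new();
    f()
}
```

A guard of this kind covers Rust panics, but it would not run if R longjmps over the Rust frame, which is the concern raised in the code comment under review.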
We now have I also refactored the |
I really like the new enum, the whole `(1, false)` thing was confusing to me
Approving what I've seen so far. I'll use the PR today.
I can look at your eventual changes to the tests too when you get there
crates/ark/src/interface.rs
Outdated
// A task woke us up, fulfill it and start next loop tick
// (this might run more tasks)
recv(self.tasks_rx) -> task => {
So, IIUC, now in the loop we:
- Run up to 3 tasks
- Start waiting for either console input or another task
- If we have console input, we process that
- If we hit a task, then we append it and then restart the loop, running up to 3 more tasks
So we are pretty aggressive about getting through the task queue, which seems reasonable I think
Actually it's a 50/50 chance whether we process a request or a task, which means we might run more than 3 tasks at a time from read-console, but changing that would require some convolutions so I went for the simpler code.
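The "run up to 3 tasks per tick" step discussed above can be sketched like this. It is only an illustration under stated assumptions: it uses `std::sync::mpsc` with `try_recv` rather than the `recv(...)` select arm shown in the PR, and `run_pending_tasks` is a hypothetical name. The blocking wait on console input or a new task would follow this step in the real loop.

```rust
use std::sync::mpsc::{Receiver, TryRecvError};

pub type Task = Box<dyn FnOnce() + Send>;

/// Run at most `max` queued tasks without blocking.
/// Returns how many tasks were actually run.
pub fn run_pending_tasks(tasks_rx: &Receiver<Task>, max: usize) -> usize {
    let mut ran = 0;
    while ran < max {
        match tasks_rx.try_recv() {
            Ok(task) => {
                task();
                ran += 1;
            }
            // Nothing queued right now, or all senders are gone:
            // stop and let the caller go back to waiting.
            Err(TryRecvError::Empty) | Err(TryRecvError::Disconnected) => break,
        }
    }
    ran
}
```

Capping the number of tasks per tick keeps a busy task queue from starving console input, at the cost of leaving some tasks for the next tick.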
Can't currently test `r_task()` because it requires a functioning `R_MAIN`
The control flow of a task is strictly delineated in such a way that the task is necessarily completed before control is returned to the caller. So it is safe to erase the `Send` requirement on the objects captured by the closure.
Since we relaxed the `Send` requirement on tasks
Co-authored-by: Davis Vaughan <[email protected]>
To avoid sending objects with implementations sensitive to thread state (id, thread-local storage, etc)
Requires dev tree-sitter which adds `Send` and `Sync` bounds to a bunch of types.
- Yield to auxiliary threads and R event loop at start of loop tick
- Wake up as soon as a task is available
LGTM!
// is sequential, objects captured by `f` might have implementations
// sensitive to some thread state (ID, thread-local storage, etc).

pub fn r_task<'env, F, T>(f: F) -> T
Can we document the required lifetime `'env` here as well?
Done in 2a1b228
// them on the R thread since that calls the R API.
unsafe impl Sync for RObject {}

// FIXME: This should only be Sync
unsafe impl Send for RObject {}
I think I saw this was handled on a follow-up PR?
yup this will be fixed soon
}
}

// Be defensive for the case an auxiliary thread runs a task before R is initialized
Could we safely initialize `R_MAIN` statically or at some known time before any other threads would be created?
I'll think about this while working on moving R to the main thread.
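One possible shape for the static initialization asked about above, sketched with `std::sync::OnceLock`. This is an assumption for illustration only: the `RMain` struct here is a placeholder with a single made-up field, and `initialize_r_main` and `r_main` are hypothetical names, not what the PR or the follow-up work does.

```rust
use std::sync::OnceLock;

/// Placeholder for the real main-thread state; the field is made up.
pub struct RMain {
    pub name: &'static str,
}

// Set exactly once, ideally before any auxiliary thread is spawned.
static R_MAIN: OnceLock<RMain> = OnceLock::new();

/// Returns `true` if this call performed the initialization,
/// `false` if `R_MAIN` had already been set.
pub fn initialize_r_main(name: &'static str) -> bool {
    R_MAIN.set(RMain { name }).is_ok()
}

/// Returns `None` if a thread asks for `R_MAIN` before initialization,
/// so callers can stay defensive instead of reading uninitialized state.
pub fn r_main() -> Option<&'static RMain> {
    R_MAIN.get()
}
```

The `Option` return keeps the defensive check from the code under review while making the uninitialized case explicit in the type.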
impl RTaskMain {
    pub fn fulfill(&mut self) {
        // Move closure here and call it
        self.closure.take().map(|closure| closure());
Do you think we need any extra defense against Rust panics, or R errors / longjmps, for tasks invoked here?
yup I plan to improve this while working on a timeout for R tasks.
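As a sketch of the Rust-panic half of that question, a `fulfill()` can wrap the call in `std::panic::catch_unwind`. This is one possible defense under stated assumptions, not what the PR implements: `TaskSlot` is a hypothetical stand-in for `RTaskMain`, and `catch_unwind` does nothing for R errors or longjmps, which need separate handling.

```rust
use std::panic::{catch_unwind, AssertUnwindSafe};

/// Hypothetical stand-in for the task holder discussed above.
pub struct TaskSlot {
    closure: Option<Box<dyn FnOnce() + Send + 'static>>,
}

impl TaskSlot {
    pub fn new<F: FnOnce() + Send + 'static>(f: F) -> Self {
        TaskSlot { closure: Some(Box::new(f)) }
    }

    /// Returns `Ok(true)` if the task ran, `Ok(false)` if it had already
    /// been taken, and `Err(message)` if the task panicked.
    pub fn fulfill(&mut self) -> Result<bool, String> {
        let closure = match self.closure.take() {
            Some(closure) => closure,
            None => return Ok(false),
        };
        // Catch the panic here so it does not unwind into the caller's loop.
        match catch_unwind(AssertUnwindSafe(move || closure())) {
            Ok(()) => Ok(true),
            Err(payload) => {
                // Panic payloads are usually `&str` or `String`.
                let message = if let Some(s) = payload.downcast_ref::<&str>() {
                    s.to_string()
                } else if let Some(s) = payload.downcast_ref::<String>() {
                    s.clone()
                } else {
                    "unknown panic payload".to_string()
                };
                Err(message)
            }
        }
    }
}
```

Returning the panic message lets the caller report the failure to whichever thread is waiting on the task instead of leaving it blocked.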
Addresses posit-dev/positron#1536
Addresses posit-dev/positron#1516
Addresses posit-dev/positron#431
Progress towards posit-dev/positron#1419
The tasks are run one by one at yield/interrupt time. I used a function rather than a macro because we discussed with @DavisVaughan the possibility of implementing his idea of passing the R API as a struct to the callback. This way in files implementing behaviour for auxiliary threads, we'd exclusively access the R API via this struct. In other files implementing behaviour for the main R thread, we could access the R API directly. This delineation will allow us to be more in control of safety.
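The idea of passing the R API as a struct to the callback could look roughly like the sketch below. Everything in it is hypothetical: `RApi`, `describe`, and `with_r_api` are illustrative names, the method is a placeholder that does not call into R, and the callback runs inline here, whereas in the PR the closure is dispatched to the R thread by `r_task()`.

```rust
use std::marker::PhantomData;

/// Hypothetical handle to the R API. The raw-pointer marker makes it
/// neither `Send` nor `Sync`, so it cannot leave the thread that created it.
pub struct RApi {
    _not_send: PhantomData<*const ()>,
}

impl RApi {
    /// Placeholder for a real R call; it only formats its input.
    pub fn describe(&self, code: &str) -> String {
        format!("would evaluate: {}", code)
    }
}

/// Give the callback access to the API handle. The handle is created
/// inside this function, so code on auxiliary threads can only reach
/// the API through a callback like this one.
pub fn with_r_api<F, T>(f: F) -> T
where
    F: FnOnce(&RApi) -> T,
{
    let api = RApi { _not_send: PhantomData };
    f(&api)
}
```

Because the handle is only ever lent to the callback by reference, the type system documents which code paths touch R, which is the delineation the description is after.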
TODO in further PRs:
- Still need to figure out how to send some LSP tasks to the main thread because tree-sitter objects are not Send/Sync. Hopefully we can chop up the tasks more finely. Until then Shiny apps might still crash. Edit: now done.
- Add timeout on R tasks. I think this will require longjumping over Rust stacks, but I'll provide some tools to make it possible to reduce the Rust context that will be jumped over.