Run R tasks on the R thread #109

Merged · 27 commits · Oct 12, 2023
Conversation

@lionel- (Contributor) commented Oct 9, 2023

Addresses posit-dev/positron#1536
Addresses posit-dev/positron#1516
Addresses posit-dev/positron#431
Progress towards posit-dev/positron#1419

The tasks are run one by one at yield/interrupt time. I used a function rather than a macro because @DavisVaughan and I discussed the possibility of implementing his idea of passing the R API as a struct to the callback. That way, files implementing behaviour for auxiliary threads would access the R API exclusively through this struct, while files implementing behaviour for the main R thread could keep accessing it directly. This delineation will give us more control over safety.
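A minimal sketch of the idea, with hypothetical names (`RApi`, `r_task_with_api()`, `protect()`) and assuming the libR-sys bindings for the FFI types:

```rust
use libR_sys::{Rf_protect, SEXP};

pub struct RApi {
    // Private token: an `RApi` can only ever be constructed on the R thread.
    _private: (),
}

impl RApi {
    pub fn protect(&self, x: SEXP) -> SEXP {
        // Direct access is fine here: by construction we're on the R thread.
        unsafe { Rf_protect(x) }
    }
}

pub fn r_task_with_api<F, T>(f: F) -> T
where
    F: FnOnce(&RApi) -> T + Send,
    T: Send,
{
    // `r_task()` (this PR) ships the closure to the R thread, where the
    // `RApi` token is created and handed to the callback.
    r_task(move || f(&RApi { _private: () }))
}
```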

TODO in further PRs:

  • Still need to figure out how to send some LSP tasks to the main thread, because tree-sitter objects are not Send/Sync. Hopefully we can chop up the tasks more finely. Until then, Shiny apps might still crash. Edit: now done.

  • Add a timeout on R tasks. I think this will require longjumping over Rust stacks, but I'll provide some tools to reduce the Rust context that gets jumped over.

@lionel- changed the title from "Run R tasks on the main thread" to "Run R tasks on the R thread" on Oct 9, 2023
@lionel- force-pushed the feature/single-thread branch from cfa7f62 to 9c1958d on October 9, 2023 12:11
@lionel- force-pushed the feature/single-thread branch 2 times, most recently from a9fd35e to d8f9d56 on October 10, 2023 12:14
@lionel- (Contributor, Author) commented Oct 10, 2023

In recent commits I made this change to r_task():

modified   crates/ark/src/r_task.rs
@@ -25,8 +25,8 @@ type SharedOption<T> = Arc<Mutex<Option<T>>>;
 pub fn r_task<'env, F, T>(f: F) -> T
 where
     F: FnOnce() -> T,
-    F: 'env + Send,
-    T: 'env + Send,
+    F: 'env,
+    T: 'env,
 {
     // Escape hatch for unit tests
     if unsafe { R_TASK_BYPASS } {
@@ -62,7 +62,7 @@ where
         };
 
         // Move `f` to heap and erase its lifetime
-        let closure: Box<dyn FnOnce() + Send + 'env> = Box::new(closure);
+        let closure: Box<dyn FnOnce() + 'env> = Box::new(closure);
         let closure: Box<dyn FnOnce() + Send + 'static> = unsafe { std::mem::transmute(closure) };
 
         // Channel to communicate completion status of the task/closure

This goes further than crossbeam::thread::ScopedThreadBuilder, from which r_task() is adapted. In addition to erasing the closure's lifetime and allowing it to be sent to another thread (because we guarantee safe storage for the duration of the call), we also erase the Send requirement on the variables captured by the closure.

I did this to make it possible to use types that are not Send inside the closure, which fixes this sort of compilation error:

rustc [E0277]: `*const c_void` cannot be shared between threads safely
within `completions::CompletionContext<'_>`, the trait `Sync` is not implemented for `*const c_void`
required for `&completions::CompletionContext<'_>` to implement `Send`
rustc [E0277]: `*const tree_sitter::ffi::TSTree` cannot be shared between threads safely
within `completions::CompletionContext<'_>`, the trait `Sync` is not implemented for `*const tree_sitter::ffi::TSTree`
required for `&completions::CompletionContext<'_>` to implement `Send`
rustc [E0277]: required because it's used within this closure

I think it is safe to send these objects to another thread because the calling thread is blocked while the task runs, so we have a perfect delineation of control flow.
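Simplified, the pattern looks like this; a sketch mirroring the diff above rather than the actual `r_task()` code, with an illustrative free function and an explicit channel parameter:

```rust
use crossbeam::channel;

pub fn run_on_r_thread<'env, F, T>(
    tasks_tx: &channel::Sender<Box<dyn FnOnce() + Send + 'static>>,
    f: F,
) -> T
where
    F: FnOnce() -> T + 'env,
    T: 'env,
{
    // Channel used to report completion and send back the result
    let (done_tx, done_rx) = channel::bounded(1);

    let closure = move || {
        done_tx.send(f()).unwrap();
    };

    // Erase the lifetime and `Send` requirements. This is safe because we
    // block below until the task has completed, so borrows captured by `f`
    // stay valid and the captured data is never accessed concurrently.
    let closure: Box<dyn FnOnce() + 'env> = Box::new(closure);
    let closure: Box<dyn FnOnce() + Send + 'static> =
        unsafe { std::mem::transmute(closure) };

    // Send the task to the R thread and block until it has run
    tasks_tx.send(closure).unwrap();
    done_rx.recv().unwrap()
}
```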

Following this change I was able to remove the remaining occurrences of r_lock! as well as all the locking machinery. All R API accesses are now executed from the R thread, which addresses posit-dev/positron#1516 and hopefully posit-dev/positron#431.

@lionel- (Contributor, Author) commented Oct 10, 2023

I'm now less convinced we'll be able to make it transparent that R accesses happen on the R thread by passing an RApi struct to the task closure. While this would work nicely with free functions, it wouldn't prevent accesses through trait methods, since those can't really be stored in a struct. Or do you see a way to make it work, @DavisVaughan?
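To illustrate the loophole with a sketch (hypothetical types; the unsafe wrapper stands in for any direct R API access):

```rust
// Hypothetical R API wrapper; stands in for e.g. an `Rf_eval()` call.
unsafe fn r_list_completions() -> Vec<String> {
    unimplemented!()
}

struct Completions;

trait Source {
    fn complete(&self) -> Vec<String>;
}

// The trait signature can't thread an `&RApi` argument through without
// changing the trait itself, so the struct-based discipline can't see
// or prevent this access, from whatever thread calls the method.
impl Source for Completions {
    fn complete(&self) -> Vec<String> {
        unsafe { r_list_completions() }
    }
}
```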

@lionel- (Contributor, Author) commented Oct 10, 2023

> I think it is safe to send these objects to another thread because the calling thread is blocked while the task runs, so we have a perfect delineation of control flow.

hmm, we should be safe regarding data races, but this does allow sending objects that are sensitive to the thread they run on. So we expose ourselves to the same sort of problems we ran into on the R side with reticulate and Shiny, only here it's with Rust objects. I think we're fine for now, but I'll take another look at solving the Send issues without relaxing the safety requirement.

@DavisVaughan (Contributor) left a comment:
Lots of minor comments for now; I think the overall approach seems nice.

// This could be implemented with R interrupts but that would require
// unsafely jumping over the Rust stack, unless we wrapped all R API
// functions to return an `Option`.
pub fn safely<'env, F, T>(f: F) -> T
@DavisVaughan (Contributor):

without_events_or_interrupts() or with_no_events_or_interrupts() if you want a more expressive withr-like name. safely() is a little generic, I think.

@lionel- (Author):

I'm planning more safety features though, and then these names would be misleading. Hopefully we can eventually converge on a couple of main safety operations and then associate a strong meaning with their names.
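For reference, a rough sketch of the withr-like shape suggested above; the actual suspension and restoration of event polling and interrupt checks isn't shown in this PR, so it is left as comments:

```rust
pub fn with_no_events_or_interrupts<F, T>(f: F) -> T
where
    F: FnOnce() -> T,
{
    struct Guard;

    impl Drop for Guard {
        fn drop(&mut self) {
            // Restore event polling and interrupt checks, even if `f`
            // panics and unwinds through this frame.
        }
    }

    // Suspend event polling and interrupt checks here.
    let _guard = Guard;
    f()
}
```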

@lionel- force-pushed the feature/single-thread branch from 90f076a to 1afd9c1 on October 11, 2023 08:43
@lionel- (Contributor, Author) commented Oct 11, 2023

> hmm, we should be safe regarding data races, but this does allow sending objects that are sensitive to the thread they run on. So we expose ourselves to the same sort of problems we ran into on the R side with reticulate and Shiny, only here it's with Rust objects. I think we're fine for now, but I'll take another look at solving the Send issues without relaxing the safety requirement.

We now have Send tasks again. This requires dev tree-sitter as well as this branch: r-lib/tree-sitter-r#59. More work is needed to ensure safety; see posit-dev/positron#1550.

I also refactored the read_console() loop to make control flow clearer (we now match on an evocative enum) and to wake up as soon as a task is available. Waking up this way fixes a performance issue introduced by switching from locks to tasks.
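The enum could look roughly like this; a hypothetical sketch, with illustrative names rather than the ones actually used in read_console():

```rust
type RTask = Box<dyn FnOnce() + Send + 'static>;

enum ConsoleWakeup {
    /// The frontend sent console input; hand it back to R.
    Input(String),
    /// A task woke us up; append it and start the next loop tick.
    Task(RTask),
    /// A channel disconnected; shut down.
    Disconnected,
}
```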

@lionel- force-pushed the feature/single-thread branch 2 times, most recently from 7d28d17 to d978606 on October 11, 2023 12:27
@DavisVaughan (Contributor) left a comment:

I really like the new enum; the whole (1, false) thing was confusing to me.

Approving what I've seen so far. I'll use the PR today.

I can look at your eventual changes to the tests too when you get there.

Comment on lines 671 to 675
// A task woke us up, fulfill it and start next loop tick
// (this might run more tasks)
recv(self.tasks_rx) -> task => {
@DavisVaughan (Contributor) commented Oct 11, 2023:
So, IIUC, now in the loop we:

  • Run up to 3 tasks
  • Start waiting for either console input or another task
    • If we have console input, we process that
    • If we hit a task, then we append it and then restart the loop, running up to 3 more tasks

So we are pretty aggressive about getting through the task queue, which seems reasonable, I think.

@lionel- (Author):

Actually, it's a 50/50 chance whether we process a request or a task, which means we might run more than 3 tasks at a time from read_console(). Changing that would require some convolutions, so I went for the simpler code.
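The waiting step, sketched with the ConsoleWakeup/RTask names from above (still illustrative): when both channels have a message ready, crossbeam's select! picks one of the ready operations at random, which is where the 50/50 comes from.

```rust
use crossbeam::channel::Receiver;
use crossbeam::select;

fn wait_for_work(
    input_rx: &Receiver<String>,
    tasks_rx: &Receiver<RTask>,
) -> ConsoleWakeup {
    select! {
        // Console input from the frontend
        recv(input_rx) -> msg => match msg {
            Ok(input) => ConsoleWakeup::Input(input),
            Err(_) => ConsoleWakeup::Disconnected,
        },
        // A task sent from an auxiliary thread
        recv(tasks_rx) -> task => match task {
            Ok(task) => ConsoleWakeup::Task(task),
            Err(_) => ConsoleWakeup::Disconnected,
        },
    }
}
```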

lionel- and others added 16 commits October 12, 2023 07:00. Among the commit messages:

  • The control flow of a task is strictly delineated in such a way that the task is necessarily completed before control is returned to the caller. So it is safe to erase the `Send` requirement on the objects captured by the closure.
  • Since we relaxed the `Send` requirement on tasks: to avoid sending objects with implementations sensitive to thread state (id, thread-local storage, etc). Requires dev tree-sitter, which adds `Send` and `Sync` bounds to a bunch of types.
  • Yield to auxiliary threads and R event loop at start of loop tick.
  • Wake up as soon as a task is available.
@kevinushey (Contributor) left a comment:
LGTM!

// is sequential, objects captured by `f` might have implementations
// sensitive to some thread state (ID, thread-local storage, etc).

pub fn r_task<'env, F, T>(f: F) -> T
@kevinushey (Contributor):
Can we document the required lifetime 'env here as well?

@lionel- (Author):

Done in 2a1b228

// them on the R thread since that calls the R API.
unsafe impl Sync for RObject {}

// FIXME: This should only be Sync
unsafe impl Send for RObject {}
@kevinushey (Contributor):
I think I saw this was handled on a follow-up PR?

@lionel- (Author):

yup, this will be fixed soon

}
}

// Be defensive in case an auxiliary thread runs a task before R is initialized
@kevinushey (Contributor):
Could we safely initialize R_MAIN statically or at some known time before any other threads would be created?

@lionel- (Author):
I'll think about this while working on moving R to the main thread.
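One possible shape, sketched with std::sync::OnceLock (hypothetical names, not what this PR does): set up the task channel at a known time, before any auxiliary thread is spawned, and fail loudly on out-of-order use instead of having to be defensive at task time.

```rust
use std::sync::OnceLock;

use crossbeam::channel::Sender;

type RTask = Box<dyn FnOnce() + Send + 'static>;

static TASKS_TX: OnceLock<Sender<RTask>> = OnceLock::new();

// Called once during kernel startup, before auxiliary threads exist
pub fn init_tasks(tx: Sender<RTask>) {
    TASKS_TX
        .set(tx)
        .expect("`init_tasks()` should only be called once");
}

// Panics instead of silently dropping a task if startup got the order wrong
pub fn tasks_tx() -> &'static Sender<RTask> {
    TASKS_TX
        .get()
        .expect("task channel must be initialized before spawning threads")
}
```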

impl RTaskMain {
pub fn fulfill(&mut self) {
// Move closure here and call it
self.closure.take().map(|closure| closure());
@kevinushey (Contributor):
Do you think we need any extra defense against Rust panics, or R errors / longjmps, for tasks invoked here?

@lionel- (Author):
yup, I plan to improve this while working on a timeout for R tasks.
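On the Rust side, one possible defense is to contain panics with std::panic::catch_unwind; a sketch, not part of this PR, assuming the log crate for the error report. R errors/longjmps would need separate treatment, for instance running the closure via R_ToplevelExec().

```rust
use std::panic::{catch_unwind, AssertUnwindSafe};

// Run a task so that a panic is contained to the task instead of
// unwinding across the R thread's stack.
fn fulfill_defensively(closure: Box<dyn FnOnce() + Send + 'static>) {
    if let Err(payload) = catch_unwind(AssertUnwindSafe(closure)) {
        log::error!("R task panicked: {payload:?}");
    }
}
```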
