Use cached build plan mostly instead of Cargo to schedule a multi-target build #462
Conversation
It works! At least partially 😄 Now, under `workspace_mode`, the process roughly looks as follows:
There's a minor caveat: with the current implementation it's possible for the user to quickly modify and save a file, resulting in an empty set of dirty files, which in turn creates an empty queue of compiler calls. For now, in this case, we just delegate to Cargo.

The current heuristic for detecting the set of dirty units based on modified files is as follows:
To correctly detect changes based on changed files, the compiler would have to return the set of files that are required to build a given package (if the module hierarchy isn't directly mapped to the file system, it depends on which files are actually pulled into the build).

Things to consider:
Force-pushed from e86a393 to ca6fcfe.
Depends on #466 for the compilation fix.
Todo:
Force-pushed from cefa149 to 54ccea6.
Introduced tracking of dirty files in the scope of builds in 9853702.
Tested it; it seems that the regular and modified-while-building scenarios are working correctly and dirty files are tracked correctly.

EDIT: Actually, a successful build can probably clear the dirty files on another thread while a build is being scheduled, so more files will be flagged to be rebuilt than necessary. While it shouldn't lead to bad behaviour, I'll still have to take a look at it. One solution would be to fetch the dirty files when a pending job is popped off the queue and processed (this guarantees that no job is being executed). This also has the benefit of not having to copy the dirty files hashmap every time a build is requested. However, this still has a data race: a build could be popped off the queue and prepared (collecting the dirty files with their versions, to later use them to clear the final dirty files), while the user still manages to edit a file and bump the version. It's not a major concern, as the only consequence is that the build doesn't clear the dirty state for that file, so it will be pulled into any subsequent build. In practice this should yield better performance, as it can overshoot only by one file, although it would do so more often. Additionally, the dirty files hashmap would only have to be copied when appropriate, not on every build request (possibly every keystroke?), but it's probably more spaghetti code this way.
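To make the idea above concrete, here's a minimal sketch of the version-tracking approach, assuming a per-file version counter; `DirtyFiles` and its methods are made up for illustration and are not the actual RLS types:

```rust
use std::collections::HashMap;
use std::path::PathBuf;

/// Hypothetical dirty-file tracker: path -> last seen version.
#[derive(Default)]
struct DirtyFiles {
    files: HashMap<PathBuf, u64>,
}

impl DirtyFiles {
    /// Called on every change notification: bump the file's version.
    fn mark_dirty(&mut self, path: PathBuf) {
        *self.files.entry(path).or_insert(0) += 1;
    }

    /// Called when a pending build job is popped off the queue: snapshot the
    /// dirty set (with versions) that this particular build will cover.
    fn snapshot(&self) -> HashMap<PathBuf, u64> {
        self.files.clone()
    }

    /// Called after the build succeeds: clear only the entries whose version
    /// matches the snapshot. A file edited mid-build has a newer version and
    /// therefore stays dirty for the next build.
    fn clear_built(&mut self, snapshot: &HashMap<PathBuf, u64>) {
        self.files
            .retain(|path, version| snapshot.get(path) != Some(&*version));
    }
}

fn main() {
    let mut dirty = DirtyFiles::default();
    dirty.mark_dirty(PathBuf::from("src/lib.rs"));
    dirty.mark_dirty(PathBuf::from("src/main.rs"));

    // A build job is dequeued: record what it covers.
    let snapshot = dirty.snapshot();

    // The user edits src/main.rs while the build is running.
    dirty.mark_dirty(PathBuf::from("src/main.rs"));

    // The build finishes: src/lib.rs is cleared, src/main.rs stays dirty.
    dirty.clear_built(&snapshot);
    assert!(dirty.files.contains_key(&PathBuf::from("src/main.rs")));
    assert!(!dirty.files.contains_key(&PathBuf::from("src/lib.rs")));
}
```

With this shape, a file edited mid-build simply stays dirty for the next build, which matches the "overshoot by at most one file" behaviour described above.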
Didn't 100% confirm this, but from what I tried while working on loading multiple analyses at the same time, it seems there may be two issues:
Are you saying that if we re-compile the bin, then we also have new data for the lib? Or that the data might be incompatible with the old data for the lib? Or are you saying that rls-analysis needs some way to distinguish different packages for the same crate? So we need to be able to save data for foo-lib as well as foo-bin?
Because of how the cached build plan works, under `workspace_mode` every member package must initially be forcefully rebuilt, so that the arguments Cargo would generate for each target are stored. This is required because otherwise only a certain member package would be rebuilt initially, and changes made to another one would then pull in a rebuild of a package whose rustc call wasn't cached. Furthermore, sometimes the user can quickly input changes and save before any build can be made. When this happens and there are no modified files, the build plan returns an empty job queue, since no files are dirty at that time. In that case the RLS delegates to Cargo, just as it does when build scripts are modified. In the future, detecting dirty files and determining which files relate to which rebuilt package is a concrete area for improvement.
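A rough sketch of the fallback described above; the names (`CachedBuildPlan`, `run_cargo_build`, `run_rustc`) are illustrative stand-ins for the real RLS internals, not its actual API:

```rust
use std::path::PathBuf;

/// Illustrative stand-in for a cached rustc call from the Cargo run.
struct RustcInvocation {
    package: String,
    args: Vec<String>,
}

struct CachedBuildPlan {
    invocations: Vec<RustcInvocation>,
}

impl CachedBuildPlan {
    /// Return the cached rustc calls considered dirty for the given modified
    /// files; empty if no files were dirty when the build was queued.
    fn prepare_work(&self, modified: &[PathBuf]) -> Vec<&RustcInvocation> {
        if modified.is_empty() {
            return Vec::new();
        }
        // A real implementation would map files to dirty units here; for the
        // sketch we just return every cached invocation.
        self.invocations.iter().collect()
    }
}

fn run_cargo_build() {
    println!("no dirty files recorded -> delegating to a full Cargo build");
}

fn run_rustc(job: &RustcInvocation) {
    println!("cached rustc call for `{}` ({} args)", job.package, job.args.len());
}

fn build(plan: &CachedBuildPlan, modified: &[PathBuf]) {
    let queue = plan.prepare_work(modified);
    if queue.is_empty() {
        // Empty job queue: fall back to Cargo.
        run_cargo_build();
    } else {
        for job in queue {
            run_rustc(job);
        }
    }
}

fn main() {
    let plan = CachedBuildPlan {
        invocations: vec![RustcInvocation {
            package: "member_a".to_owned(),
            args: vec!["--crate-type=lib".to_owned()],
        }],
    };
    build(&plan, &[]); // falls back to Cargo
    build(&plan, &[PathBuf::from("member_a/src/lib.rs")]); // uses cached calls
}
```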
Previously, either `rustc` was called on a single file or the whole `cargo` routine was executed to perform the build. Since the build plan cached from the `cargo` run is now used instead under `workspace_mode`, the RLS needs to feed it the files changed since the last build. Because the user can further modify files while the build is being performed, only the files that were modified prior to the build will be marked as clean.
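As an illustration of mapping changed files to rebuilt units, here's a hedged sketch that assumes each package can be identified by its root directory (the real heuristic may differ, and this assumption breaks for things like `#[path]` or `include!`):

```rust
use std::collections::HashSet;
use std::path::PathBuf;

/// Illustrative unit: a (package, target) pair. Not the actual RLS/Cargo
/// representation.
#[derive(Debug, PartialEq, Eq, Hash)]
struct Unit {
    package: String,
    target: String,
}

/// A unit is considered dirty if any modified file lies under its package
/// root directory.
fn dirty_units<'a>(
    packages: &'a [(Unit, PathBuf)],
    modified: &[PathBuf],
) -> HashSet<&'a Unit> {
    packages
        .iter()
        .filter(|(_, root)| modified.iter().any(|file| file.starts_with(root)))
        .map(|(unit, _)| unit)
        .collect()
}

fn main() {
    let packages = vec![
        (
            Unit { package: "foo".to_owned(), target: "lib".to_owned() },
            PathBuf::from("/ws/foo"),
        ),
        (
            Unit { package: "bar".to_owned(), target: "bin".to_owned() },
            PathBuf::from("/ws/bar"),
        ),
    ];
    let modified = vec![PathBuf::from("/ws/foo/src/lib.rs")];
    let dirty = dirty_units(&packages, &modified);
    assert_eq!(dirty.len(), 1); // only the `foo` lib target is dirty
}
```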
Just to answer here, the problem is that the analysis hosts the data in a per-crate manner, while we compile (crate, crate type) units, and for each of those we have a separate analysis.
Depending on a package means depending on its lib target, so I think it's good that we only save the data for the libs. In every other case (build scripts, bins) we'll build it and pass the data for those targets in-memory, so I think nothing here needs to change. The only thing that needs changing, I believe, is to support holding different analyses (one for each crate target) per crate.
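A sketch of what holding a separate analysis per crate target could look like; `TargetKind`, `AnalysisData`, and `AnalysisStore` are made-up names for illustration, not the rls-analysis API:

```rust
use std::collections::HashMap;

/// Illustrative target kinds.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
enum TargetKind {
    Lib,
    Bin,
    BuildScript,
}

/// Placeholder for a single unit's save-analysis data.
#[derive(Debug)]
struct AnalysisData;

/// Holds a separate analysis per (crate, target) unit.
#[derive(Default)]
struct AnalysisStore {
    per_unit: HashMap<(String, TargetKind), AnalysisData>,
}

impl AnalysisStore {
    fn insert(&mut self, krate: &str, kind: TargetKind, data: AnalysisData) {
        self.per_unit.insert((krate.to_owned(), kind), data);
    }

    /// Only lib targets can be depended upon by other packages, so only their
    /// data needs to be kept around between builds; bins and build scripts
    /// are consumed in-memory.
    fn persisted(&self) -> Vec<(&str, &AnalysisData)> {
        self.per_unit
            .iter()
            .filter(|((_, kind), _)| *kind == TargetKind::Lib)
            .map(|((name, _), data)| (name.as_str(), data))
            .collect()
    }
}

fn main() {
    let mut store = AnalysisStore::default();
    store.insert("foo", TargetKind::Lib, AnalysisData);
    store.insert("foo", TargetKind::Bin, AnalysisData);
    store.insert("foo", TargetKind::BuildScript, AnalysisData);
    // Only the (foo, Lib) entry is considered for persistence.
    assert_eq!(store.persisted().len(), 1);
}
```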
All looks good! Sorry it took so long to review.
WIP: This only needs plugging the logic into `build/mod.rs`, which is a relatively small amount of work, but I wanted to get feedback on whether things are going in a good direction.

@nrc could you take a look and tell me if the overall design is good and if there are any obvious mistakes? I tried to debug and test intermediate functions and behaviour using the RLS with different crates, and it works as intended, at least from what I tested on single-package projects.
This will probably still need some tweaking on the vfs side, as `vfs.get_changes().iter().map(|(k, _)| k.clone()).collect()` seems to retrieve not only dirty files but also disk-synced files that are merely open (which might fool the dirty-crate heuristics).
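One possible shape of the filtering, sketched without assuming anything about the rls-vfs API beyond having access to the open files' in-memory contents (the `(PathBuf, String)` input shape is an assumption for the sketch): keep only files whose buffered contents actually differ from what's on disk.

```rust
use std::fs;
use std::path::PathBuf;

/// Given files the VFS reports as changed (path + in-memory contents), keep
/// only those whose buffered contents differ from the on-disk contents.
fn truly_dirty(open_files: &[(PathBuf, String)]) -> Vec<PathBuf> {
    open_files
        .iter()
        .filter(|(path, contents)| {
            // A missing or unreadable on-disk file also counts as dirty.
            fs::read_to_string(path)
                .map(|on_disk| on_disk != *contents)
                .unwrap_or(true)
        })
        .map(|(path, _)| path.clone())
        .collect()
}

fn main() {
    let open_files = vec![(
        PathBuf::from("src/lib.rs"),
        String::from("fn main() {}\n"),
    )];
    println!("dirty: {:?}", truly_dirty(&open_files));
}
```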