Skip to content

Use smallest spanning node as the "starter" node in completions #805

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

jennybc
Copy link
Member

@jennybc jennybc commented May 14, 2025

Fixes #778

When we create a new DocumentContext, one of the pieces of data reflects some notion of the current node in the AST. Previously this was constructed as ast.root_node().find_closest_node_to_point() and I propose that we switch to ast.root_node().find_smallest_spanning_node() instead. In this PR, I:

  • Retain the previous notion of the current node under the name closest_node
  • Repurpose the node field to mean "the node that point is actually in"

This new definition of node is more favorable for completions (next I'll take #772 out of draft form, which will address #770), which is certainly the biggest user of DocumentContext, in terms of downstream code.

(The fact that redefining node has such a small impact on anything else below crates/ark/src/lsp/ is sort of telling. It feels like there's a lot of bespoke fiddling around with the AST and nodes tucked away in various corners that could potentially be centralized/deduplicated over time. But not today.)

I suspect there's broader refactoring that would make sense in node-finding in areas like signature help, hover, etc., but I don't want to open that can of worms at this time.
.to_string(),
")"
);
}
Copy link
Member Author

@jennybc jennybc May 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To make this test more vivid, here's a screenshot of being in this scenario and requesting completions on the empty 2nd line (can be done with Ctrl + Space):

Screenshot 2025-05-14 at 9 55 55 AM

That shows behaviour before this PR, where we think we're completing the ) node and, therefore, don't provide any R completions. (Whether you see "No suggestions" or these fallback word-based completions is a nondeterministic mystery to us at the moment. @DavisVaughan usually sees the former, @jennybc usually sees the latter.)

Here's the same, after this PR. The completions don't change (yet), but from the output channel you can see that now we realize we're not in a node, i.e. the smallest spanning node is the 'Program' node:

Screenshot 2025-05-14 at 9 57 37 AM

Here's a peek at what will be possible once this PR gets merged, once I add some version of #772. Now we will get real completions from R when completing on, e.g., an empty line.

Screenshot 2025-05-14 at 1 25 34 PM

@jennybc jennybc requested review from DavisVaughan and Copilot May 14, 2025 20:27
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refactors how the current AST node is determined in a DocumentContext. The changes switch from using find_closest_node_to_point for node determination to using find_smallest_spanning_node for the primary node (favoring completions) while preserving the old behavior in a new field (closest_node).

  • Refactored node lookup in DocumentContext
  • Updated hover handling to use closest_node
  • Added tests to verify the new behaviors

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
crates/ark/src/lsp/hover.rs Updated hover function to use closest_node instead of node
crates/ark/src/lsp/document_context.rs Refactored node lookup for improved completion behavior and added corresponding tests

Copy link
Contributor

@DavisVaughan DavisVaughan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I have mostly made peace with this. I'm a little nervous that we are going to find edge cases in completions that expected the old behavior (I'll list one below), but I tried this out for a good while and most of the usual completion moves work (namespace, function args, string file paths, special completions for Sys.getenv() and friends, etc).

I am comforted by the fact that we do have quite a lot of tests for the usual cases, all of which go through a "real" DocumentContext.

If we do get reports of broken behavior, we can probably fix them up as required.

Test cases we need to add

Here are a few tests we should add, related to a fix we need to do for call_node_position_type(), because it was written with the old form in mind.

Pressing ctrl+space here brings up the abort argument list

rlang::abort(@)

Pressing ctrl+space here brings up nothing

rlang::abort(
  @
)

Pressing ctrl+space here brings up the special env completion list

Sys.setenv(@)

but does not here

Sys.setenv(
  @
)

These are two separate completion paths, so we should add separate tests for these, but both have a root case of call_node_position_type() being broken, here is a test we can add for that as well in test_call_node_position_type:

        // After `(`, and on own line
        let (text, point) = point_from_cursor("fn(\n  @\n)");
        let document = Document::new(&text, None);
        let context = DocumentContext::new(&document, point, None);

        // On `main`, this passes
        assert_eq!(
            context.node.node_type(),
            NodeType::Anonymous(String::from("("))
        );

        // On this pr, `(`, doesn't contain the user's cursor, so instead it selects
        // the surrounding `Arguments` node
        // assert_eq!(context.node.node_type(), NodeType::Arguments);

        // On `main`, this passes, and this should also still pass on this PR, but currently does not
        assert_eq!(
            call_node_position_type(&context.node, context.point),
            CallNodePositionType::Name
        );

Some more historical background

When we started out, I was convinced that find_closest_node_to_point() would only ever find a node that "enclosed" the user's cursor, and I thought that if it did not do that, it must have been a bug.

I have now convinced myself that I was wrong, and find_closest_node_to_point() CAN choose a node that does not enclose the user's cursor by design. The actual guarantees of this function are more like:

Find the smallest node that is closest to the point in question. The node must be at or before the point in question. The node may be anonymous (i.e. an individual { token may be selected, rather than the entire { -> } named braced expression node).

The find_smallest_spanning_node() helper is more like:

Find the smallest node that encloses the point in question. For containment, bounds of [] are used, meaning that { 1 + 1 }@ will select the } token, @{ 1 + 1 } will select the { token, and { 1 + 1 @ } will select the whole { -> } node.

I think for many cases we really do want to include anonymous (i.e. non-named) nodes like the } and { tokens.


A particularly interesting PR to look at is:
#321

Where I added this test, confirming that find_closest_node_to_point() does not have to "enclose" the point
79f66e2

That PR was about the selection range feature, and ultimately I did not use find_closest_node_to_point() for it, because it was the wrong tool for the job.

I did end up using a native tree-sitter helper, named_descendant_for_point_range() instead. This is very similar to our find_smallest_spanning_node(), except for 2 key differences:

  • The tree-sitter version finds only named nodes, you'd have to use descendant_for_point_range() instead to also find anonymous ones
  • The tree-sitter version has bounds of [) rather than [] like we use, which I do think is an important difference for things like dplyr::@. I wrote some extra comments about this here
    // Checks only named nodes to find the smallest named node that contains
    // the point using the following definition of containment:
    // - `node.start_position() <= start`
    // - `node.end_position() > start`
    // - `node.end_position() >= end`
    // which reduces to this when you consider that for us `start == end == point`
    // - `node.start_position() <= point`
    // - `node.end_position() > point`
    // So, for example, `{ 1 + 1 }@` won't select the braces (we are past them) but
    // `@{ 1 + 1 }` will (we are about to enter them).
    let Some(node) = tree
    .root_node()
    .named_descendant_for_point_range(point, point)
    .

It would be nice to switch to something like descendant_for_point_range(), but I think we probably cannot because of these differences, but it is good to write them down somewhere for future reference.

@jennybc jennybc force-pushed the bugfix/nearest-enclosing-node branch from 52c0241 to 1dd4dd0 Compare May 16, 2025 19:55
@jennybc jennybc force-pushed the bugfix/nearest-enclosing-node branch from 1f10a33 to 046c131 Compare May 17, 2025 00:01

use crate::fixtures::point_from_cursor;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find tests that use this fixture much easier to read, so I've updated anything I touched to use this.

@@ -413,4 +413,39 @@ mod tests {
harp::parse_eval("remove(my_fun)", options.clone()).unwrap();
})
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here are some tests re: "the rlang::abort() case (using a base R function)".

@@ -302,18 +302,34 @@ mod tests {
let (text, point) = point_from_cursor("Sys.getenv(@)");
assert_has_ark_test_envvar_completion(text.as_str(), point);

// Inside the parentheses, multiline
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here I just interleaved the multiline scenarios inside the existing test. This is the Sys.getenv() custom completion case.

@@ -101,6 +101,7 @@ pub(super) enum CallNodePositionType {

pub(super) fn call_node_position_type(node: &Node, point: Point) -> CallNodePositionType {
match node.node_type() {
NodeType::Arguments => return CallNodePositionType::Name,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fixup for argument completions inside a call and custom completions such as Sys.getenv().

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Newly constructed document context may get the wrong node
2 participants