Use smallest spanning node as the "starter" node in completions #805

jennybc · 2025-05-14T20:14:35Z

Fixes #778

When we create a new DocumentContext, one of the pieces of data reflects some notion of the current node in the AST. Previously this was constructed as ast.root_node().find_closest_node_to_point() and I propose that we switch to ast.root_node().find_smallest_spanning_node() instead. In this PR, I:

Retain the previous notion of the current node under the name closest_node
Repurpose the node field to mean "the node that point is actually in"

This new definition of node is more favorable for completions, which are certainly the biggest downstream user of DocumentContext. (Next I'll take #772 out of draft form, which will address #770.)

(The fact that redefining node has such a small impact on anything else below crates/ark/src/lsp/ is sort of telling. It feels like there's a lot of bespoke fiddling around with the AST and nodes tucked away in various corners that could potentially be centralized/deduplicated over time. But not today.)

I suspect there's broader refactoring that would make sense in node-finding in areas like signature help, hover, etc., but I don't want to open that can of worms at this time.

jennybc · 2025-05-14T20:27:01Z

crates/ark/src/lsp/document_context.rs

+                .to_string(),
+            ")"
+        );
+    }


To make this test more vivid, here's a screenshot of being in this scenario and requesting completions on the empty 2nd line (can be done with Ctrl + Space):

That shows behaviour before this PR, where we think we're completing the ) node and, therefore, don't provide any R completions. (Whether you see "No suggestions" or these fallback word-based completions is a nondeterministic mystery to us at the moment. @DavisVaughan usually sees the former, @jennybc usually sees the latter.)

Here's the same, after this PR. The completions don't change (yet), but from the output channel you can see that now we realize we're not in a node, i.e. the smallest spanning node is the 'Program' node:

Here's a peek at what will be possible once this PR gets merged, once I add some version of #772. Now we will get real completions from R when completing on, e.g., an empty line.

Copilot

Pull Request Overview

This PR refactors how the current AST node is determined in a DocumentContext. The changes switch from using find_closest_node_to_point for node determination to using find_smallest_spanning_node for the primary node (favoring completions) while preserving the old behavior in a new field (closest_node).

Refactored node lookup in DocumentContext
Updated hover handling to use closest_node
Added tests to verify the new behaviors

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File	Description
crates/ark/src/lsp/hover.rs	Updated hover function to use closest_node instead of node
crates/ark/src/lsp/document_context.rs	Refactored node lookup for improved completion behavior and added corresponding tests

DavisVaughan

I think I have mostly made peace with this. I'm a little nervous that we are going to find edge cases in completions that expected the old behavior (I'll list one below), but I tried this out for a good while and most of the usual completion moves work (namespace, function args, string file paths, special completions for Sys.getenv() and friends, etc).

I am comforted by the fact that we do have quite a lot of tests for the usual cases, all of which go through a "real" DocumentContext.

If we do get reports of broken behavior, we can probably fix them up as required.

Test cases we need to add

Here are a few tests we should add, related to a fix we need to do for call_node_position_type(), because it was written with the old form in mind.

Pressing ctrl+space here brings up the abort argument list

rlang::abort(@)

Pressing ctrl+space here brings up nothing

rlang::abort(
  @
)

Pressing ctrl+space here brings up the special env completion list

Sys.setenv(@)

but does not here

Sys.setenv(
  @
)

These are two separate completion paths, so we should add separate tests for them, but both share a root cause: call_node_position_type() is broken. Here is a test we can add for that as well, in test_call_node_position_type:

        // After `(`, and on own line
        let (text, point) = point_from_cursor("fn(\n  @\n)");
        let document = Document::new(&text, None);
        let context = DocumentContext::new(&document, point, None);

        // On `main`, this passes
        assert_eq!(
            context.node.node_type(),
            NodeType::Anonymous(String::from("("))
        );

        // On this PR, `(` doesn't contain the user's cursor, so instead it selects
        // the surrounding `Arguments` node
        // assert_eq!(context.node.node_type(), NodeType::Arguments);

        // On `main`, this passes, and this should also still pass on this PR, but currently does not
        assert_eq!(
            call_node_position_type(&context.node, context.point),
            CallNodePositionType::Name
        );

Some more historical background

When we started out, I was convinced that find_closest_node_to_point() would only ever find a node that "enclosed" the user's cursor, and I thought that if it did not do that, it must have been a bug.

I have now convinced myself that I was wrong, and find_closest_node_to_point() CAN choose a node that does not enclose the user's cursor by design. The actual guarantees of this function are more like:

Find the smallest node that is closest to the point in question. The node must be at or before the point in question. The node may be anonymous (i.e. an individual { token may be selected, rather than the entire { -> } named braced expression node).

The find_smallest_spanning_node() helper is more like:

Find the smallest node that encloses the point in question. For containment, bounds of [] are used, meaning that { 1 + 1 }@ will select the } token, @{ 1 + 1 } will select the { token, and { 1 + 1 @ } will select the whole { -> } node.
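To make the contrast between the two helpers concrete, here is a minimal sketch on a toy tree. Everything in it is an assumption made for illustration: `ToyNode`, the byte-offset ranges, and both function bodies are stand-ins, not ark's or tree-sitter's actual code. The "closest" rule is modelled as "descend into the last child that starts at or before the point", which is only one reading of the description above.

```rust
// Toy model: each node covers a closed byte range [start, end].
// These are hypothetical stand-ins, not tree-sitter nodes.
struct ToyNode {
    kind: &'static str,
    start: usize,
    end: usize,
    children: Vec<ToyNode>,
}

impl ToyNode {
    fn leaf(kind: &'static str, start: usize, end: usize) -> Self {
        ToyNode { kind, start, end, children: Vec::new() }
    }
}

// Assumed "closest" rule: keep descending into the last child that
// starts at or before `point`. The result need not enclose `point`.
fn closest_node(node: &ToyNode, point: usize) -> &ToyNode {
    for child in node.children.iter().rev() {
        if child.start <= point {
            return closest_node(child, point);
        }
    }
    node
}

// Smallest node with `start <= point <= end` (closed bounds). When two
// siblings both touch `point`, this sketch simply takes the first one.
fn smallest_spanning_node(node: &ToyNode, point: usize) -> Option<&ToyNode> {
    if point < node.start || point > node.end {
        return None;
    }
    for child in &node.children {
        if let Some(found) = smallest_spanning_node(child, point) {
            return Some(found);
        }
    }
    Some(node)
}

// Hand-built tree for the text "fn(\n  \n)" (cursor marker removed):
// `fn` covers 0..2, `(` covers 2..3, `)` covers 7..8.
fn toy_call_tree() -> ToyNode {
    let arguments = ToyNode {
        kind: "arguments",
        start: 2,
        end: 8,
        children: vec![ToyNode::leaf("(", 2, 3), ToyNode::leaf(")", 7, 8)],
    };
    let call = ToyNode {
        kind: "call",
        start: 0,
        end: 8,
        children: vec![ToyNode::leaf("identifier", 0, 2), arguments],
    };
    ToyNode { kind: "program", start: 0, end: 8, children: vec![call] }
}
```

With the cursor at byte offset 6 (the indented empty line), `closest_node` in this model lands on the `(` token, while `smallest_spanning_node` lands on the `arguments` node, which mirrors the behaviour difference described in the test above.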

I think for many cases we really do want to include anonymous (i.e. non-named) nodes like the } and { tokens.

A particularly interesting PR to look at is:
#321

Where I added this test, confirming that find_closest_node_to_point() does not have to "enclose" the point
79f66e2

That PR was about the selection range feature, and ultimately I did not use find_closest_node_to_point() for it, because it was the wrong tool for the job.

I did end up using a native tree-sitter helper, named_descendant_for_point_range() instead. This is very similar to our find_smallest_spanning_node(), except for 2 key differences:

The tree-sitter version finds only named nodes, you'd have to use descendant_for_point_range() instead to also find anonymous ones

The tree-sitter version has bounds of [) rather than [] like we use, which I do think is an important difference for things like dplyr::@. I wrote some extra comments about this here

ark/crates/ark/src/lsp/selection_range.rs

Lines 35 to 47 in 5d504f3

    // Checks only named nodes to find the smallest named node that contains
    // the point using the following definition of containment:
    // - `node.start_position() <= start`
    // - `node.end_position() > start`
    // - `node.end_position() >= end`
    // which reduces to this when you consider that for us `start == end == point`
    // - `node.start_position() <= point`
    // - `node.end_position() > point`
    // So, for example, `{ 1 + 1 }@` won't select the braces (we are past them) but
    // `@{ 1 + 1 }` will (we are about to enter them).
    let Some(node) = tree
        .root_node()
        .named_descendant_for_point_range(point, point)

It would be nice to switch to something like descendant_for_point_range(), but I think we probably cannot because of these differences. Still, it is good to write them down somewhere for future reference.
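The bounds difference can be boiled down to two predicates. This is only a sketch under simplifying assumptions: `Span` and both functions are hypothetical names, and positions are plain byte offsets on a single line rather than tree-sitter points.

```rust
// Hypothetical span over byte offsets; not a tree-sitter type.
#[derive(Clone, Copy)]
struct Span {
    start: usize,
    end: usize,
}

// Closed bounds `[start, end]`, as described for find_smallest_spanning_node().
fn contains_closed(span: Span, point: usize) -> bool {
    span.start <= point && point <= span.end
}

// Half-open bounds `[start, end)`, matching the containment rule quoted
// in the selection_range.rs comment above.
fn contains_half_open(span: Span, point: usize) -> bool {
    span.start <= point && point < span.end
}
```

For `{ 1 + 1 }` (9 bytes, so a span of 0 to 9), a cursor at offset 9 is contained under the closed rule but not under the half-open rule, while a cursor at offset 0 is contained under both. The same applies to a cursor sitting directly after `dplyr::` (offset 7 on a span of 0 to 7): only the closed rule still counts it as inside.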

crates/ark/src/lsp/completions/sources/utils.rs

jennybc · 2025-05-17T00:11:45Z

crates/ark/src/lsp/completions/sources/composite/call.rs


+    use crate::fixtures::point_from_cursor;


I find tests that use this fixture much easier to read, so I've updated anything I touched to use this.
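For readers who have not seen the fixture, here is a rough stand-in showing the idea: a single `@` marks the cursor, and the helper returns the text without the marker plus the cursor position. This is a sketch, not the real `crate::fixtures::point_from_cursor`; the local `Point` struct, the byte-based column, and the panic on a missing marker are all assumptions.

```rust
// Stand-in for a tree-sitter style point (zero-based row and byte column).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Point {
    row: usize,
    column: usize,
}

// Hypothetical re-implementation: strips the first `@` from the input and
// reports where it was.
fn point_from_cursor(text_with_cursor: &str) -> (String, Point) {
    let offset = text_with_cursor
        .find('@')
        .expect("text must contain an `@` cursor marker");

    let before = &text_with_cursor[..offset];

    // Row is the number of newlines before the marker.
    let row = before.matches('\n').count();

    // Column is the byte distance from the start of the current line.
    let column = match before.rfind('\n') {
        Some(newline) => offset - newline - 1,
        None => offset,
    };

    // Rebuild the text without the marker.
    let mut text = String::with_capacity(text_with_cursor.len() - 1);
    text.push_str(before);
    text.push_str(&text_with_cursor[offset + 1..]);

    (text, Point { row, column })
}
```

Under these assumptions, `point_from_cursor("fn(\n  @\n)")` yields the text `"fn(\n  \n)"` with the point at row 1, column 2, which is why the multiline test cases read so directly.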

jennybc · 2025-05-17T00:13:16Z

crates/ark/src/lsp/completions/sources/composite/call.rs

@@ -413,4 +413,39 @@ mod tests {
            harp::parse_eval("remove(my_fun)", options.clone()).unwrap();
        })
    }
+


Here are some tests re: "the rlang::abort() case (using a base R function)".

jennybc · 2025-05-17T00:13:51Z

crates/ark/src/lsp/completions/sources/unique/custom.rs

@@ -302,18 +302,34 @@ mod tests {
            let (text, point) = point_from_cursor("Sys.getenv(@)");
            assert_has_ark_test_envvar_completion(text.as_str(), point);

+            // Inside the parentheses, multiline


Here I just interleaved the multiline scenarios inside the existing test. This is the Sys.getenv() custom completion case.

jennybc · 2025-05-17T00:14:54Z

crates/ark/src/lsp/completions/sources/utils.rs

@@ -101,6 +101,7 @@ pub(super) enum CallNodePositionType {

 pub(super) fn call_node_position_type(node: &Node, point: Point) -> CallNodePositionType {
    match node.node_type() {
+        NodeType::Arguments => return CallNodePositionType::Name,


The fixup for argument completions inside a call and custom completions such as Sys.getenv().
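As a rough illustration of why this one-line arm is enough, here is a toy version of the dispatch. The enums and the function are hypothetical simplifications: the real `call_node_position_type()` also takes the point and handles more node types, none of which is modelled here.

```rust
// Hypothetical, pared-down stand-ins for ark's NodeType and
// CallNodePositionType.
#[derive(Debug, PartialEq)]
enum ToyNodeType {
    Arguments,
    Anonymous(String),
    Other,
}

#[derive(Debug, PartialEq)]
enum ToyPositionType {
    Name,
    Unknown,
}

fn toy_call_node_position_type(node_type: &ToyNodeType) -> ToyPositionType {
    match node_type {
        // With the smallest spanning node, a cursor on its own line inside
        // `fn(\n  @\n)` starts from the `Arguments` node.
        ToyNodeType::Arguments => ToyPositionType::Name,
        // With the closest node (the behaviour on `main`), the same cursor
        // started from the `(` token.
        ToyNodeType::Anonymous(token) if token.as_str() == "(" => ToyPositionType::Name,
        // Everything else is out of scope for this sketch.
        _ => ToyPositionType::Unknown,
    }
}
```

Both starting nodes map to the same answer, so argument completions and the custom `Sys.getenv()` completions should see the same position type whether the cursor sits right after `(` or on its own line.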

Use smallest spanning node as the "starter" node in completions

2659aed



jennybc requested review from DavisVaughan and Copilot May 14, 2025 20:27

Copilot AI reviewed May 14, 2025


Use node_text()

82ed9e1

jennybc mentioned this pull request May 14, 2025

Be willing to complete in the face of emptiness #772

Open

jennybc added 2 commits May 16, 2025 12:31

Use point_from_cursor() in these tests

597c9a5

Adding a (currently failing) test for scenario identified by @DavisVaughan

3eba779

DavisVaughan approved these changes May 16, 2025


Anticipate being in an "arguments" node when completing inside a call

1dd4dd0

jennybc force-pushed the bugfix/nearest-enclosing-node branch from 52c0241 to 1dd4dd0 Compare May 16, 2025 19:55

DavisVaughan reviewed May 16, 2025


jennybc added 4 commits May 16, 2025 13:33

Assert the node type

6ab0bd2

Thoroughly test multiline custom completions with Sys.getenv()

e8ff62a

Use point_from_cursor() in these tests

42ee5be

Add tests for argument completion inside a multiline call

046c131

jennybc force-pushed the bugfix/nearest-enclosing-node branch from 1f10a33 to 046c131 Compare May 17, 2025 00:01

Feels like a better test name

816ba0a
