Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Only parse the initial file the first time it's opened #1722

Merged

Conversation

puremourning
Copy link
Member

@puremourning puremourning commented Dec 15, 2023

When the language_server_completer is first constructed, it's usually a FileReadyToParse callback. But at that time, we can't actually parse anything because we haven't done the initialization exchange with the server, so we queue this up for later.

The way this was implemenented was to register a "on file ready to parse" callback to update the server with the file contents at the time the completer was constructed. In practice, as the completer is usually constructed via a file ready to pase event, this actually happens immediately and we then shunt the request into "OnInitializeComplete" handlers, which are fired when the initialize exchange happens.

Why don't we just put something directly in OnInitializeComplete handlers? Well because in the constructor, we don't have the request_data - we only get that in the OnFileReadyToParse callback, so we have this jank. WCGW?

As a result of this dance, we actually introduced a subtle error. OnFileReadyToParse handlers are called every time we have a file parse event, and - we never remove this "parse the file contents as they were at the time of initialization" request from the "file ready to parse handlers" list. So every time we has a file parse event, we update the server with stale data, then immediately correct it! In practice, this isn't all that noticable, but it is braindead.

Simple solution is to add a "once" flag to the handler creating oneshot handlers, and clear such handlers after firing them (or converting them to initialize handlers). Jank to fix jank. It might be better to find a way to just remove these handler "lists" altogether, as there is only one other place in the codebase where one is registered, and we could always just pass request_data to constructor if we have it...


This change is Reviewable

Copy link
Member Author

@puremourning puremourning left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 2 LGTMs obtained


ycmd/completers/language_server/language_server_completer.py line 1918 at r1 (raw file):

        self._OnInitializeComplete( partial( handler,
                                             request_data = request_data ) )
        ClearOneshotHandlers()

no, do this outside the for loop, muppet

@puremourning puremourning force-pushed the fix-parsing-first-file-every-time branch from 70000e3 to 53ceb94 Compare December 15, 2023 21:45
Copy link
Collaborator

@bstaletic bstaletic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for taking so long to get to reviewing this.

we never remove this "parse the file contents as they were at the time of initialization"

Please tell me if I am getting this whole thing right.

Both ClangdCompleter and LanguageServerCompleter register a certain callback for after initialization handshake is done.
Those callbacks are accumulated in LanguageServerCompleter._on_file_ready_to_parse_handlers.
These are then processed in LanguageServerCompleter.OnFileReadyToParse.
But! If the handshake is not done, we put the handlers into _on_initialize_complete_handlers.
So, we can accumulate stuff in the _on_initialize_complete_handlers multiple times even though we only have a single parse request pending.

Currently, _on_file_ready_to_parse_handlers is never emptied, which this pull request is trying to solve.

If all that is correct (and I believe it is), then why are we introducing this once flag, instead of just treating both sources of _on_file_ready_to_parse callbacks as "one-shot"?

Reviewed 5 of 5 files at r2, all commit messages.
Reviewable status: 0 of 2 LGTMs obtained (waiting on @puremourning)


ycmd/tests/go/__init__.py line 49 at r2 (raw file):

                 BuildRequest(
                   filepath = main_file,
                   contents = GetTestFileContents( main_file ),

And why were these changes necessary? We should be able to handle this even without explicitly getting contents. Right?

Copy link
Collaborator

@bstaletic bstaletic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To answer my own question:

Clangd is not a true one shot callback, unlike the other one. It is a once-per-file callback and that optimization is handled in the callback itself.

So that part is fine.

Reviewable status: 0 of 2 LGTMs obtained (waiting on @puremourning)


ycmd/tests/go/__init__.py line 49 at r2 (raw file):

Previously, bstaletic (Boris Staletic) wrote…

And why were these changes necessary? We should be able to handle this even without explicitly getting contents. Right?

Okay, we do need to read the contents at some point... obviously!

But can we change BuildRequest to actually do something lik

contents = kwrgs.get( 'contents', GetTestFileContents( filepath ) )

?
That seems cleaner than having to remember to always pass contents explicitly.

Copy link
Member Author

@puremourning puremourning left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 2 LGTMs obtained (waiting on @bstaletic)


ycmd/tests/go/__init__.py line 49 at r2 (raw file):

Previously, bstaletic (Boris Staletic) wrote…

Okay, we do need to read the contents at some point... obviously!

But can we change BuildRequest to actually do something lik

contents = kwrgs.get( 'contents', GetTestFileContents( filepath ) )

?
That seems cleaner than having to remember to always pass contents explicitly.

We... could. But I fear that it will cause a billion other things to break.

I'll try it, and see how much it breaks. If it breaks the world, I don't think it's worth the effort to change tons of tests that "assume" its current behaviour.

@puremourning puremourning force-pushed the fix-parsing-first-file-every-time branch from 6778d57 to 1d221e4 Compare December 27, 2023 15:55
Copy link
Member Author

@puremourning puremourning left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 2 LGTMs obtained (waiting on @bstaletic)


ycmd/tests/go/__init__.py line 49 at r2 (raw file):

Previously, puremourning (Ben Jackson) wrote…

We... could. But I fear that it will cause a billion other things to break.

I'll try it, and see how much it breaks. If it breaks the world, I don't think it's worth the effort to change tons of tests that "assume" its current behaviour.

Akshually it works great. Done.

Copy link

codecov bot commented Dec 27, 2023

Codecov Report

Merging #1722 (3204179) into master (80a034d) will increase coverage by 0.00%.
The diff coverage is 100.00%.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #1722   +/-   ##
=======================================
  Coverage   95.45%   95.45%           
=======================================
  Files          83       83           
  Lines        8136     8141    +5     
  Branches      163      163           
=======================================
+ Hits         7766     7771    +5     
  Misses        320      320           
  Partials       50       50           

When the language_server_completer is first constructed, it's usually a
FileReadyToParse callback. But at that time, we can't actually parse
anything because we haven't done the initialization exchange with the
server, so we queue this up for later.

The way this was implemenented was to register a "on file ready to
parse" callback to update the server with the file contents at the time
the completer was constructed. In practice, as the completer is usually
constructed via a file ready to pase event, this actually happens
immediately and we then shunt the request into "OnInitializeComplete"
handlers, which are fired when the initialize exchange happens.

Why don't we just put something directly in OnInitializeComplete
handlers? Well because in the constructor, we don't have the
request_data - we only get that in the OnFileReadyToParse callback, so
we have this jank. WCGW?

As a result of this dance, we actually introduced a subtle error.
OnFileReadyToParse handlers are called _every time_ we have a file parse
event, and - we never _remove_ this "parse the file contents as they
were at the time of initialization" request from the "file ready to
parse handlers" list. So every time we has a file parse event, we update
the server with _stale_ data, then immediately correct it! In practice,
this isn't all that noticable, but it is braindead.

Simple solution is to add a "once" flag to the handler creating oneshot
handlers, and clear such handlers after firing them (or converting them
to initialize handlers). Jank to fix jank. It might be better to find a
way to just remove these handler "lists" altogether, as there is
only one other place in the codebase where one is registered, and we
could always just pass request_data to constructor if we have it...
The previous commit broke a _ton_ of tests. It turns out that the tests
were relying on the bug that it fixed. Previously our
"StartXServerInDirectory" utils were using a FileReadyToParse to
initialize the server, but without including the actual contents. We
were getting away with this because the previous bug meant that we tried
to refresh the file on every subsequent FileReadyToParse, but as its
contents were not included, we read the proper contents from the disk,
or closed it.

Now the bug is fixed, we no longer have redundant refreshes of the
"first" file opened, so all the tests were broken, having the "first"
file in the sever with empty contents. Solution is to include the actual
contents when starting the completer server.

In fact, given that it seems intuitive, we now just read file contents
in BuildRequest directly when not otherwise supplied. There's no point
trying that if a file path wasn't supplied though. We _could_ remove a
lot of manual ReadFile calls elsewhere, but that's beyond the scope of
this commit.
Perversely, in LSP completer, OnFileReadyToParse never parsed anything.
At least, not in the obvious way.

The original bug where we were doing redundant updates, ironically, was
ensuring that we (eventually) updated the server with file contents (at
some point in time) when we get OnFileReadyToParse.

But by fixing that, we now never actually used the data supplied in
OnFileReadyToParse, which the tests were relying heavily on. In
particular, the Java tests would update file contents using only
OnFileReadyToParse, which is a completely legitimate thing to do in our
API. And worked. Until it didn't.

Now it does again - we explicitly update server file state in
OnFileReadyToParse, which seems perfectly sensible.
@puremourning puremourning force-pushed the fix-parsing-first-file-every-time branch from 1d221e4 to 3204179 Compare December 27, 2023 20:12
Copy link
Collaborator

@bstaletic bstaletic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two minor comments, but :lgtm_strong: anyway.

Reviewed 4 of 4 files at r3, all commit messages.
Reviewable status: 1 of 2 LGTMs obtained (waiting on @puremourning)


ycmd/tests/test_utils.py line 93 at r3 (raw file):

    if 'contents' in kwargs
    else GetTestFileContents( filepath )
      if 'filepath' in kwargs

Do we need this? There's definitely nothing at /foo in CI or on my machine.


ycmd/tests/rust/__init__.py line 49 at r3 (raw file):

                 BuildRequest( filepath = os.path.join( directory,
                                                        'src',
                                                        'main.rs' ),

I'm guessing this also could have been reverted to fit on one line, but oh well.

Copy link
Member Author

@puremourning puremourning left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 1 of 5 files at r2, 4 of 4 files at r3, all commit messages.
Reviewable status: :shipit: complete! 2 of 2 LGTMs obtained (waiting on @puremourning)


ycmd/tests/test_utils.py line 93 at r3 (raw file):

Previously, bstaletic (Boris Staletic) wrote…

Do we need this? There's definitely nothing at /foo in CI or on my machine.

While /foo probably doesn't exist, trying to open it, throwing an exception, handling that exception and returning '' seems wasteful, as this is a very common case. It's also saying what it means, which I prefer.


ycmd/tests/rust/__init__.py line 49 at r3 (raw file):

Previously, bstaletic (Boris Staletic) wrote…

I'm guessing this also could have been reverted to fit on one line, but oh well.

meh, not gonna change it.

@puremourning puremourning added the Ship It! Manual override to merge a PR by maintainer label Dec 28, 2023
Copy link
Contributor

mergify bot commented Dec 28, 2023

Thanks for sending a PR!

1 similar comment
Copy link
Contributor

mergify bot commented Dec 28, 2023

Thanks for sending a PR!

@mergify mergify bot merged commit 47baf60 into ycm-core:master Dec 28, 2023
18 checks passed
@puremourning puremourning deleted the fix-parsing-first-file-every-time branch December 28, 2023 10:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Ship It! Manual override to merge a PR by maintainer
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants