Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow modification of source generated documents #77587

Conversation

davidwengier
Copy link
Member

@davidwengier davidwengier commented Mar 13, 2025

Part of dotnet/razor#10693
Razor side of this is dotnet/razor#11619 which I think makes us ready to internally dogfood cohosting.

This unblocks Razor completion and most of code actions in cohosting. It allows WithText and WithSyntaxRoot to work on source generated documents, creating a forked solution with frozen source generated documents. Solutions can be continually frozen without issue, and unfreezing puts them back to their original state. This also allows code actions to run on modified source generated documents, but only if they have been frozen.

As discussed there aren't any flags for this to only be possible from/for Razor.

Still investigating individual code actions that aren't working in Razor, but those will probably be a follow up. Enough of the Razor tests pass now that it proves the system generally works (see comment below).

@davidwengier davidwengier requested a review from a team as a code owner March 13, 2025 08:43
@dotnet-issue-labeler dotnet-issue-labeler bot added Area-IDE untriaged Issues and PRs which have not yet been triaged by a lead labels Mar 13, 2025
@@ -63,6 +63,19 @@ public static Document GetRequiredDocument(this Solution solution, DocumentId do
}

#if !CODE_STYLE
public static SourceGeneratedDocument GetRequiredSourceGeneratedDocumentForAlreadyGeneratedId(this Solution solution, DocumentId documentId)
Copy link
Member Author

@davidwengier davidwengier Mar 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've used this helper method in a few spots in code actions, based on the paths that the current Razor tests hit, and it works but it doesn't fill me with confidence. I wonder how you feel about making GetRequiredDocument above, which is what everything calls, also return source generated documents if they're already been generated? Or to be even more strict, if they're from the Razor generator?

It would certainly allow Razor to fall into the pit of success, but not sure if it allows too many others as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

eg, I just pushed a41720b (#77587) because that spot wasn't hit on any of the Razor tests I was running yesterday, but what hit on the next one in line.

Finding each specific GetRequiredDocument call that needs to change is certainly possible, but also a potentially drawn out process.

Copy link
Member Author

@davidwengier davidwengier Mar 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found another! Without updating GetRequiredDocument Razor gets a yellow bar in completion, and no tooltips, because of this line:
https://github.com/dotnet/roslyn/blob/main/src/Features/Core/Portable/Completion/Providers/AbstractRecommendationServiceBasedCompletionProvider.cs#L229

Again, changing GetRequiredDocument to allow returning already generated documents would solve the issue once and for all, without having to debug through every scenario in every feature.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The general exception is you call one of our GetRequiredDocumentAsync helpers which are allowed to return source generated documents. The basic reason is generated documents might be more expensive than regular ones (even if say we're just sending the contents back from the OOP process), so we expect them to be async. In the case you linked above you're async, so you can just call that method there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I'd expect this method to not exist.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The general exception is you call one of our GetRequiredDocumentAsync helpers which are allowed to return source generated documents

I guess I was trying to keep things like completion and code actions on the happy path of not causing generators to run, and only operating on frozen documents because we know those are the ones that a code fix actually changed. If this method was removed, and the callers used GetRequiredDocumentAsync, and processed all changed source generated documents, then you'd end up with the code action producing text edits for the generated documents due to changes that would simply happen anyway the next time the generator runs, if the source document updates.

@davidwengier
Copy link
Member Author

@CyrusNajmabadi for your thoughts

var document = changedSolution.GetRequiredDocument(documentId);
var document = documentId.IsSourceGenerated
? changedSolution.GetRequiredSourceGeneratedDocumentForAlreadyGeneratedId(documentId)
: changedSolution.GetRequiredDocument(documentId);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in this case, i wouldn't mind an optiona lparameter to GetRequiredDocument stating it wants thsi behavior. the default behavior should be to NOT get this i don't think. but jason may feel differently.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An optional parameter will certainly make the consumption easier, so I'll go ahead and do that. It doesn't prevent us needing to find each individual spot where we need to make the change though, so I would still argue for it to be the default behaviour, even if only for Razor source generated documents.


internal IEnumerable<DocumentId> GetChangedFrozenSourceGeneratedDocuments()
{
Contract.ThrowIfNull(_newSolution.CompilationState.FrozenSourceGeneratedDocumentStates);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how does the caller know this will be not null?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They just check the property: https://github.com/dotnet/roslyn/pull/77587/files#diff-c1082184909800b4a6c84cae62ea4ae37758d99f16723417e0f25d1262b527aaR188

Can make this return an empty enumerable instead, if you prefer. I was just making a NRT warning go away :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this should be an empty enumerable -- I'll delete my other comments but it looked really fishy that callers had to go check the implementation detail first.

{
// Source generated documents always do a full parse, and source generator authors are only allowed to provide
// strings, so we follow suit here and just take the text from the root.
sourceGeneratedUpdates.Add((identity, DateTime.UtcNow, doc.root.GetText(), doc.root));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

interesting... probably ok. might blow up later. not sure :D.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we just update this to be passing around the SourceGeneratedDocumentState, and you can just have this create a text source like we do in the regular document case?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or just make text optional too, and you fork either the tree or the text and exactly one is not-null?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO passing around SourceGeneratedStates would make the other callers of WithFrozenSourceGeneratedDocuments have to work a lot harder, and would end up having to do most of the work of that method here. Making source text optional could work, though it would just move this doc.root.GetText() into the WithFrozenSourceGeneratedDocuments, which isn't much of a difference IMO.

This was as nice as I could get it, but I did dream of a future where we had unions, because if we could do (SourceText or SyntaxNode), I could have unified a bunch of this code.

// which may not represent those inner freezes. Unclear. There may be bugs here.
var replacingItemTracker = (WithFrozenSourceGeneratedDocumentsCompilationTracker)existingTracker;
trackerMap[projectId] = replacingItemTracker.UnderlyingTracker;
// We unwind all the way, since we could have a chain of frozen source generated documents.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ensure there are tests for unwrappign multiple times.

and jsut for my understanding, this happens when you do multiple .WithXXX forks on a solution on the generated docs? like a fork off of a fork?

note: in that case, do you really need to do tracker wrapping, or can the second wrap peek into the first and make the changes it needs?

I ask because unbounded tracker wrapping is something we try to avoid. it wuold be nice if it were not possible (and WithFrozenSourceGeneratedDocumentsCompilationTracker could assert it was not passed a WithFrozenSourceGeneratedDocumentsCompilationTracker as the underlying tracker).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Qustion to Jason. do we need the two trackers anymore? i have found the subclassin there pretty confusing. would lovei f this functionality actually meant we could go back to a single tracker. perhaps with computed SG docs, or explicitly provided SG docs. Thoughts?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are tests that cover this, yes, including multiple forks and resetting all the way back.

Jason and I did talk briefly about the long chain of trackers, and my memory was not to worry about it yet. I just took a look though, and its pretty trivial, especially now that I've written the tests and have some (tiny!) understanding of how it works, so I've updated to not create chains.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, since the implementation cost was cheap enough, I'm happy with this.

Qustion to Jason. do we need the two trackers anymore?

Possibly not? The original motivation of the wrapping approach was we could easily freeze a specific document without impacting the underlying computation process of generators, since that was always in automatic mode -- i.e. a specific fork can ask for the document cheaply, but ask for any other document and you get the same computation that's already underway. I imagine with the balanced mode work there might be a different approach now that better fits into the new code (except I still haven't sat down to read the new approach so beats me what that might be...)

sourceText = await sourceGeneratedDocument2!.GetTextAsync();
Assert.Equal("// Change doc 2", sourceText.ToString());

solution = solution.WithoutFrozenSourceGeneratedDocuments();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

haven't looked deeply at this test, but can you ensure we have a test where we fork teh solution multiple times, then WithoutFrozen returns all the way to the beginning.

--

Jason, remind me why we need WithoutFrozen in the first place?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WithoutFrozen is only used in OOP sync -- when we sync a solution to OOP we want to make sure the same frozen document state is carried over. For reasons I don't remember, when we hold onto the last cached solution we hold on to one that potentially has frozen documents in it, so the start of the OOP sync is to unfreeze it and then re-freeze with whatever is coming along from the OOP sync request. Conceptually I have nothing against saying OOP wouldn't cache these in that case or something, but I don't recall the history why I didn't go down that path originally.

Copy link
Member

@CyrusNajmabadi CyrusNajmabadi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

signing off, as nothing is wrong here imo. however, there is potential cleanup, doc'ing, and testing that i think would be worthwhile to try or look into for this VERY COMPLEX SPACE :D

This also needs @jasonmalinowski sign off. I reminded him of this during standup today.

Copy link
Member

@CyrusNajmabadi CyrusNajmabadi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

signing off, as nothing is wrong here imo. however, there is potential cleanup, doc'ing, and testing that i think would be worthwhile to try or look into for this VERY COMPLEX SPACE :D

This also needs @jasonmalinowski sign off. I reminded him of this during standup today.

@CyrusNajmabadi
Copy link
Member

Question. Should this be dev17.15? or does it really need to go into main. This is risky imo.

@davidwengier
Copy link
Member Author

At this point I don't think I care if it goes in 17.956, if it can go somewhere.

@CyrusNajmabadi
Copy link
Member

i think we can get this into 17.956. We just may be retired by then...

@JoeRobich
Copy link
Member

Question. Should this be dev17.15? or does it really need to go into main. This is risky imo.

If this is intended for a near-term release in the C# extension then I don't believe 17.15 is an option. CC: @dibarbet who may have thoughts.

@dibarbet
Copy link
Member

If this is intended for a near-term release in the C# extension then I don't believe 17.15 is an option. CC: @dibarbet who may have thoughts.

We currently pull from main (17.14), but it is possible to start pulling from the 17.15 branch if necessary

@davidwengier
Copy link
Member Author

davidwengier commented Mar 28, 2025

Maybe I'm wrong in understanding Jan's email, but there is no difference after Monday, right? We're certainly in no need of this in the "near term" of "before Monday"

@CyrusNajmabadi
Copy link
Member

@jasonmalinowski please try to get to this soon. Thanks!

@@ -388,13 +388,33 @@ public Document WithSourceCodeKind(SourceCodeKind kind)
/// Creates a new instance of this document updated to have the text specified.
/// </summary>
public Document WithText(SourceText text)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels a bit weird this method isn't just virtual rather than having the base type explicitly deal with the source-generated case, but it's fine I guess.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you commented this before, and I think it's a good idea, but making this virtual would be a public API change that needs to go through that process, right? So a follow up would be a lot easier


var project = this.GetRequiredProject(documentId.ProjectId);

if (project.TryGetSourceGeneratedDocumentForAlreadyGeneratedId(documentId) is null)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still feels like there's a bit of a race here unless the belief is the only callers of this originate from a WithText() method. But maybe its's fine, since most of the callers are being checked in other places? Or does anything else bad happen if somebody tries to freeze a source generated document that they have an ID for but it's no longer present? Or with the other changes does that work now?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that would work now.

This method (and by extension it's callers) has the same requirement that TryGetSourceGeneratedDocumentForAlreadyGeneratedId has, which is if you want consistent results the onus is on the caller to make sure the document has been generated. Not sure I'd describe it as a race though. If some code calls this and gets unpredictable results, then I would argue that code was already the loser of whatever race existed elsewhere in their code, this method was just letting them know :)

}
}

if (FrozenSourceGeneratedDocumentStates is not null)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this point no harm; the only real reason it was kept nullable was because it made it clear that things couldn't be frozen twice, but that's now moot with this change.

Copy link
Member

@jasonmalinowski jasonmalinowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is functionally fine and overall quite clean. I imagine there's a few follow ups that could be done as a separate PR:

  1. We probably should have some tests around what happens if you call these With() methods when the source generated document hasn't been observed yet, to make sure the behavior matches what we expect (one way or the other.)
  2. As @CyrusNajmabadi mentioned too: there's that helper to take a list of document IDs and split them by source generated and not. There's two copies, but that can be refactored back down to one.
  3. Right now if you fork with a root we're realizing a full text for it at the point of the fork, when we have ways to defer that. I think that also cleans up step 2.
    1. Some extra comments to add which I flagged.
  4. Might be some detritus in the tests that got copy/pasted and doesn't apply here.

It might also be wise to merge this, then merge the change to make the frozen document list non-nullable, and then tackle this.

if (syntaxNode != null)
{
newGeneratedState = newGeneratedState.WithSyntaxRoot(syntaxNode);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still scratching my head here -- I'd expect the source text and tree are both nullable, but exactly one is non-null. Then there's a single call to either WithText() or WithSyntaxRoot() that matches which was nullable.

{
// Source generated documents always do a full parse, and source generator authors are only allowed to provide
// strings, so we follow suit here and just take the text from the root.
sourceGeneratedDocuments.Add((identity, DateTime.UtcNow, doc.root.GetText(), doc.root));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See other comment about where I wish both text and root were nullable and one would pass through: in this case then we'd defer this GetText() until the actual creation of the state, since I'd hope nobody is actually thinking this text is available at that point. And since this is calling GetText on SyntaxRoot rather than on the tree (which has a magic deferred implementation) we're realizing the text all at once.

Comment on lines +799 to +821
(ImmutableArray<(DocumentId, SourceText)>,
ImmutableArray<(SourceGeneratedDocumentIdentity, DateTime, SourceText, SyntaxNode?)>) GetOrdinaryAndSourceGeneratedDocuments()
{
if (!texts.Any(static t => t.documentId.IsSourceGenerated))
return (texts, []);

using var _1 = ArrayBuilder<(DocumentId, SourceText)>.GetInstance(capacity: texts.Length, out var ordinaryDocuments);
using var _2 = ArrayBuilder<(SourceGeneratedDocumentIdentity, DateTime, SourceText, SyntaxNode?)>.GetInstance(out var sourceGeneratedDocuments);
foreach (var doc in texts)
{
if (!doc.documentId.IsSourceGenerated)
{
ordinaryDocuments.Add(doc);
}
else if (TryGetSourceGeneratedDocumentStateForAlreadyGeneratedId(doc.documentId) is { Identity: var identity })
{
sourceGeneratedDocuments.Add((identity, DateTime.UtcNow, doc.text, null));
}
}

return (ordinaryDocuments.ToImmutableAndClear(), sourceGeneratedDocuments.ToImmutableAndClear());
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be able to extract out this helper and make it generic over (DocumentId, T) where T woudd be either SourceText or SyntaxNode, and then have it make the return array just with a few 'as' checks.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Waiting for unions 😛

Comment on lines +1001 to +1003
? oldDocumentState
: oldDocumentState.UpdateTree(root, mode));

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not your doing, but odd that this does this checking here instead of in UpdateTree...

if (documents.IsEmpty)
return this;

// We'll keep track if every document we're reusing is the exact same as the final generated output we already have
using var _ = ArrayBuilder<SourceGeneratedDocumentState>.GetInstance(documents.Length, out var documentStates);
foreach (var (documentIdentity, generationDateTime, sourceText) in documents)
using var _ = PooledDictionary<DocumentId, SourceGeneratedDocumentState>.GetInstance(out var documentStates);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than having a separate dictionary created here, it seems we could just reuse our existing dictionary object (or fill in an empty one if still null.) Then the loop calling TryAdd goes away, and the calls to .Add on this just become a call to SetItem on the dictionary. That way the number of updates is more in line with the number of documents being passed here, as opposed to us copying around the stuff that isn't changing. The check of "// If every document we looked at matched what we've already generated, we have nothing new to do" just becomes an identity check at that point.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what you mean by "existing dictionary". Do you mean FrozenSourceGeneratedDocumentStates? If so, thats not quite a dictionary. eg, looks like calling SetState won't add, it will throw if the id doesn't exist. Will leave this for later

this.SourceText,
this.LoadTextOptions,
newTreeSource,
this._lazyContentHash,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I guess this is fine because we're only calling this on a document that we already called WithText() on, so the checksum would have been updated. That still feels a bit icky, especially since I'd like to see us not have that text realization happen. But functionally this should be OK.

@davidwengier
Copy link
Member Author

Given #77896 has merged into release/17.15, I'll retarget this to that branch and fix merge conflicts. Apologies if this produces excessive notifications

…oGetUpdatedSourceGeneratedDocuments

# Conflicts:
#	src/LanguageServer/Protocol/Handler/CodeActions/CodeActionResolveHelper.cs
#	src/Workspaces/Core/Portable/Workspace/Solution/DocumentState.cs
#	src/Workspaces/Core/Portable/Workspace/Solution/SolutionCompilationState.cs
@davidwengier davidwengier changed the base branch from main to release/dev17.15 April 1, 2025 01:01
@davidwengier
Copy link
Member Author

We probably should have some tests around what happens if you call these With() methods when the source generated document hasn't been observed yet, to make sure the behavior matches what we expect (one way or the other.)

Added

@davidwengier davidwengier enabled auto-merge April 1, 2025 07:12
@davidwengier davidwengier merged commit e76382a into dotnet:release/dev17.15 Apr 1, 2025
25 checks passed
@davidwengier davidwengier deleted the TryingToGetUpdatedSourceGeneratedDocuments branch April 1, 2025 07:55
davidwengier added a commit to dotnet/razor that referenced this pull request Apr 12, 2025
Fixes #10693

Not a lot in the way of code changes here, its mostly just unskipping
tests. Those tests won't pass until
dotnet/roslyn#77587 is merged and available, but
this PR leaves only 5 tests skipped in cohosting, which probably puts us
in a position to internally dogfood.
davidwengier added a commit to dotnet/razor that referenced this pull request Apr 12, 2025
First two commits are from #11619,
and like that PR this won't merge (or pass tests) until
dotnet/roslyn#77587 merges and is available, but
the code won't need to change, and I've manually tested and everything
works, and all tests pass on my machine. The power of "Trust me bro!"

Fixes #11101
Fixes #11138
Fixes #10697
Fixes #10947

I managed to stop myself doing anything other than porting resolve, but
I learned a lot doing it, most of which horrified me, so will hopefully
be able to do more cleanup later. Would be nice if we got to a state
where I didn't have to keep saying "this doesn't build yet" first though
:P

Commit at a time review will make it easier, but I admit I didn't do a
great job here, and the 2nd last commit is definitely in "draw the rest
of the owl" territory. This code was written, ported to a branch, then
ported to another branch, then written more and by two different people,
so it didn't stand much of a chance I'm afraid.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-IDE untriaged Issues and PRs which have not yet been triaged by a lead VSCode
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants