Skip to content

C#: Automatically use configured private registry feeds #18850

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 21 commits into from
Mar 27, 2025

Conversation

mbg
Copy link
Member

@mbg mbg commented Feb 24, 2025

When private package registries are configured for Default Setup, we initialise the Dependabot proxy to provide an authenticated proxy for those registries. As of #18029, the C# extractor is able to make use of the authenticated Dependabot proxy to connect to those private package registries.

However, in some scenarios, the address of the private package registries is not actually stored in the nuget.config files for a project. Since we only restore dependencies from feeds that are configured in nuget.config files, we are unable to restore dependencies from private registries in this case. Ordinary CI workflows for such projects rely on injecting the feed addresses into the configuration.

This PR modifies the NugetPackageRestorer to "inject" any private package registry feeds that are configured through the UI into the package restore process by passing them to dotnet on the command-line. This allows private package registries to be used without needing the address to be stored in the repository.

The CodeQL Action that initialises the Dependabot proxy provides the private registry feed URLs as an output as of github/codeql-action#2652 (and annotates them with their type as of github/codeql-action#2672). A future version of the Default Setup workflow will propagate that output to the analyze step / CodeQL extractors.

Approach

The approach I chose for this implementation is to detect when the CODEQL_PROXY_URLS environment is set, extract nuget feeds from the JSON array that is expected to be stored in it, and then pass those feed URLs, along with others that we discovered in the nuget.config files, to dotnet on the command-line.

Using the command-line has the advantage that we do not need to modify (and later clean-up) configuration files on disk, but the disadvantage that dotnet will only use feeds given to it on the command-line and ignore those in configuration files entirely. This means that we must provide all feeds that we want to use on the command-line.

To minimise the risk of breaking what was already working before, nuget feeds are only passed to dotnet on the command-line when the Dependabot proxy is configured (i.e. when private package registries are being used). If not, then the old behaviour of implicitly using nuget.config files continues to be used.

@mbg mbg self-assigned this Feb 24, 2025
@github-actions github-actions bot added the C# label Feb 24, 2025
@mbg mbg force-pushed the mbg/csharp/inject-proxy-urls branch from 379f635 to a8dde15 Compare March 14, 2025 13:47
@mbg mbg marked this pull request as ready for review March 14, 2025 14:37
@mbg mbg requested a review from a team as a code owner March 14, 2025 14:37
/// <summary>
/// Represents configurations for package registries.
/// </summary>
public struct RegistryConfig
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any particular reason/benefit to declare this as a struct?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No particular reason. I thought I was mirroring a similar setup elsewhere in the codebase, but happy to change it.

// The value of the environment variable should be a JSON array of objects, such as:
// [ { "type": "nuget_feed", "url": "https://nuget.pkg.github.com/org/index.json" } ]
var array = JsonConvert.DeserializeObject<List<RegistryConfig>>(registryURLs);
if (array != null)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we expect the array to be null? Should we log if it is null?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main reason for the check is just to make sure we don't work with a null value. We could log it, but I have no strong view on it either way.

Copy link
Contributor

@michaelnebel michaelnebel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @mbg !
I really appreciate the PR explanation and inline code comments - it makes it much easier to review.
This adds an extra layer of configuration; Is it intended to work in conjunction with the other environment variables feeds (for instance excluding a feed from a reachability check)?

// we have discovered from analysing `nuget.config` files.
sources = configuredSources ?? new();
sources.Add(PublicNugetOrgFeed);
this.dependabotProxy?.RegistryURLs.ForEach(url => sources.Add(url));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider constructing the "source" string at this location instead. Otherwise it will be constructed once for each project that is restored in the parallel restore loop.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in d564529

// we have discovered from analysing `nuget.config` files.
sources = configuredSources ?? new();
sources.Add(PublicNugetOrgFeed);
this.dependabotProxy?.RegistryURLs.ForEach(url => sources.Add(url));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid timeouts, maybe there should be a feed validation check similar to:

var allFeedsReachable = explicitFeeds.All(feed => excludedFeeds.Contains(feed) || IsFeedReachable(feed, initialTimeout, tryCount));

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 4448369

@mbg mbg requested review from michaelnebel and tamasvajk March 24, 2025 17:56
Copy link
Contributor

@michaelnebel michaelnebel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for looking into the comments and thank you for doing this!!! 👍

// in addition to the ones that are configured in `nuget.config` files.
this.dependabotProxy?.RegistryURLs.ForEach(url => feedsToCheck.Add(url));

var allFeedsReachable = this.CheckFeeds(feedsToCheck);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the dependabot proxy is set, we also explicitly add the public nuget org feed as a source - so we should probably check whether that feed is reachable as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have changed this so that the public feed is not manually added when private registries are configured in 4d3b024, since explicitFeeds should already contain it. I checked and dotnet nuget list source does return it unless there is a <clear /> entry in the list of packageSources in nuget.config.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, there's more subtly to this. GetNugetFeeds doesn't include it, because dotnet nuget list source only includes the feeds that are explicitly configured in a given configuration file if it is given an --configfile argument. GetNugetFeedsFromFolder does include the public feed (unless there's a <clear /> element), but we don't check whether these "inherited" feeds are reachable already.

So, we didn't previously check that the public feed is reachable anyway. @tamasvajk left a comment saying that we could check that they are reachable, but don't because of authentication requirements(?).

I think the real lesson for my changes here is that I should really use allFeeds as argument for RestoreProjects rather than just explicitFeeds.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK that also still doesn't quite work with the current logic, because GetAllFeeds only invokes GetNugetFeedsFromFolder in folders that contain nuget.config files to begin with. In our case, we may not have a nuget.config file at all, so GetNugetFeedsFromFolder never gets called, even though it would give us the public feed as expected.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the reachability of the public nuget feed wasn't checked before, then maybe there is no need to check it now either.
However, I think its availability is checked when it is added as a fallback feed (not as an inherited feed).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the reachability of the public nuget feed wasn't checked before, then maybe there is no need to check it now either.

In any case, it's probably a separate discussion to the changes in this PR. I have also now removed that part of the code that manually added the public feed in be95d33.

However, I think its availability is checked when it is added as a fallback feed (not as an inherited feed).

That's true.

/// Checks that we can connect to all Nuget feeds that are explicitly configured in configuration files.
/// </summary>
/// <param name="explicitFeeds">Outputs the set of explicit feeds.</param>
/// <returns>True if all feeds are reachable or false otherwise.</returns>
private bool CheckFeeds(out HashSet<string> explicitFeeds)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe rename one of the CheckFeeds methods. The naming is a bit confusing when their type signatures are so similar.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in d2b88ae

Comment on lines +831 to +834
.Where(folder => folder != null)
.SelectMany(folder => GetFeeds(() => dotnet.GetNugetFeedsFromFolder(folder!)))
.ToHashSet();
}

Check notice

Code scanning / CodeQL

Generic catch clause Note

Generic catch clause.

Copilot Autofix

AI 28 days ago

To fix the problem, we should catch specific exceptions that are likely to be thrown by the operation being performed. In this case, FileInfo(config).Directory?.FullName can throw exceptions such as ArgumentException, PathTooLongException, NotSupportedException, UnauthorizedAccessException, and DirectoryNotFoundException. We should catch these specific exceptions and log them accordingly.

Suggested changeset 1
csharp/extractor/Semmle.Extraction.CSharp.DependencyFetching/NugetPackageRestorer.cs

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/csharp/extractor/Semmle.Extraction.CSharp.DependencyFetching/NugetPackageRestorer.cs b/csharp/extractor/Semmle.Extraction.CSharp.DependencyFetching/NugetPackageRestorer.cs
--- a/csharp/extractor/Semmle.Extraction.CSharp.DependencyFetching/NugetPackageRestorer.cs
+++ b/csharp/extractor/Semmle.Extraction.CSharp.DependencyFetching/NugetPackageRestorer.cs
@@ -830,5 +830,21 @@
                         }
-                        catch (Exception exc)
+                        catch (ArgumentException exc)
                         {
-                            logger.LogWarning($"Failed to get directory of '{config}': {exc}");
+                            logger.LogWarning($"Failed to get directory of '{config}' due to an argument exception: {exc}");
+                        }
+                        catch (PathTooLongException exc)
+                        {
+                            logger.LogWarning($"Failed to get directory of '{config}' due to a path too long exception: {exc}");
+                        }
+                        catch (NotSupportedException exc)
+                        {
+                            logger.LogWarning($"Failed to get directory of '{config}' due to a not supported exception: {exc}");
+                        }
+                        catch (UnauthorizedAccessException exc)
+                        {
+                            logger.LogWarning($"Failed to get directory of '{config}' due to an unauthorized access exception: {exc}");
+                        }
+                        catch (DirectoryNotFoundException exc)
+                        {
+                            logger.LogWarning($"Failed to get directory of '{config}' due to a directory not found exception: {exc}");
                         }
EOF
@@ -830,5 +830,21 @@
}
catch (Exception exc)
catch (ArgumentException exc)
{
logger.LogWarning($"Failed to get directory of '{config}': {exc}");
logger.LogWarning($"Failed to get directory of '{config}' due to an argument exception: {exc}");
}
catch (PathTooLongException exc)
{
logger.LogWarning($"Failed to get directory of '{config}' due to a path too long exception: {exc}");
}
catch (NotSupportedException exc)
{
logger.LogWarning($"Failed to get directory of '{config}' due to a not supported exception: {exc}");
}
catch (UnauthorizedAccessException exc)
{
logger.LogWarning($"Failed to get directory of '{config}' due to an unauthorized access exception: {exc}");
}
catch (DirectoryNotFoundException exc)
{
logger.LogWarning($"Failed to get directory of '{config}' due to a directory not found exception: {exc}");
}
Copilot is powered by AI and may make mistakes. Always verify output.
Unable to commit as this autofix suggestion is now outdated
@mbg mbg requested a review from michaelnebel March 25, 2025 14:41
Copy link
Contributor

@michaelnebel michaelnebel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work! Thank you!

else
{
// If we haven't found any `nuget.config` files, then obtain a list of feeds from the root source directory.
allFeeds = GetFeeds(() => dotnet.GetNugetFeedsFromFolder(this.fileProvider.SourceDir.FullName)).ToHashSet();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice improvement!

@michaelnebel
Copy link
Contributor

@mbg : We should also run DCA before merging.

@mbg
Copy link
Member Author

mbg commented Mar 27, 2025

DCA looks good, I think. So I will go ahead and merge this.

@mbg mbg merged commit 2aee47b into main Mar 27, 2025
17 checks passed
@mbg mbg deleted the mbg/csharp/inject-proxy-urls branch March 27, 2025 10:11
@michaelnebel
Copy link
Contributor

michaelnebel commented Mar 27, 2025

DCA looks good, I think. So I will go ahead and merge this.

It looks like the changes was not tested by the DCA run.
When testing extractor changes database caching needs to be disabled and it also needs to be specified that the extractor should be built based on the branch. 😃
In the wizard this looks like
image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants