Add support for a CDCL-type SAT solver using the Resolvo crate #1179

jrray · 2025-02-10T02:19:46Z

🚧 Work in Progress 🚧

It has been frequently requested over the life of this project to adopt a SAT solver in the hopes that it would provide faster solves and/or better messages when things can't solve. The Resolvo looked promising as it supports lazy solving and attempts to present solve errors in a relatively easy to understand format.

I have implemented Resolvo's DependencyProvider trait for spk packages and got enough of the spk package compatibility concepts right that the existing solver tests are all passing when run against the new solver.

I have also started to hard code changing all uses of the original solver into using the new solver just to expedite being able to test the new solver with some real world solves at SPI. There will be more work required to make it possible to choose between these solvers at runtime.

The new solver doesn't return a Graph because it doesn't use Node or State. It doesn't use Decision or Change. It doesn't use the existing validator types since those are designed to validate against a State. Solves aren't run via a SolverRuntime or through a DecisionFormatter. What is similar is the way a Solver is configured, and it still returns a Solution, although the Solution is not necessarily populated with the same details. It does well enough to pass all the tests (we need more tests!).

The high level view of how Resolvo works is that the DependencyProvider needs to be able to return a list of candidates for a request, filter those down based on compatibility, then enumerate all the dependencies of a given candidate. Rather than how our solver validates a package based on the state of the partial solve, Resolvo just needs to know if a given package (build) is compatible with a given request (python/3.11).

One of the first challenges was dealing with global var requests, meaning var requests that have no namespace. Resolvo requires that all requirements are attached to a specific package name. In order to handle global vars, each global var, for example, debug/true, is turned into a pseudo package called "debug". Then, since the DependencyProvider has to return the list of candidates (values) for "debug", all possible values that var can have has to be known in advance. To make that possible, the solve is restarted as new values are discovered. It is likely going to be necessary to maintain a cache that can be reused between solves to minimize the need to restart the solve.

Another challenge I haven't found a better approach for yet is that since all the candidates need to be returned upfront, when getting the candidates for a package, all the builds of all the versions of that package need to be visited, which includes reading in each build's package spec. This would benefit greatly from some kind of package index in the repo so this data could be read in bulk. I tried something where the list of candidates is first just the list of versions available for a package, but then it wasn't giving me the ability to filter those versions later after reading the builds.

What comes next is getting a test build up and running inside SPI and experimenting with some solves.

jrray · 2025-02-13T06:59:35Z

Something of a PSA for anyone trying this code out, we've discovered that #1130 introduced a bug that makes the original solver not respect host options. We're working on a fix.

jrray · 2025-02-13T19:38:55Z

An update on the topic we discussed in the weekly meeting about getting different results from the solver between different runs. I realized that what is happening is currently builds of a version are not being sorted and so the build the solver will pick is subject to one or more per-process random seeds.

If I ask for python and it depends on gcc, it will first try to use the highest version of python available, but it might by chance pick a build of python that depends specifically on gcc/6. This is all due to how the candidates of python are ordered. I will revisit the sort logic to make it behave more like the original solver. This should make the solve result deterministic between runs if the repo contents are the same.

Resolvo is a CDCL sat solver for package managers and will hopefully prove to be much better at solving quickly than our mostly brute-force approach. Making a checkpoint here after achieving a simple "solve" to get a feel for how spk type map onto Resolvo's concepts. Although currently the smallest unit the solver works with is a package, and that likely needs to change to a component. Signed-off-by: J Robert Ray <[email protected]>

Signed-off-by: J Robert Ray <[email protected]>

This supports package-specific vars but not non-namespaced vars. Signed-off-by: J Robert Ray <[email protected]>

Signed-off-by: J Robert Ray <[email protected]>

A necessary step towards supporting solving for specific components and handling their dependencies. Signed-off-by: J Robert Ray <[email protected]>

Resolvo doesn't offer a way to declare constraints that aren't associated with a package. The strategy employed here is to create synthetic packages to represent global variables. Whenever a new possible value for a global variable is discovered, the solver needs to be restarted so when "candidates" for a global variable are requested, it returns a "package" (solvable) for each possible value that variable might take. Then, the requirement for packages that have that specific value can be models as a dependency on the synthetic package. It is not too common for packages to have install requirements on non-namespaced vars, so in practice this solver restart may not be a frequent occurrence. Also, top level options (not yet implemented) will allow packages that expect vars with other values to be excluded instead of having to ingest those different values. Of course it would be preferable for the solver to support this concept without the restart mechanic, but maybe it is worth considering making it a requirement for packages to only have namespaced var requirements. Another option is to maintain a cache of known var values somehow either in the repo or in an external service. This could cut down on the number of instances where the solver needs to be restarted. Signed-off-by: J Robert Ray <[email protected]>

Since PackageSource::Repository includes the repo, the debug output would include the entire repository contents, making the debug output too verbose. Signed-off-by: J Robert Ray <[email protected]>

For now just focusing on solving for a runtime environment with existing builds. Signed-off-by: J Robert Ray <[email protected]>

Start modifying the original solver tests to run the test against both solvers. Signed-off-by: J Robert Ray <[email protected]>

Not all solver tests are applicable. Where possible, for tests the cdcl solver doesn't pass yet, add #[should_panic] so the test suite still passes as a whole. As the cdcl solver can pass more tests, this will highlight the tests that can then be enabled for the cdcl solver. Signed-off-by: J Robert Ray <[email protected]>

Signed-off-by: J Robert Ray <[email protected]>

At least well enough to make a handful of tests pass. Signed-off-by: J Robert Ray <[email protected]>

Well enough to satisfy the one test. Signed-off-by: J Robert Ray <[email protected]>

Stop completely excluding all source builds. Signed-off-by: J Robert Ray <[email protected]>

This required rethinking the type backing NameIds to subdivide a package into its components. For example, "python:run" and "python:build" are considered separate packages by the solver. It's not possible for a single request to resolve to multiple resolvables. Either a request for "All" would have to be expanded into requests for the individual components, or there has to be a [virtual] package "python:all" that represents a dependency on all the components of that package. Since it is not possible to know the full set of components for a request like "python:all", where the set of components may differ between the different versions of python, then only the latter approach is feasible. Dependencies are needed to ensure the solve only contains components from the same spk package build. Signed-off-by: J Robert Ray <[email protected]>

Though the use of the run component is still hard coded. Signed-off-by: J Robert Ray <[email protected]>

Pass another test. Signed-off-by: J Robert Ray <[email protected]>

Signed-off-by: J Robert Ray <[email protected]>

By using an enum here, the global var handling can use its own enum variant instead of the flawed string prefix approach. Now it is simply not possible for a package name to be confused for a global var. Signed-off-by: J Robert Ray <[email protected]>

Messed up the logic when checking :all cases. Add test for this case. Signed-off-by: J Robert Ray <[email protected]>

These are effectively turned into more requests. Now the host options are being looked at. Signed-off-by: J Robert Ray <[email protected]>

It is safe to update the known values without restarting the solve here, since these functions are only called before starting the solve. But that's not enforced by the code in any way at the moment. Signed-off-by: J Robert Ray <[email protected]>

The requested by column is not populated. Signed-off-by: J Robert Ray <[email protected]>

At SPI the newest python version available on centos is an embedded package, but it is undesirable for `spk env python` to choose an embedded python package. It also follows that it would have preferred to build from source a newer version of a package instead of use an existing build of an older version. Signed-off-by: J Robert Ray <[email protected]>

Improve performance by not reading builds for packages that are for versions that are not applicable to the root requests. Signed-off-by: J Robert Ray <[email protected]>

Just printing something to stderr for now, in order to gauge how many retries are happening in practice. Signed-off-by: J Robert Ray <[email protected]>

The abstract solver now has a `run_and_print_resolve` that takes a formatter, which makes it possible to run either the old or new solver in a consistent way. The original `run_and_print_resolve` function has become `pub(crate)` to catch all the places that needed to be modified to use the new function. Signed-off-by: J Robert Ray <[email protected]>

Making the original function `pub(crate)` to force code to change to use the trait instead. Remove the unused `run_and_log_decisions` function. Signed-off-by: J Robert Ray <[email protected]>

AbstractSolver was split into multiple traits with some blanket implementations for refs while chasing compile errors but this may not have actually been necessary. Add the "solver" flags to more commands so the user can choose what solver to use in more cases. Working towards being able to choose between the original solver and the new solver for all operations. More tests now run the test with both solves. Signed-off-by: J Robert Ray <[email protected]>

Making it possible to choose what solver to use when running spk tests. Signed-off-by: J Robert Ray <[email protected]>

However, this can still become a recursive call that blows out the stack if the bounds on Solver don't fit the bounds on the "real" `test` implementation. Signed-off-by: J Robert Ray <[email protected]>

In order for `flags::Solver::get_solver` to offer a choice in what solver is returned, it needs to have access to the `solver_to_run` setting in `flags::DecisionFormatterSettings`. This change continues the theme of changing from a DecisionFormatter "containing" a solver to a solver "containing" a DecisionFormatter. Signed-off-by: J Robert Ray <[email protected]>

Signed-off-by: J Robert Ray <[email protected]>

`key_entry_names` is unused after it is cloned into `ordered_names`. Signed-off-by: J Robert Ray <[email protected]>

In order to reuse this logic for the resolvo solver, extract it to an external function. It is not practical for the resolvo solver to directly use a SortedBuildIterator. Signed-off-by: J Robert Ray <[email protected]>

Potential speedup by avoiding doing the reversal as a separate step. Signed-off-by: J Robert Ray <[email protected]>

Reuse the same build sorting logic as the original solver in order to get a deterministic result out of the solve. The solve can change drastically based on which build is selected first. Signed-off-by: J Robert Ray <[email protected]>

This is a public repo. Signed-off-by: J Robert Ray <[email protected]>

Signed-off-by: J Robert Ray <[email protected]>

When populating the Solution object, embedded packages need to be categorized properly or else when calculating the layers for the solution it ends up with an empty blob as one of the layers, which is an error. Handle source builds in resolvo solve When not building from source. Signed-off-by: J Robert Ray <[email protected]>

The case where a stub only depended on the Run component of the parent still needs to declare a dependency to bring it in. Signed-off-by: J Robert Ray <[email protected]>

To promote better test coverage, the build_package family of macros now require the name of the solver to use, and all the tests using these macros now run the test against all three solver choices (ignoring "all"). Workaround too many arguments lint and rstest la10736/rstest#226 (comment) Bumped to the latest version of rstest first to see if that would help. Signed-off-by: J Robert Ray <[email protected]>

The solver tests were not passing if this feature is enabled, and this exposed a few problems in not handling ':all' requests correctly. When generating the "VersionSet" for dependencies, the PkgRequest needs to have the correct component populated. The logic for skipping or not skipping candidates when build from source needed to handle ':all' and it was restructured for clarity. When filtering candidates for a request of a component of a specific build, source builds needed to be filtered out. Signed-off-by: J Robert Ray <[email protected]>

Fix commented out code in cmd_test for build stage tests and add a new test that checks the basic behavior of a build stage test that the test environment includes the build dependencies of the package being tested. Lift the old solver's configure_for_build_environment method to AbstractSolver. Although the other test stages have code that only lives in the cmd_test crate to configure the solver properly, the build stage uses this method that was defined in the solver itself. While the logic of configure_for_build_environment could be duplicated in cmd_test, it is also reused by other code in the solver so it made sense to preserve the logic as a reusable solver method. Signed-off-by: J Robert Ray <[email protected]>

jrray added the enhancement New feature or request label Feb 10, 2025

jrray self-assigned this Feb 10, 2025

jrray marked this pull request as draft February 10, 2025 02:19

jrray force-pushed the resolvo branch from dbf8382 to 021b424 Compare February 10, 2025 23:03

jrray force-pushed the resolvo branch 2 times, most recently from 03ee415 to daa4be4 Compare February 15, 2025 02:14

jrray force-pushed the resolvo branch from dc196f8 to f913894 Compare March 13, 2025 22:51

jrray added 21 commits March 13, 2025 20:08

Implement candidate sorting

57097d6

Signed-off-by: J Robert Ray <[email protected]>

Implement candidate filtering by version number

afcc027

Signed-off-by: J Robert Ray <[email protected]>

Add basic unsat test

44a8bee

Signed-off-by: J Robert Ray <[email protected]>

Implement simple dependency handling

9ba2879

Signed-off-by: J Robert Ray <[email protected]>

Handle var dependencies

df456b7

This supports package-specific vars but not non-namespaced vars. Signed-off-by: J Robert Ray <[email protected]>

Fill in missing interner functions

7b8820d

Signed-off-by: J Robert Ray <[email protected]>

Better solver errors

fff6be2

Signed-off-by: J Robert Ray <[email protected]>

Change the solvable to the component level

212b91a

A necessary step towards supporting solving for specific components and handling their dependencies. Signed-off-by: J Robert Ray <[email protected]>

Trim down the debug output for SolvedRequest

5b9fec7

Since PackageSource::Repository includes the repo, the debug output would include the entire repository contents, making the debug output too verbose. Signed-off-by: J Robert Ray <[email protected]>

Exclude source packages from solves

26144bd

For now just focusing on solving for a runtime environment with existing builds. Signed-off-by: J Robert Ray <[email protected]>

Create an abstraction interface for solvers

7683ef1

Start modifying the original solver tests to run the test against both solvers. Signed-off-by: J Robert Ray <[email protected]>

Handle dependency inclusion policy

31c1bfb

Signed-off-by: J Robert Ray <[email protected]>

Support global var options

4bb0e68

At least well enough to make a handful of tests pass. Signed-off-by: J Robert Ray <[email protected]>

Populate Solution options

b7d8bcf

Well enough to satisfy the one test. Signed-off-by: J Robert Ray <[email protected]>

Implement requests for source builds

12bd1de

Stop completely excluding all source builds. Signed-off-by: J Robert Ray <[email protected]>

This other component test passes reliably now

203df8c

Though the use of the run component is still hard coded. Signed-off-by: J Robert Ray <[email protected]>

Implement tracking component "uses" as dependencies

c826dc6

Pass another test. Signed-off-by: J Robert Ray <[email protected]>

jrray and others added 26 commits March 14, 2025 15:10

Use new solver in more places

33ea462

Signed-off-by: J Robert Ray <[email protected]>

Add a crude solve result output to cmd_explain

655bcef

Fix requests for :all not filtering invalid versions

74c3bcb

Messed up the logic when checking :all cases. Add test for this case. Signed-off-by: J Robert Ray <[email protected]>

Implement update_options in new solver

30a0f09

These are effectively turned into more requests. Now the host options are being looked at. Signed-off-by: J Robert Ray <[email protected]>

Wire up the normal solution output in cmd_explain

1379d70

The requested by column is not populated. Signed-off-by: J Robert Ray <[email protected]>

Avoid reading packages where possible

cf2f5b7

Improve performance by not reading builds for packages that are for versions that are not applicable to the root requests. Signed-off-by: J Robert Ray <[email protected]>

Add feedback for solver retries

949d695

Just printing something to stderr for now, in order to gauge how many retries are happening in practice. Signed-off-by: J Robert Ray <[email protected]>

Also add run_and_log_resolve to AbstractSolver

839279a

Making the original function `pub(crate)` to force code to change to use the trait instead. Remove the unused `run_and_log_decisions` function. Signed-off-by: J Robert Ray <[email protected]>

Add solver arg to spk test

f978d0d

Making it possible to choose what solver to use when running spk tests. Signed-off-by: J Robert Ray <[email protected]>

Change these calls to mimic the one in install.rs

b451c5e

However, this can still become a recursive call that blows out the stack if the bounds on Solver don't fit the bounds on the "real" `test` implementation. Signed-off-by: J Robert Ray <[email protected]>

Make it possible to choose resolvo with --solver-to-run

4824332

Signed-off-by: J Robert Ray <[email protected]>

Remove unnecessary clone

3a0920b

`key_entry_names` is unused after it is cloned into `ordered_names`. Signed-off-by: J Robert Ray <[email protected]>

Refactor some build key logic from SortedBuildIterator

6ea2899

In order to reuse this logic for the resolvo solver, extract it to an external function. It is not practical for the resolvo solver to directly use a SortedBuildIterator. Signed-off-by: J Robert Ray <[email protected]>

Do a reverse sort using std::cmp::Reverse

03c7143

Potential speedup by avoiding doing the reversal as a separate step. Signed-off-by: J Robert Ray <[email protected]>

Rework build sorting in the new solver

f9613b4

Reuse the same build sorting logic as the original solver in order to get a deterministic result out of the solve. The solve can change drastically based on which build is selected first. Signed-off-by: J Robert Ray <[email protected]>

Don't inject credentials for resolvo repo

3bd5ff9

This is a public repo. Signed-off-by: J Robert Ray <[email protected]>

Convenience conversion from EmbeddedBuildSource to BuildIdent

ef8e416

Signed-off-by: J Robert Ray <[email protected]>

Resolvo: fix stubs getting added to solve without parent

e236ea8

The case where a stub only depended on the Run component of the parent still needs to declare a dependency to bring it in. Signed-off-by: J Robert Ray <[email protected]>

jrray force-pushed the resolvo branch from f913894 to 9a3c596 Compare March 15, 2025 02:19

jrray force-pushed the resolvo branch from 9a3c596 to c8a558a Compare March 15, 2025 02:28

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add support for a CDCL-type SAT solver using the Resolvo crate #1179

Add support for a CDCL-type SAT solver using the Resolvo crate #1179

jrray commented Feb 10, 2025

jrray commented Feb 13, 2025

jrray commented Feb 13, 2025

Add support for a CDCL-type SAT solver using the Resolvo crate #1179

Are you sure you want to change the base?

Add support for a CDCL-type SAT solver using the Resolvo crate #1179

Conversation

jrray commented Feb 10, 2025

jrray commented Feb 13, 2025

jrray commented Feb 13, 2025