standardize CMake linting and auto-formatting across RAPIDS #62

jameslamb · 2025-01-24T21:12:08Z

Description

As of this writing, CMake formatting varies somewhat across RAPIDS. It would be great to standardize the look and feel of CMake code across RAPIDS, and pre-commit seems like a good system to do that.

This could involve creating new hooks or choosing from the existing third-party open-source projects.

Benefits of this work

standardization enforced by tools = less effort put into style-related review comments
static analysis = reduced risks of bugs making it through review (especially helpful for code paths not touched by CI)
creating an easier-to-adopt approach for CMake linting should improve the likelihood that projects adopt it

Acceptance Criteria

RAPIDS projects use pre-commit hook(s) for CMake autoformatting and linting

Approach

1. Review the current approaches used across RAPIDS

Several RAPIDS projects use the following approach:

vendor shell script that invokes cmake-format and cmake-lint in pre-commit-friendly ways (example: cudf/cpp/scripts/ run-cmake-format.sh)
vendor a config file for cmake-format (example: cudf/cpp/cmake/config.json)
at runtime in CI, download more configuration from the rapids-cmake repo like this:

RAPIDS_VERSION_MAJOR_MINOR="$(rapids-version-major-minor)"
FORMAT_FILE_URL="https://raw.githubusercontent.com/rapidsai/rapids-cmake/branch-${RAPIDS_VERSION_MAJOR_MINOR}/cmake-format-rapids-cmake.json"
export RAPIDS_CMAKE_FORMAT_FILE=/tmp/rapids_cmake_ci/cmake-formats-rapids-cmake.json
mkdir -p $(dirname ${RAPIDS_CMAKE_FORMAT_FILE})
wget -O ${RAPIDS_CMAKE_FORMAT_FILE} ${FORMAT_FILE_URL}

(example: cudf/ci/check_style.sh)

Invoke those scripts, using that configuration, using "local" pre-commit hooks, like this:

      - id: cmake-format
        name: cmake-format
        entry: ./cpp/scripts/run-cmake-format.sh cmake-format
        language: python
        types: [cmake]
        additional_dependencies:
          - cmakelang==0.6.13
        verbose: true
        require_serial: true
        exclude: |
          (?x)^(
            cpp/cmake/Modules/FindCUDAToolkit[.]cmake$
          )
      - id: cmake-lint
        name: cmake-lint
        entry: ./cpp/scripts/run-cmake-format.sh cmake-lint
        language: python
        types: [cmake]
          - cmakelang==0.6.13
        verbose: true
        require_serial: true
        exclude: |
          (?x)^(
            cpp/cmake/Modules/FindCUDAToolkit[.]cmake$
          )

(example: cudf/.pre-commit-config.yaml)

2. Review open-source options

Both cmake-format and cmake-lint have official pre-commit hooks.

These might not have existed or might not have fits RAPIDS needs when the approach described above was first introduced.

3. Pick a path and implement it

Some functionality I think we want:

fast execution
lightweight installation (e.g. npm / Node, go)
minimal at-check-time network requests
minimal vendoring of identical files
auto-formatting
RAPIDS-specific configuration (example: enforce cmake-format and cmake-lint cugraph#4889 (comment))
helpful logs

Notes

Historical context:

The text was updated successfully, but these errors were encountered:

vyasr · 2025-01-24T23:34:34Z

I can provide some context that will hopefully answer some of your questions. rapids-cmake was (unsurprisingly) the first place in RAPIDS to use the cmakelang package for formatting. The basic formatting was added way back in rapidsai/rapids-cmake#42. Even before that, since rapids-cmake also defines a large number of CMake functions, the library started vendoring its own format file in rapidsai/rapids-cmake#29 so that consuming projects could format their CMake if they used rapids-cmake functions.

rapidsai/cudf#9484 was the original source for pre-commit-based CMake linting in RAPIDS. The comment at the top of the run-cmake-format.sh script has remained largely unchanged since then and explains why this approach is necessary and we cannot use the existing hooks directly. To answer

These might not have existed or might not have fits RAPIDS needs when the approach described above was first introduced.

These hooks did exist then. The problem is fundamentally that since we use a lot of rapids-cmake functions everywhere, if we do not have the format file from rapids-cmake available we will incorrectly reformat all of our CMake in local runs of the pre-commit hook. To prevent this, the wrapper script is designed to exit out (without failing!) if that file is not available. The assumption is that the number of people who actually modify the CMake is limited and those people will know that they need this file available locally in order to match remote formatting. The script makes some best effort attempts at finding the file based on the repository root and the build directory where the file will be cloned into.

In the time since then we have also started including additional configuration, as you have noted. So to answer

Some functionality I think we want:

The two main things that this hook should do beyond the hooks that are already available is:

To standardize and store the common configuration in our custom pre-commit hook, to be used across all of RAPIDS. This configuration can include standardized formatting for common functions that we use that are not base CMake functions, in particular CPM functions since rapids-cmake does not (and probably should not) provide formatting rules for those functions it does not define itself.
A way for the hook to maintain a reasonably up-to-date version of the rapids-cmake cmake-format JSON file that can be used independently of a project having a build directory. To minimize network requests, that file could be refreshed based on how stale it is. That is no worse than the current approach, which assumes that the current build directory contains an up-to-date file. The main difference would be the mechanism by which the file is refreshed. Currently, it is refreshed when the build directory is cleaned and rebuilt from scratch. If the hook managed the file, the easy but expensive option would be a pre-commit clean, which clears the pre-commit cache and requires redownloading all hooks. A more surgical option would be supporting some environment variable by which the hook could be told to force redownload the file. I'm open to other options here too. Given that the current approach already allows for stale files if someone never nukes their build directory (and many people never do) I am not overly concerned with that being a big problem with the pre-commit approach either.

jameslamb added the enhancement New feature or request label Jan 24, 2025

This was referenced Jan 24, 2025

wheel builds: standardizations for 25.02 rapidsai/build-planning#136

Open

enforce cmake-format and cmake-lint rapidsai/cugraph#4889

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

standardize CMake linting and auto-formatting across RAPIDS #62

standardize CMake linting and auto-formatting across RAPIDS #62

jameslamb commented Jan 24, 2025

vyasr commented Jan 24, 2025

standardize CMake linting and auto-formatting across RAPIDS #62

standardize CMake linting and auto-formatting across RAPIDS #62

Comments

jameslamb commented Jan 24, 2025

Description

Benefits of this work

Acceptance Criteria

Approach

1. Review the current approaches used across RAPIDS

2. Review open-source options

3. Pick a path and implement it

Notes

vyasr commented Jan 24, 2025