Reduce output from copyright checker #198

Open
t00sa wants to merge 4 commits into MetOffice:main from t00sa:copyright-messages

@t00sa t00sa commented Feb 26, 2026

PR Summary

Sci/Tech Reviewer: @r-sharp
Code Reviewer: @cameronbateman-mo

Remove a lot of the duplicate output from the copyright checker to make it obvious which files have a problem. Also report cases where no files are found or all files are ignored as non-fatal warnings.

Improve usability by making it possible to specify the location of the template directory on the command line. Previously, the script either used the contents of CYLC_TASK_WORK_PATH or the current directory.

Catch the case where no templates have been found and exit with an error. Previously, the script continued and flagged every file as having an invalid copyright.

Code Quality Checklist

  • I have performed a self-review of my own code
  • My code follows the project's style guidelines
  • Comments have been included that aid understanding and enhance the readability of the code
  • My changes generate no new warnings
  • All automated checks in the CI pipeline have completed successfully

Testing

  • This change has been tested appropriately (please describe)

Template Location Changes

Tests of the template location changes, run from the copyright checker bin directory with the script checking itself:

vdi> env CYLC_TASK_WORK_PATH=.. ./copyright_checker.py 
[SUCCESS] 1 file has a valid copyright
vdi> ./copyright_checker.py  --template=../file/
[SUCCESS] 1 file has a valid copyright
vdi> ./copyright_checker.py  --template=/var/tmp/empty
[ERROR] no templates found
vdi> ./copyright_checker.py
[ERROR] no templates found
vdi>
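
The fallback order exercised above could be sketched as follows. This is a minimal illustration, not the script's actual code: it assumes an `argparse` option called `--templates` whose default comes from `CYLC_TASK_WORK_PATH` when that variable is set and from the current directory otherwise. The helper names `default_template_path` and `build_parser` are hypothetical, and any subdirectory the real script appends to the path is ignored here.

```python
import argparse
import os


def default_template_path(environ=None):
    """Return the template directory used when none is given (hypothetical helper).

    Prefers CYLC_TASK_WORK_PATH and falls back to the current directory.
    """
    environ = os.environ if environ is None else environ
    return environ.get("CYLC_TASK_WORK_PATH") or os.getcwd()


def build_parser(environ=None):
    """Build a parser with an assumed --templates option."""
    parser = argparse.ArgumentParser(description="Check copyright notices")
    parser.add_argument(
        "--templates",
        default=default_template_path(environ),
        help="path to the templates (default: %(default)s)",
    )
    return parser
```

If the option is indeed defined as `--templates`, `argparse` also accepts the unambiguous prefix `--template`, which would account for both spellings appearing in the transcripts.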

Ignore and Empty Changes

Test changes to messages where all files are ignored or where no valid files can be found:

vdi> ./copyright_checker.py  --template=../file/ --ignore .py
[WARNING] only possible file ignored
vdi> touch foo.py
vdi> ./copyright_checker.py  --template=../file/ --ignore .py
[WARNING] all 2 possible files ignored
vdi> ./copyright_checker.py  --template=../file/ /var/tmp/empty
[WARNING] no files found
vdi> mkdir /var/tmp/almost-empty
vdi> touch /var/tmp/almost-empty/foo.pl
vdi> ./copyright_checker.py  --template=../file/ /var/tmp/almost-empty
[WARNING] no files found
vdi>
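
The three warnings shown above could come from a small helper along these lines. This is a sketch only: the function name `empty_selection_warning` and its two counters are assumptions made for illustration, and the message strings are copied from the transcript.

```python
def empty_selection_warning(found, ignored):
    """Return a warning string when nothing is left to check, else None.

    found:   number of candidate files discovered
    ignored: how many of those candidates matched an ignore pattern
    """
    if found == 0:
        return "[WARNING] no files found"
    if ignored == found:
        if found == 1:
            return "[WARNING] only possible file ignored"
        return f"[WARNING] all {found} possible files ignored"
    # At least one file remains to be checked, so no warning is needed.
    return None
```

Returning the message rather than printing it keeps the wording easy to unit test.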

New Output Format

Examples of the new format of the output using the UM source:

vdi> ./copyright_checker.py --templates ../file/ /var/tmp/um/src

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Checked 2515, ignored 0, with 10 failures                                    %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
/var/tmp/um/src/atmosphere/atmosphere_service/include/qsat_mod_qsat_mix.h
/var/tmp/um/src/atmosphere/convection/conv_type_defs.F90
/var/tmp/um/src/atmosphere/convection/tcs_cmt_params_cg.F90
/var/tmp/um/src/atmosphere/convection/tcs_cmt_params_dp.F90
/var/tmp/um/src/atmosphere/convection/tcs_cmt_params_sh.F90
/var/tmp/um/src/atmosphere/convection/comorph/unit_tests/build_test_calc_cond_properties.sh
/var/tmp/um/src/atmosphere/convection/comorph/unit_tests/build_test_check_bad_values.sh
/var/tmp/um/src/atmosphere/convection/comorph/unit_tests/build_test_comorph.sh
/var/tmp/um/src/atmosphere/convection/comorph/unit_tests/build_test_moist_proc.sh
/var/tmp/um/src/atmosphere/convection/comorph/unit_tests/build_test_solve_detrainment.sh

[ERROR] 10 files have missing copyright notices
vdi> ./copyright_checker.py --templates ../file/ --ignore comorph/unit_tests /var/tmp/um/src

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Checked 2504, ignored 11, with 5 failures                                    %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
/var/tmp/um/src/atmosphere/atmosphere_service/include/qsat_mod_qsat_mix.h
/var/tmp/um/src/atmosphere/convection/conv_type_defs.F90
/var/tmp/um/src/atmosphere/convection/tcs_cmt_params_cg.F90
/var/tmp/um/src/atmosphere/convection/tcs_cmt_params_dp.F90
/var/tmp/um/src/atmosphere/convection/tcs_cmt_params_sh.F90

[ERROR] 5 files have missing copyright notices
vdi> ./copyright_checker.py --templates ../file/ /var/tmp/um/src/control
[SUCCESS] 623 files have valid copyrights
vdi>
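
For reference, the summary box and the closing message could be produced by something like the sketch below. The function names `banner` and `verdict` are hypothetical, the 80-column width is inferred from the examples, and the singular error wording is a guess because no single-failure run appears in the transcripts.

```python
WIDTH = 80  # assumed box width, inferred from the example output


def banner(checked, ignored, failures):
    """Build the three-line summary box as a single string."""
    rule = "%" * WIDTH
    text = f"Checked {checked}, ignored {ignored}, with {failures} failures"
    # Pad the text so the closing "%" lines up with the rules above and below.
    return "\n".join([rule, f"% {text.ljust(WIDTH - 4)} %", rule])


def verdict(checked, failures):
    """Return the final status line with singular/plural agreement."""
    if failures == 0:
        if checked == 1:
            return "[SUCCESS] 1 file has a valid copyright"
        return f"[SUCCESS] {checked} files have valid copyrights"
    if failures == 1:
        # Singular wording is an assumption; it is not shown in the PR.
        return "[ERROR] 1 file has a missing copyright notice"
    return f"[ERROR] {failures} files have missing copyright notices"
```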

Security Considerations

  • I have reviewed my changes for potential security issues
  • Sensitive data is properly handled (if applicable)
  • Authentication and authorisation are properly implemented (if applicable)

AI Assistance and Attribution

  • Some of the content of this change has been produced with the assistance of Generative AI tool name (e.g., Met Office Github Copilot Enterprise, Github Copilot Personal, ChatGPT GPT-4, etc) and I have followed the Simulation Systems AI policy (including attribution labels)

Sci/Tech Review

  • I understand this area of code and the changes being added
  • The proposed changes correspond to the pull request description
  • Documentation is sufficient (do documentation papers need updating)
  • Sufficient testing has been completed

(Please alert the code reviewer via a tag when you have approved the SR)

Code Review

  • All dependencies have been resolved
  • Related Issues have been properly linked and addressed
  • Code quality standards have been met
  • Tests are adequate and have passed
  • Security considerations have been addressed
  • Performance impact is acceptable

@r-sharp r-sharp left a comment

Not a 'requested change', more just a comment/observation.
I can't help but feel "main" is doing too much 'heavy lifting'. For example, lines 135 to 152 /might/ be a nice compact function taking inputs and ignore_list and returning files_to_check.

I'm also less than convinced by the methodology, which requires each value in 'inputs' to be checked to see whether it's a directory (to then walk) or a file. Wouldn't 'globbing' each input to return a list of files be simpler, as that would accept both a directory to descend and a file?

The small block that actually checks them is less obviously a candidate for a function, but extracting it would allow unit testing...

Then the bottom 40+ lines are purely to generate the report/output, which, while nicely set up to get plurals and grammar correct, does seem a touch overcomplicated...
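
One possible shape for the function suggested above is sketched here. The name `collect_files` is hypothetical, and two details are assumptions rather than the script's known behaviour: an input is ignored when any `ignore_list` entry occurs as a substring of its path, and the ignored files are returned alongside `files_to_check` so the report can count them. Any filtering by file type is left out.

```python
from pathlib import Path


def collect_files(inputs, ignore_list):
    """Expand inputs into (files_to_check, ignored_files).

    Directories are walked recursively; plain files are taken as given;
    anything that does not exist is skipped.
    """
    candidates = []
    for item in inputs:
        path = Path(item)
        if path.is_dir():
            candidates.extend(sorted(p for p in path.rglob("*") if p.is_file()))
        elif path.is_file():
            candidates.append(path)

    files_to_check, ignored = [], []
    for path in candidates:
        # Assumed rule: substring match against the POSIX form of the path.
        if any(pattern in path.as_posix() for pattern in ignore_list):
            ignored.append(path)
        else:
            files_to_check.append(path)
    return files_to_check, ignored
```

This sketch still tests for directories explicitly, but the test is confined to one small function that can be unit tested without running `main`.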

Comment on lines +131 to +133
if not templates:
    raise SystemExit("[ERROR] no templates found")

Contributor

Perhaps a tad picky, but should this check be at line 124, immediately after the last time templates is updated?

Contributor Author

I appear to have missed a bit of logic: the check should be at L127 and should report an error if both templates and regex_templates_raw are empty.
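
The corrected check described in this reply might look like the sketch below. The wrapper function `require_templates` is hypothetical and is only there to make the logic testable; the variable names `templates` and `regex_templates_raw` and the error message come from the discussion.

```python
def require_templates(templates, regex_templates_raw):
    """Abort when neither plain nor regex templates were loaded."""
    if not templates and not regex_templates_raw:
        # SystemExit with a string prints it to stderr and exits with status 1.
        raise SystemExit("[ERROR] no templates found")
```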

default=template_path,
help="path to the templates (default: %(default)s)",
)
excl_group.add_argument(

Contributor

This appears to be the only argument in the exclusive arguments group, which I can't help but feel makes it somewhat difficult for there to be two clashing args....

Contributor Author

It's my fault: the templates argument splits the two exclusive arguments --full_trunk and files in a non-obvious way. I'll move things around to make it more obvious.

Contributor

I swear I 'hunted' for another excl_group.add_argument and couldn't spot it...
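
For illustration, keeping the two exclusive arguments adjacent could look like the sketch below. It is written under assumptions: `--full_trunk` is treated as a boolean flag, `files` as an optional positional list, and `build_grouped_parser` is a hypothetical name. Only the argument names themselves come from the discussion.

```python
import argparse


def build_grouped_parser():
    """Build a parser whose exclusive arguments are declared side by side."""
    parser = argparse.ArgumentParser(description="Check copyright notices")

    # Standalone option declared first so it no longer separates the pair.
    parser.add_argument(
        "--templates",
        default=".",
        help="path to the templates (default: %(default)s)",
    )

    excl_group = parser.add_mutually_exclusive_group()
    excl_group.add_argument(
        "--full_trunk",
        action="store_true",
        help="check the whole trunk (assumed meaning)",
    )
    # A positional in an exclusive group must be optional, hence nargs="*"
    # together with an explicit default.
    excl_group.add_argument(
        "files",
        nargs="*",
        default=[],
        help="files or directories to check",
    )
    return parser
```

With this layout, passing both a file and `--full_trunk` makes `argparse` exit with a usage error.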

@t00sa t00sa requested a review from r-sharp March 2, 2026 16:46

@r-sharp r-sharp left a comment

Thanks for those tweaks. Passing on for CR.
