Update LCFS, WAS, and ETB prerequisites #413

Merged
MaxGhenis merged 4 commits into main from codex/update-lcfs-was-etb-2024 on May 24, 2026
@MaxGhenis MaxGhenis commented May 24, 2026

Summary

  • Add current private-source release metadata for LCFS 2023-24 (UKDS SN 9468), WAS Round 8 / 2006-2022 (SN 7215), and ETB 1977-2024 (SN 8856).
  • Switch private prerequisites to the new expected zip names: lcfs_2023_24.zip, was_2006_22.zip, and etb_1977_24.zip.
  • Point LCFS, WAS, ETB, VAT, fuel, and public-services model training at the current tab filenames, with release-specific QRF cache names and metadata checks so stale .pkl files are not reused.
  • Update the WAS mapping to Round 8 variables and advance ETB VAT training to year == 2023 for FYE 2024.
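The stale-cache guard described above can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: the `LCFS_RELEASE` dictionary, the `.meta.json` sidecar file, and the function names are all hypothetical, and the real metadata in `private_releases.py` may be shaped differently.

```python
import json
import pickle
from pathlib import Path

# Hypothetical release descriptor (assumption); only the survey name,
# study number, and period are taken from the PR description.
LCFS_RELEASE = {"survey": "lcfs", "ukds_sn": 9468, "period": "2023_24"}


def cache_paths(cache_dir: Path, release: dict) -> tuple[Path, Path]:
    """Return (model path, metadata sidecar path) with a release-specific stem."""
    stem = f"{release['survey']}_{release['period']}_qrf"
    return cache_dir / f"{stem}.pkl", cache_dir / f"{stem}.meta.json"


def load_cached_model(cache_dir: Path, release: dict):
    """Return the cached model, or None if it is missing or from another release."""
    model_path, meta_path = cache_paths(cache_dir, release)
    if not (model_path.exists() and meta_path.exists()):
        return None
    if json.loads(meta_path.read_text()) != release:
        return None  # stale: the cache was written for different release metadata
    with model_path.open("rb") as f:
        return pickle.load(f)


def save_model(cache_dir: Path, release: dict, model) -> None:
    """Pickle the model and record the release metadata next to it."""
    model_path, meta_path = cache_paths(cache_dir, release)
    cache_dir.mkdir(parents=True, exist_ok=True)
    with model_path.open("wb") as f:
        pickle.dump(model, f)
    meta_path.write_text(json.dumps(release))
```

Putting the release in the filename keeps caches for different releases from overwriting each other, while the metadata comparison catches a `.pkl` that has the right name but was trained on other inputs. A caller would retrain and call `save_model` whenever `load_cached_model` returns `None`.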

Data inputs

The current restricted UKDS TAB zips have been fetched through the UKDS catalogue download endpoint and uploaded to the private HF repo policyengine/policyengine-uk-data-private under the filenames this PR expects:

  • lcfs_2023_24.zip: UKDS SN 9468 TAB package 9468tab_A03FF22348E5E7D12FFD971D315D8E54BDC3CE7F6395D0108D85DEEBDF6BE8E3_V1.zip, SHA256 a03ff22348e5e7d12ffd971d315d8e54bdc3ce7f6395d0108d85deebdf6be8e3.
  • was_2006_22.zip: UKDS SN 7215 TAB package 7215tab_DF5E8BE49E51AA70F4BF686B98AB44EF11EE1EB260CBAF51308F89DC62449AE1_V1.zip, SHA256 df5e8be49e51aa70f4bf686b98ab44ef11ee1eb260cbaf51308f89dc62449ae1.
  • etb_1977_24.zip: UKDS SN 8856 TAB package 8856tab_96FFF4868745A2A9BB1169DCECDAF9958C1179D81F9EF67494C1F29622C7D405_V1.zip, SHA256 96fff4868745a2a9bb1169dcecdaf9958c1179d81f9ef67494c1f29622c7d405.
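The digests listed above can be checked locally with a short script along these lines. It is a sketch under stated assumptions: the helper names `sha256_of` and `verify_zip` are hypothetical, and only the filenames and SHA256 values come from this PR description.

```python
import hashlib
from pathlib import Path

# SHA256 digests as listed in the PR description.
EXPECTED_SHA256 = {
    "lcfs_2023_24.zip": "a03ff22348e5e7d12ffd971d315d8e54bdc3ce7f6395d0108d85deebdf6be8e3",
    "was_2006_22.zip": "df5e8be49e51aa70f4bf686b98ab44ef11ee1eb260cbaf51308f89dc62449ae1",
    "etb_1977_24.zip": "96fff4868745a2a9bb1169dcecdaf9958c1179d81f9ef67494c1f29622c7d405",
}


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large zips are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_zip(path: Path, expected: dict = EXPECTED_SHA256) -> bool:
    """Compare a file's digest with the expected value for its filename."""
    return sha256_of(path) == expected[path.name].lower()
```

The UKDS package names embed the same digest in upper case, so comparing against the lower-cased value works for either spelling.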

The downloader now flattens each source package's UKDA-*-tab/tab folder, because the current UKDS TAB zips are nested and the data filenames themselves are not prefixed with the study number.
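A flattening step of this kind might look like the sketch below. It is illustrative only and is not the code in `download_private_prerequisites.py`; the function name `extract_flat` and the regular expression are assumptions based on the `UKDA-*-tab/tab` layout described above.

```python
import re
import zipfile
from pathlib import Path

# Matches files directly inside a UKDA-<study>-tab/tab/ folder and captures
# the bare filename (assumed layout, per the PR description).
NESTED_TAB = re.compile(r"(?:^|/)UKDA-[^/]+-tab/tab/([^/]+)$")


def extract_flat(zip_path: Path, dest: Path) -> list[str]:
    """Write every file under UKDA-*-tab/tab/ straight into dest, ignoring the rest."""
    dest.mkdir(parents=True, exist_ok=True)
    written = []
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            match = NESTED_TAB.search(info.filename)
            if match is None:
                continue  # documentation or other non-data members
            name = match.group(1)
            (dest / name).write_bytes(zf.read(info))
            written.append(name)
    return sorted(written)
```

Writing each member by its captured basename, rather than calling `ZipFile.extractall`, is what drops the nested folders, and it also avoids trusting directory components stored in the archive.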

Refs #411

Tests

  • uv run ruff format --check .
  • uv run ruff check policyengine_uk_data/datasets/private_releases.py policyengine_uk_data/storage/download_private_prerequisites.py policyengine_uk_data/datasets/imputations/wealth.py policyengine_uk_data/datasets/imputations/consumption.py policyengine_uk_data/datasets/imputations/vat.py policyengine_uk_data/datasets/imputations/services/etb.py policyengine_uk_data/tests/test_frs_prerequisites.py policyengine_uk_data/tests/test_student_loan_balance.py policyengine_uk_data/tests/test_road_fuel_volume_uprating.py policyengine_uk_data/tests/test_vat_parameters.py policyengine_uk_data/tests/test_private_releases.py
  • uv run pytest policyengine_uk_data/tests/test_frs_prerequisites.py policyengine_uk_data/tests/test_private_releases.py policyengine_uk_data/tests/test_student_loan_balance.py policyengine_uk_data/tests/test_road_fuel_volume_uprating.py policyengine_uk_data/tests/test_vat_parameters.py -q
  • HF round-trip check for lcfs_2023_24.zip, was_2006_22.zip, and etb_1977_24.zip: downloaded from policyengine/policyengine-uk-data-private, verified SHA256, and verified extractor output contains the expected household/person tab files.

@MaxGhenis MaxGhenis commented

CI status: lint, changelog, and release-manifest checks pass. The Test job fails during make download because the private HF repo does not yet have lcfs_2023_24.zip (the later WAS and ETB zips would be missing too). This matches the draft blocker in the PR body.

@MaxGhenis MaxGhenis marked this pull request as ready for review May 24, 2026 11:49
@MaxGhenis MaxGhenis merged commit cf37a5b into main May 24, 2026
4 checks passed
@MaxGhenis MaxGhenis deleted the codex/update-lcfs-was-etb-2024 branch May 24, 2026 14:23