Explicitly use default datasets for US state-level simulations #2953
- Update _setup_data to pass through hf:// and gs:// URLs directly
- Allows frontend to specify state-specific datasets via full URL
- Maintains backward compatibility with enhanced_cps keyword
- Fallback to pooled CPS for states when no dataset specified
- Add tests for HF URL passthrough behavior
This change works with policyengine-app-v2 to enable state-specific
datasets at hf://policyengine/policyengine-us-data/states/{STATE}.h5
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
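A minimal sketch of the URL passthrough idea described in the commit message above. The helper names (`is_dataset_url`, `state_dataset_url`) are hypothetical and not taken from the repository; the upper-casing of the state code is an assumption based on the `{STATE}` placeholder in the path.

```python
from typing import Optional

# URL schemes the commit message says should be passed through unchanged.
URL_PREFIXES = ("hf://", "gs://")

# Path template quoted in the commit message; {STATE} is its placeholder.
STATE_DATASET_TEMPLATE = (
    "hf://policyengine/policyengine-us-data/states/{STATE}.h5"
)


def is_dataset_url(dataset: Optional[str]) -> bool:
    """Return True if `dataset` is a full hf:// or gs:// URL."""
    return isinstance(dataset, str) and dataset.startswith(URL_PREFIXES)


def state_dataset_url(state_code: str) -> str:
    """Build a state dataset URL (hypothetical helper).

    Assumption: state codes are upper-cased two-letter abbreviations.
    """
    return STATE_DATASET_TEMPLATE.format(STATE=state_code.upper())


if __name__ == "__main__":
    url = state_dataset_url("ca")
    print(url, is_dataset_url(url))
```

Keeping the scheme check separate from URL construction means the API side only needs the prefix test, while a caller such as the frontend decides which URL to send.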
Codecov Report
✅ All modified and coverable lines are covered by tests.
Coverage diff (master vs. #2953), all values unchanged:
- Coverage: 74.87%
- Files: 53
- Lines: 1966
- Branches: 254
- Hits: 1472
- Misses: 444
- Partials: 50
@DTrim99 We should chat about this today
As of PolicyEngine/policyengine.py#191, we now use Ben's state datasets by default in the sim API. What I'm going to do is hijack this PR just to have it select the default dataset for US states, which will be Ben's datasets. I don't think we should pass through HF or GCS links here, and in fact, we shouldn't be using HF at all due to its rate-limiting policy.
@PolicyEngine summarize what this PR does now, following a series of changes |
Summary
Following the changes made in this PR, here's what it now does:
Core Change: This PR modifies the dataset selection logic for US state-level simulations to return
Specific Changes:
Files Modified:
The PR effectively simplifies the dataset selection at the API level by delegating to policyengine.py for default dataset management, except for the NYC special case.
Note: this summary is now deprecated
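Summary
- Update `_setup_data` to pass through `hf://` and `gs://` URLs directly
- Maintain backward compatibility with the `enhanced_cps` keyword
Changes
The `_setup_data` method now checks whether the dataset parameter starts with `hf://` or `gs://` and passes it through unchanged. This enables the frontend to specify state-specific datasets like `hf://policyengine/policyengine-us-data/states/{STATE}.h5`.
Dataset Selection Logic
| Input | Output |
| --- | --- |
| `hf://...` URL | Passed through unchanged |
| `gs://...` URL | Passed through unchanged |
| `enhanced_cps` | `gs://policyengine-us-data/enhanced_cps_2024.h5` |
| `None` + US state | `gs://policyengine-us-data/pooled_3_year_cps_2023.h5` (fallback) |
| `None` + US/UK | `None` |
Test plan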
Related PRs
🤖 Generated with Claude Code
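The dataset selection logic in the deprecated summary above can be sketched as a single function. This is an illustration under stated assumptions, not the repository's `_setup_data`: the function name `select_dataset`, the `country_id` and `region` parameters, the convention that a national run uses `region` of `None` or the country code, and the `None` result for unrecognized keywords are all hypothetical. The NYC special case mentioned earlier in the thread is not modeled.

```python
from typing import Optional

ENHANCED_CPS_URL = "gs://policyengine-us-data/enhanced_cps_2024.h5"
POOLED_CPS_URL = "gs://policyengine-us-data/pooled_3_year_cps_2023.h5"


def select_dataset(
    dataset: Optional[str],
    country_id: str,
    region: Optional[str] = None,
) -> Optional[str]:
    """Mirror the rows of the dataset selection logic, top to bottom."""
    # Rows 1-2: full hf:// or gs:// URLs are passed through unchanged.
    if dataset is not None and dataset.startswith(("hf://", "gs://")):
        return dataset
    # Row 3: the enhanced_cps keyword maps to a fixed GCS path.
    if dataset == "enhanced_cps":
        return ENHANCED_CPS_URL
    # Row 4: no dataset given for a US state falls back to the pooled CPS.
    # Assumption: a state run has a region other than None or "us".
    is_us_state = country_id == "us" and region not in (None, "us")
    if dataset is None and is_us_state:
        return POOLED_CPS_URL
    # Row 5: national US/UK runs with no dataset return None.
    # Assumption: unrecognized keywords also return None.
    return None


if __name__ == "__main__":
    print(select_dataset(None, "us", "ca"))
    print(select_dataset("enhanced_cps", "us"))
```

Ordering the checks to match the table keeps explicit URLs ahead of keyword handling, so a caller-supplied URL is never overridden by a fallback.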