@alexrichey alexrichey commented Nov 13, 2025

This modifies a few things:

  • Adds fuzzy version comparisons in the BYTES <> Socrata script so that we can compare version formats like "2025-01-01" with "2025Q1", or "Oct 2025" with "2025-10-01", to see what's potentially out of date. E.g.
    [image: example comparison output]
  • Modifies the output dataframe to make results more actionable. Namely, it sorts:
    1. Products with ANY out of date datasets, where there exists a Socrata version
    2. Products with ANY out of date datasets, but no Socrata version (these may be out of date... or they may just be TODO on the Socrata side, and so have no version)
    3. The rest
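The three-tier ordering above can be sketched without a dataframe. This is a minimal illustration, not the script's code: `DatasetRow`, `product_tier`, and `sort_products` are hypothetical names, and it assumes that a dataset with no Socrata version is still flagged as out of date and that tier 1 requires at least one out-of-date dataset that does have a Socrata version.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class DatasetRow:
    """One comparison result (hypothetical shape, not the script's schema)."""
    product: str
    dataset: str
    socrata_version: Optional[str]  # None when Socrata has no version yet
    out_of_date: bool


def product_tier(rows: List[DatasetRow]) -> int:
    """Return 0, 1, or 2; lower tiers sort first."""
    stale = [r for r in rows if r.out_of_date]
    if not stale:
        return 2  # "the rest"
    if any(r.socrata_version for r in stale):
        return 0  # out of date, and Socrata has a version to compare against
    return 1  # out of date, but possibly just TODO on the Socrata side


def sort_products(rows: List[DatasetRow]) -> List[str]:
    """Group rows by product, then order products by tier and name."""
    by_product: Dict[str, List[DatasetRow]] = {}
    for row in rows:
        by_product.setdefault(row.product, []).append(row)
    return sorted(by_product, key=lambda p: (product_tier(by_product[p]), p))
```

Sorting on a `(tier, name)` tuple keeps the ordering stable within each tier, which makes the output easier to scan from one run to the next.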

This PR is mostly slopped. I took a shot at integrating this with our existing dcpy.utils.versions code, but came away thinking that will be a bigger endeavor (and probably a refactor of that module).

codecov bot commented Nov 13, 2025

Codecov Report

❌ Patch coverage is 53.03030% with 31 lines in your changes missing coverage. Please review.
✅ Project coverage is 75.65%. Comparing base (751229e) to head (65884dc).
⚠️ Report is 14 commits behind head on main.

Files with missing lines | Patch % | Lines
dcpy/lifecycle/scripts/version_compare.py | 53.03% | 29 Missing and 2 partials ⚠️

Additional details and impacted files:

Files with missing lines | Coverage Δ
dcpy/lifecycle/scripts/version_compare.py | 41.74% <53.03%> (+41.74%) ⬆️

... and 4 files with indirect coverage changes

@alexrichey alexrichey force-pushed the ar-fuzzy-versions branch 3 times, most recently from 5e602fc to 86a3045 Compare November 24, 2025 18:59
@alexrichey alexrichey requested review from damonmcc and fvankrieken and removed request for damonmcc and fvankrieken November 24, 2025 20:47
@alexrichey alexrichey marked this pull request as ready for review November 24, 2025 20:52
Comment on lines 21 to 29
if not self.original or not other.original:
    return False

# Direct string comparison (handles case differences)
if self.original.lower().strip() == other.original.lower().strip():
    return True

# Compare normalized versions
return self.normalized == other.normalized
Contributor

Just thinking about trying to trim down a little...

Could we declare original and normalized as typed class attributes? I don't see how this first if clause would ever be met.

Then, since we're not returning "exact match" or the like, just true either way, should we just compare the normalized versions? We've already computed them and they should be identical if the originals were identical. Meaning we could skip the second if clause as well.

Contributor Author

Yeah, I'm with you. Made it terse. For the first case (if you're referring to if not self.original or not other.original), it'd be when it resolves to None == None, which we want to be False.
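For readers following along, here is one terse form that matches this exchange. It is a sketch, not the merged code: `FuzzyVersion` and `probably_equal` are hypothetical names, and the lowercase-and-strip normalization is a stand-in for whatever the real script computes.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class FuzzyVersion:
    """Hypothetical wrapper; the real class in version_compare.py may differ."""
    original: Optional[str]
    normalized: Optional[str] = field(init=False)

    def __post_init__(self) -> None:
        # Stand-in normalization (assumption): lowercase and trim whitespace.
        self.normalized = self.original.lower().strip() if self.original else None

    def probably_equal(self, other: "FuzzyVersion") -> bool:
        # Two missing versions must not count as a match (None == None -> False).
        if not self.original or not other.original:
            return False
        # Identical originals always normalize identically, so a separate
        # direct string comparison adds nothing.
        return self.normalized == other.normalized
```

Keeping the guard is what stops two products that both lack a version from being reported as in sync.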

Contributor

I got bad news bucko
[image]

Contributor

Wow my reading comprehension is terrible today sorry

version = self.original.lower().strip()

# Handle quarter notation (e.g., "25q1", "24q2")
quarter_match = re.match(r"^(\d{2})q([1-4])$", version)
Contributor

can this be q|Q? I feel like we see both

Contributor

Sorry! We've already lowercased at this point

Contributor Author

indeed - though latest commit adds an explicit test case for that (there was something close before, but not quite) and changes the comment
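To make the point of this thread concrete: because the string is lowercased before matching, the quoted pattern accepts both "25Q1" and "25q1" without needing `q|Q`. A minimal sketch follows. `parse_quarter` is a hypothetical helper, and mapping a two-digit year onto 20xx is an assumption, not something the thread confirms.

```python
import re
from typing import Optional, Tuple

# Pattern quoted in the review thread: two-digit year, "q", quarter 1-4.
QUARTER_RE = re.compile(r"^(\d{2})q([1-4])$")


def parse_quarter(raw: str) -> Optional[Tuple[int, int]]:
    """Return (year, quarter) for inputs like '25q1', else None."""
    version = raw.lower().strip()  # lowercasing first makes 'Q' and 'q' equivalent
    match = QUARTER_RE.match(version)
    if match is None:
        return None
    year = 2000 + int(match.group(1))  # assumption: two-digit years mean 20xx
    return year, int(match.group(2))
```

Four-digit forms such as "2025Q1" do not match this particular pattern; how the script handles them is not shown in the thread.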

@alexrichey alexrichey merged commit c4e9c7c into main Nov 24, 2025
25 checks passed
@alexrichey alexrichey deleted the ar-fuzzy-versions branch November 24, 2025 21:42
("September 2025", "202509", True),
("JUNE 2024", "24q2", True),
("march 2025", "20250315", True),
("Q1 2025", "january 2025", False), # Different months in Q1
Member

@damonmcc damonmcc Nov 24, 2025

sorry why aren't these probably equal?

edit: oh direction matters! if something is Q1 it could be any of 3 different months so we'd rather not say these're equalish
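The asymmetry described here is plain calendar arithmetic: a month belongs to exactly one quarter, but a quarter spans three months. The sketch below illustrates only that reasoning. It is not the comparison logic from version_compare.py, and both helper names are hypothetical.

```python
from typing import List


def quarter_of(month: int) -> int:
    """Map a month (1-12) to its single quarter (1-4)."""
    return (month - 1) // 3 + 1


def months_in(quarter: int) -> List[int]:
    """List the three months (1-12) that a quarter (1-4) covers."""
    start = 3 * (quarter - 1) + 1
    return [start, start + 1, start + 2]
```

So `quarter_of(6)` gives one answer, while `months_in(1)` returns three candidates, which is why a bare "Q1" cannot be pinned to January.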

4 participants