Add format_timestamps preprocessor #89

Merged on Mar 10, 2025 (2 commits)

dougharris
Contributor

Description

Added format_timestamps() to preprocessors.

This allows per-column formatting of date, time, or datetime-like data.

Add a column_date_formats section to your config file with one line for each column whose format you'd like to specify, written as name=format. Use standard Python strftime format strings.
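
As a rough illustration of the idea (not the code in this PR), the sketch below reads a column_date_formats section and applies each strftime string to the matching column. The column names, the helper names load_column_formats and format_columns_sketch, and the use of the standard-library configparser are all assumptions made for the example; the actual preprocessor and config loader may differ.

```python
import configparser
from datetime import date, datetime, time

# Hypothetical config fragment; the column names are invented for illustration.
SAMPLE_CONFIG = """
[column_date_formats]
import_date = %Y-%m-%d
job_started = %Y-%m-%d %H:%M:%S
"""


def load_column_formats(text):
    """Read the name=format pairs from a column_date_formats section."""
    # interpolation=None stops configparser from treating '%' as special syntax.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return dict(parser["column_date_formats"])


def format_columns_sketch(rows, headers, column_formats):
    """Yield rows with date-like values in configured columns reformatted."""
    for row in rows:
        yield [
            value.strftime(column_formats[name])
            if name in column_formats and isinstance(value, (date, time, datetime))
            else value  # unconfigured columns and NULLs pass through unchanged
            for name, value in zip(headers, row)
        ]


headers = ["file", "import_date", "job_started"]
rows = [
    ["a.csv", datetime(2023, 1, 18, 15, 56, 58), datetime(2023, 1, 18, 15, 56, 58)],
    ["b.csv", None, None],
]
formats = load_column_formats(SAMPLE_CONFIG)
formatted = list(format_columns_sketch(rows, headers, formats))
```

With this sample data, the same timestamp is shown to the day in import_date and to the second in job_started, while the None values in the second row are left alone.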

This works with a related pgcli PR to address my enhancement request, dbcli/pgcli#1402.

Checklist

  • I've added this contribution to the CHANGELOG.
  • I've added my name to the AUTHORS file (or it's already there).
  • I installed pre-commit hooks (pip install pre-commit && pre-commit install), and ran black on my code.
  • Please squash merge this pull request (uncheck if you'd like us to merge as multiple commits)

@j-bennet
Contributor

j-bennet commented Mar 5, 2025

@dougharris This looks fine. But why a column-specific format, rather than one format for all datetime columns? What's the use case for formatting some datetime columns differently from others?

@dougharris
Contributor Author

@j-bennet different columns have different uses. For example, we have a table that contains metadata about files we import for clients. Since clients update the files irregularly, when viewing the import_date column I only need to know details to the day level. If I'm looking into a log table where job runs are logged and the job runs many times an hour, I would like to know details down to the second (or perhaps even smaller).

I discuss this a bit more in my comment on the issue.

@dougharris
Contributor Author

I force-pushed an update which better handles timestamp columns with null values.

@j-bennet
Contributor

@dougharris I think it would be great if the column-level format were available in addition to a global timestamp format. But we can start with the column-level format and go from there.

Contributor

@j-bennet j-bennet left a comment

LGTM. Merging!

@j-bennet j-bennet merged commit dd9e547 into dbcli:main Mar 10, 2025
@dougharris
Contributor Author

As I think about it more, I like the idea of a global default for timestamps. What I currently see is 2023-01-18 15:56:58.586675+00. In my work, I never need microsecond accuracy and rarely even accuracy to the second. A sensible default for my needs would reduce this to 2023-01-18 15:56+00 (and even the timezone might not be needed).
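
For reference, a minimal sketch of what such a minute-level format looks like with Python's strftime, using the timestamp quoted above. It only shows the format strings; it does not describe any existing global option. One caveat: Python's %z renders a UTC offset as +0000, so it will not reproduce the shorter +00 shown above.

```python
from datetime import datetime, timezone

# The timestamp quoted above, built as a timezone-aware datetime in UTC.
ts = datetime(2023, 1, 18, 15, 56, 58, 586675, tzinfo=timezone.utc)

full = str(ts)                                   # '2023-01-18 15:56:58.586675+00:00'
to_minute = ts.strftime("%Y-%m-%d %H:%M")        # '2023-01-18 15:56'
to_minute_tz = ts.strftime("%Y-%m-%d %H:%M%z")   # '2023-01-18 15:56+0000'
```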

@j-bennet
Contributor

> I never need microsecond accuracy and rarely even accuracy to the second. A sensible default for my needs would reduce this to 2023-01-18 15:56+00

I very much agree!
