BUG/TST: fix and test for timezone drop in GroupBy.shift/bfill/ffill #27992
doc/source/whatsnew/v0.25.1.rst
@@ -120,6 +120,7 @@ Groupby/resample/rolling
- Bug in :meth:`pandas.core.groupby.DataFrameGroupBy.transform` where applying a timezone conversion lambda function would drop timezone information (:issue:`27496`)
- Bug in windowing over read-only arrays (:issue:`27766`)
- Fixed segfault in `pandas.core.groupby.DataFrameGroupBy.quantile` when an invalid quantile was passed (:issue:`27470`)
- Bug in :meth:`pandas.core.groupby.GroupBy.shift`, :meth:`pandas.core.groupby.GroupBy.bfill` and :meth:`pandas.core.groupby.GroupBy.ffill` where timezone information would be dropped (:issue:`xxxxx`)
If there is not an existing issue for this bug, use the number of this PR.
),
],
)
def test_shift_bfill_ffill_tz(data, expected_shift, expected_bfill, expected_ffill):
Can you put this test in `test_groupby.py`?
Also, you can parameterize over `shift`, `bfill`, and `ffill`.
@mroeschke Many thanks, -noritada
Very nice. One small comment otherwise LGTM.
pandas/tests/groupby/test_groupby.py
@@ -1882,3 +1882,69 @@ def test_groupby_axis_1(group_name):
    results = df.groupby(group_name, axis=1).sum()
    expected = df.T.groupby(group_name).sum().T
    assert_frame_equal(results, expected)


@pytest.mark.parametrize("tz", [None, "Asia/Tokyo"])
We have a `tz_naive_fixture` you can use in the function signature that will parameterize over more timezones.
def test_shift_bfill_ffill_tz(tz_naive_fixture, op, expected):
    tz = tz_naive_fixture
    ...
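A rough sketch of how the two review suggestions (one test parameterized over the operation, plus a timezone fixture) might fit together. The fixture `tz_fixture_sketch`, the helper `make_frame`, and the sample data are hypothetical stand-ins written for this illustration; the real `tz_naive_fixture` is defined in the pandas test suite and covers more timezones. The assertion assumes a pandas version that already contains this fix.

```python
import pandas as pd
import pytest


# Hypothetical stand-in for pandas' tz_naive_fixture (assumption: the real
# fixture lives in the pandas test suite and has more timezones than this).
@pytest.fixture(params=[None, "UTC", "Asia/Tokyo"])
def tz_fixture_sketch(request):
    return request.param


def make_frame(tz):
    """Build a small frame with a datetime column localized to ``tz``."""
    times = pd.DatetimeIndex(
        ["2019-01-01 12:00", None, "2019-01-01 14:00", None], tz=tz
    )
    return pd.DataFrame({"id": ["A", "A", "B", "B"], "time": times})


@pytest.mark.parametrize("op", ["shift", "bfill", "ffill"])
def test_op_keeps_tz(tz_fixture_sketch, op):
    tz = tz_fixture_sketch
    df = make_frame(tz)
    # Call shift(), bfill(), or ffill() on the grouped column.
    result = getattr(df.groupby("id")["time"], op)()
    # The timezone (or its absence) should survive the groupby operation.
    assert result.dtype == df["time"].dtype
```

Checking only the dtype keeps the sketch short; the test in the PR compares against explicit expected values for each operation.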
@mroeschke Many thanks, -noritada
@@ -2220,26 +2220,28 @@ def _get_cythonized_result(
        base_func = getattr(libgroupby, how)

        for name, obj in self._iterate_slices():
            values = obj._data._values
Hmm I'm not sure relying on underlying block values is the best way to go about this. Is it possible to just work with the obj here instead?
Sorry for my lack of understanding about underlying block values.
Should I leave the code using `obj.values` except for the last `result = algorithms.take_nd(obj.values, result)`?
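For illustration only, one way to stay on the object rather than on block internals is to apply the indexer through the public `Series.array` accessor. This is a sketch under assumptions, not the code that was merged: the indexer below is a hand-written stand-in for what a Cythonized shift would produce, and the PR itself goes through the internal `algorithms.take_nd`.

```python
import numpy as np
import pandas as pd

# Made-up tz-aware data standing in for one slice from _iterate_slices().
obj = pd.Series(pd.date_range("2019-01-01", periods=3, tz="Asia/Tokyo"))

# Hypothetical shift-by-one indexer: -1 marks a position with no source row.
indexer = np.array([-1, 0, 1])

# Taking on the extension array keeps the tz-aware dtype; with
# allow_fill=True every -1 position becomes NaT.
taken = obj.array.take(indexer, allow_fill=True)
result = pd.Series(taken, index=obj.index)

print(result.dtype)
```

Because the take happens on the tz-aware array itself, there is no step at which the values exist as naive NumPy datetimes.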
Can you update whatsnew for 0.25.2 as well?
1.0.0, unless this is a regression in 0.25.x.
looks good. can you do an issue search to see if we have anything about this (IIRC we do). if so, pls reference these issues.
doc/source/whatsnew/v0.25.1.rst
@@ -120,6 +120,7 @@ Groupby/resample/rolling
- Bug in :meth:`pandas.core.groupby.DataFrameGroupBy.transform` where applying a timezone conversion lambda function would drop timezone information (:issue:`27496`)
- Bug in windowing over read-only arrays (:issue:`27766`)
- Fixed segfault in `pandas.core.groupby.DataFrameGroupBy.quantile` when an invalid quantile was passed (:issue:`27470`)
- Bug in :meth:`pandas.core.groupby.GroupBy.shift`, :meth:`pandas.core.groupby.GroupBy.bfill` and :meth:`pandas.core.groupby.GroupBy.ffill` where timezone information would be dropped (:issue:`27992`)
can you move to 0.25.2
also pls merge master
@WillAyd @TomAugspurger @jreback Could you please check this again? Many thanks, -noritada
thanks @noritada
Owee, I'm MrMeeseeks, Look at me. There seem to be a conflict, please backport manually. Here are approximate instructions:
And apply the correct labels and milestones. Congratulation you did some good work ! Hopefully your backport PR will be tested by the continuous integration and merged soon! If these instruction are inaccurate, feel free to suggest an improvement.
ok sure np alright then prob just need to move the actual release note |
closes #19995
Timezone info is dropped in GroupBy.shift, bfill, and ffill
because the indexes calculated by the Cythonized functions are applied to
the NumPy representation of the values, which carries no timezone.
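The mechanism described above can be reproduced outside of groupby. The snippet below is a minimal sketch with made-up data and a hand-written indexer standing in for what a Cythonized forward fill might compute; it is not taken from the pandas source.

```python
import numpy as np
import pandas as pd

# Made-up tz-aware data with one missing value in the middle.
ser = pd.Series(
    pd.DatetimeIndex(
        ["2019-01-01 12:00", None, "2019-01-03 12:00"], tz="Asia/Tokyo"
    )
)

# Hypothetical forward-fill indexer: position 1 is filled from position 0.
indexer = np.array([0, 0, 2])

# .values is a plain NumPy datetime64 array of UTC instants, so anything
# built from it has already lost the timezone.
naive_values = ser.values
filled_naive = pd.Series(naive_values.take(indexer))

print(ser.dtype)           # tz-aware dtype
print(filled_naive.dtype)  # tz-naive dtype
```

The filled values still refer to the same instants, but the result has to be localized again before it matches the input column.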
Could you please find and review this fix?
Many thanks,
-noritada
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff