-
-
Notifications
You must be signed in to change notification settings - Fork 18.8k
Open
Labels
Description
Pandas version checks
- I have checked that this issue has not already been reported.
- Query:
is:issue normalize overflow
To pick up a draggable item, press the space bar. While dragging, use the arrow keys to move the item. Press space again to drop the item in its new position, or press escape to cancel.I have confirmed this bug exists on the latest version of pandas.I have confirmed this bug exists on the main branch of pandas.To pick up a draggable item, press the space bar. While dragging, use the arrow keys to move the item. Press space again to drop the item in its new position, or press escape to cancel.
Issue description
I was trying to get the minimum date, so I tried pd.Timestamp.min.normalize()
before realizing that would be out of range, but what's weird is that it overflows without warning:
>>> pd.Timestamp.min
Timestamp('1677-09-21 00:12:43.145224193')
>>> pd.Timestamp.min.normalize()
Timestamp('2262-04-11 23:34:33.709551616')
For reference:
>>> pd.Timestamp.max
Timestamp('2262-04-11 23:47:16.854775807')
Compare that to .floor('D')
, which I thought would be equivalent, but it raises:
>>> pd.Timestamp.min.floor('D')
OverflowError: value too large
The above exception was the direct cause of the following exception:
OutOfBoundsDatetime: Cannot round 1677-09-21 00:12:43.145224193 to freq=<Day> without overflow
as well as .round('D')
(same error).
Sidenote: What I really wanted was .ceil('D')
, which gets the first whole day:
>>> pd.Timestamp.min.ceil('D')
Timestamp('1677-09-22 00:00:00')
Installed Versions
INSTALLED VERSIONS
------------------
commit : d41884b2dd0823dc6288ab65d06650302e903c6b
python : 3.12.8
python-bits : 64
OS : Linux
OS-release : 5.4.0-202-generic
Version : #222-Ubuntu SMP Fri Nov 8 14:45:04 UTC 2024
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : fr_CA.UTF-8
LOCALE : fr_CA.UTF-8
pandas : 3.0.0.dev0+1777.gd41884b2dd
numpy : 2.1.0.dev0+git20240403.e59c074
dateutil : 2.9.0.post0
pip : 24.3.1
Cython : None
sphinx : None
IPython : 8.22.2
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : None
blosc : None
bottleneck : None
fastparquet : None
fsspec : None
html5lib : None
hypothesis : None
gcsfs : None
jinja2 : None
lxml.etree : None
matplotlib : None
numba : None
numexpr : None
odfpy : None
openpyxl : None
psycopg2 : None
pymysql : None
pyarrow : None
pyreadstat : None
pytest : None
python-calamine : None
pytz : 2024.1
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlsxwriter : None
zstandard : None
tzdata : 2024.1
qtpy : None
pyqt5 : None
Activity
[-]Is this a bug? `Timestamp.normalize()`[/-][+]Is this a bug? `Timestamp.normalize()` can overflow silently (vs `.floor()` which raises)[/+]shashankrushiya commentedon Dec 19, 2024
I don't think the above mention is a bug. The reason is because-
Here, when we use
normalize()
method, it sets the time part of the timestamp to00:00:00
. However,pd.Timestamp.min
is already extremely close to the lower boundary of supported date times. And talking aboutfloor()
andround()
methods, both are more strict in their handling. They basically check whether the operation would cause an overflow and raise anOverFlowError
if it would.If you consider
ceil('D')
, it will work as this method rounds up to the next day, which is still within the valid range forpd.Timestamp
.wjandrea commentedon Dec 19, 2024
@shashankrushiya I'm not following. How does that mean it's not a bug?
So then why doesn't
.normalize()
check? Is there a good reason or is it just an oversight? (i.e. a bug, or maybe not a bug per se but a potential enhancement)BTW, note that
pd.Timestamp.min
is the lower boundary, not "extremely close" to the boundary.[-]Is this a bug? `Timestamp.normalize()` can overflow silently (vs `.floor()` which raises)[/-][+]BUG: `Timestamp.normalize()` can overflow silently[/+]rhshadrach commentedon Dec 20, 2024
Thanks for the report! Agreed normalize should raise an OverflowError here. PRs to fix are welcome!
I view this as a violation of pandas' Code of conduct and ask you to refrain from making such statements in the future. We'd like to make pandas a welcome place for all contributors, and this type of comment works against this goal.
wjandrea commentedon Dec 20, 2024
@rhshadrach My bad. I've edited my above comment to be respectful and more constructive. Sorry @shashankrushiya!
anishkarki commentedon Dec 22, 2024
take
Anurag-Varma commentedon Feb 21, 2025
take
normalize()
by raising exception #61083