Skip to content

Error fetching data to an OracleDataFrame with dates prior to 1970-01-01 #483

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
G-Berg opened this issue Apr 14, 2025 · 12 comments
Open
Labels
bug Something isn't working

Comments

@G-Berg
Copy link

G-Berg commented Apr 14, 2025

I am using oracledb 3.1 to extract data, and an error occurs when there is any date less than or equal to 1970-01-01 in any date column.
Any date after 1970-01-01 works fine.

The code that demonstrates the problem is as follows:

sql = """
SELECT TO_DATE('1970-01-01', 'yyyy-mm-dd') AS DATA_REF FROM DUAL
"""

odf = conn.fetch_df_all(statement=sql, arraysize=100)

print(odf.column_names())
print(f"{odf.num_columns()} columns")
print(f"{odf.num_rows()} rows")

Error message:

--------------------------------------------------------------------------

OSError                                   Traceback (most recent call last)
Cell In[9], line 5
      1 sql = """
      2 SELECT TO_DATE('1970-01-01', 'yyyy-mm-dd') AS DATA_REF FROM DUAL
      3 """
----> 5 odf = conn.fetch_df_all(statement=sql, arraysize=100)
      7 print(odf.column_names())
      8 print(f"{odf.num_columns()} columns")

File [~\AppData\Roaming\Python\Python311\site-packages\oracledb\connection.py:810](http://localhost:8888/~/AppData/Roaming/Python/Python311/site-packages/oracledb/connection.py#line=809), in Connection.fetch_df_all(self, statement, parameters, arraysize)
    808     cursor.arraysize = arraysize
    809 cursor.prefetchrows = cursor.arraysize
--> 810 cursor.execute(statement, parameters)
    811 return cursor._impl.fetch_df_all(cursor)

File [~\AppData\Roaming\Python\Python311\site-packages\oracledb\cursor.py:708](http://localhost:8888/~/AppData/Roaming/Python/Python311/site-packages/oracledb/cursor.py#line=707), in Cursor.execute(self, statement, parameters, **keyword_parameters)
    706 self._prepare_for_execute(statement, parameters, keyword_parameters)
    707 impl = self._impl
--> 708 impl.execute(self)
    709 if impl.fetch_vars is not None:
    710     return self

File src\\oracledb\\impl/thin/cursor.pyx:277, in oracledb.thin_impl.ThinCursorImpl.execute()

File src\\oracledb\\impl/thin/protocol.pyx:450, in oracledb.thin_impl.Protocol._process_single_message()

File src\\oracledb\\impl/thin/protocol.pyx:451, in oracledb.thin_impl.Protocol._process_single_message()

File src\\oracledb\\impl/thin/protocol.pyx:419, in oracledb.thin_impl.Protocol._process_message()

File src\\oracledb\\impl/thin/protocol.pyx:395, in oracledb.thin_impl.Protocol._process_message()

File src\\oracledb\\impl/thin/messages/execute.pyx:262, in oracledb.thin_impl.ExecuteMessage.process()

File src\\oracledb\\impl/thin/messages/base.pyx:687, in oracledb.thin_impl.Message.process()

File src\\oracledb\\impl/thin/messages/base.pyx:1089, in oracledb.thin_impl.MessageWithData._process_message()

File src\\oracledb\\impl/thin/messages/base.pyx:1170, in oracledb.thin_impl.MessageWithData._process_row_data()

File src\\oracledb\\impl/thin/messages/base.pyx:931, in oracledb.thin_impl.MessageWithData._process_column_data()

File src\\oracledb\\impl/base/converters.pyx:246, in oracledb.base_impl.convert_oracle_data_to_arrow()

OSError: [Errno 22] Invalid argument
@G-Berg G-Berg added the bug Something isn't working label Apr 14, 2025
@cjbj
Copy link
Member

cjbj commented Apr 15, 2025

Interestingly it doesn't reproduce for me:

$ python t.py 
['DATA_REF']
1 columns
1 rows

This is on macOS with Python 3.12.7, python-oracledb 3.1 and 3.0, and Oracle Database 23.x. I also tried even older dates without a problem.

Can you share more detail about your environment?

@usmanjohn
Copy link

usmanjohn commented Apr 15, 2025

The same error is happening for me.
connection = oracledb.connect(...)
df = connection.fetch_df_all(query)
OsError: [Error 22] Invalid Argument

oracledb version: 3.1.0
Mode: Thin Mode
python version: 3.12.2
OS: WIndows

@cjbj
Copy link
Member

cjbj commented Apr 15, 2025

@usmanjohn thanks for sharing. What's your (exact) DB version and flavor (local, cloud, exadata etc etc)?

@usmanjohn
Copy link

Hi @cjbj , The DB version is 19.12.0.0.0 and flavor is local.

@usmanjohn
Copy link

@cjbj I realised that the question name has changed. My problem no longer belongs here, there is nothing to do with the date on my code. The problem is with fetch_df_all I think.

@cjbj
Copy link
Member

cjbj commented Apr 15, 2025

@usmanjohn if you have a testcase for your scenario, we'd be happy to look at it too.

@G-Berg
Copy link
Author

G-Berg commented Apr 15, 2025

I am running on this environment:

Oracle: 19.26.0.0.0
Mode: Thin Mode
python version: 3.11.2
OS: Windows 11

@cjbj
Copy link
Member

cjbj commented Apr 15, 2025

Looks related to this Python Windows issue: https://stackoverflow.com/questions/59199985/why-is-datetimes-timestamp-method-returning-oserror-errno-22-invalid-a. From https://docs.python.org/3/library/datetime.html#datetime.date.fromtimestamp:

This may raise OverflowError, if the timestamp is out of the range of values supported by the platform C localtime() function, and OSError on localtime() failure. It’s common for this to be restricted to years from 1970 through 2038.

@G-Berg
Copy link
Author

G-Berg commented Apr 16, 2025

All other methods work well, except for fetch_df_all and fetch_df_batches.
It must be because of the conversion from DB_TYPE_DATE to TIMESTAMP and not to some date format.

sql = """
SELECT TO_DATE('1970-01-01', 'yyyy-mm-dd') AS DATA_REF FROM DUAL
"""
cursor = conn.cursor()
for row in cursor.execute(sql):
    print(row)
 
cursor.execute(sql)
while True:
    row = cursor.fetchone()
    if row is None:
        break
    print(row)
 
cursor.execute(sql)
num_rows = 10
while True:
    rows = cursor.fetchmany(size=num_rows)
    if not rows:
        break
    for row in rows:
        print(row)
 
cursor.execute(sql)
rows = cursor.fetchall()
for row in rows:
    print(row)

@anthony-tuininga
Copy link
Member

Not that conversion, try this:

import datetime
d = datetime.datetime(1970, 1, 1)
print("to epoch:", d.timestamp())

That should generate the error you are experiencing, as that is what the conversion to dataframe is doing since it is required for the Arrow data format. This issue appears only on Windows. There are workarounds that are possible which can be investigated. For now your current "solutions" are: (a) stop using Windows!, (b) avoid the use of dates prior to 1970 or (c) convert them to string.

If you are able to build yourself, you can validate one of the workarounds (untested):

diff --git a/src/oracledb/impl/base/converters.pyx b/src/oracledb/impl/base/converters.pyx
index 3614a2b1..8460311b 100644
--- a/src/oracledb/impl/base/converters.pyx
+++ b/src/oracledb/impl/base/converters.pyx
@@ -243,8 +243,12 @@ cdef int convert_oracle_data_to_arrow(OracleMetadata from_metadata,
         rb = &data.buffer.as_raw_bytes
         arrow_array.append_bytes(<void*> rb.ptr, rb.num_bytes)
     elif arrow_type == NANOARROW_TYPE_TIMESTAMP:
-        ts = int(convert_date_to_python(&data.buffer).timestamp() *
-                 arrow_array.factor)
+        # for Windows only
+        dt = convert_date_to_python(&data.buffer)
+        epoch_date = datetime.datetime(1970, 1, 1)
+        ts = (dt - epoch_date).total_seconds() * arrow_array.factor
+#        ts = int(convert_date_to_python(&data.buffer).timestamp() *
+#                 arrow_array.factor)
         arrow_array.append_int64(ts)
     elif arrow_type == NANOARROW_TYPE_DECIMAL128:
         convert_number_to_arrow_decimal(arrow_array, &data.buffer)

@G-Berg
Copy link
Author

G-Berg commented Apr 16, 2025

Same error with that code.

For the proposed solutions:
(a) Its not an option for our organization, hehe.
(b) we have dates prior 1970 in our database.
For now, the best solution is (c) convert date to string.

I am not able to build the others workarounds.

Thank you for your help.

@cjbj
Copy link
Member

cjbj commented Apr 24, 2025

@suraj-ora-2020 will work on a fix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

4 participants