Cannot insert large data sets in Oracle table using executemany() #548
Can you update your question with some details? Start by answering the GitHub issue template questions. What's the error (don't make us guess)? What's the table description? Give some sample data. Review the cx_Oracle documentation on Batch Statement Execution and Bulk Loading. The final example (about CSV files) shows one way of looping over batches of data.
Hi cjbj! I have updated the question. Unfortunately I cannot provide any sample data due to privacy concerns, but my dataset consists of several columns with hundreds of thousands of rows (some even in the millions). The exact error is this: DPI-1015: array size of is too large. The data is loaded as a DataFrame using pandas, then it should be loaded into an existing table. My target method is executemany() to significantly decrease insert time, but executemany() just won't let me. Using execute() works perfectly fine; it just takes too much time.
Could someone help with this? I am facing the same issue. In my current code we are using the pandas, numpy and cx_Oracle libraries to insert and update data in an Oracle table. We process about 30,000 to 40,000 records each day. I can't share my source code, unfortunately.
The error you mentioned (DPI-1015: array size of is too large) indicates that you need to reduce the number of rows you are binding at one time. So instead of trying to send all hundreds of thousands (or millions) of rows at the same time, break them up into batches. In other words, something like this:
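A minimal sketch of that batching idea (the table name, column names, connection details and the `data` list are placeholders for illustration, not taken from this issue):

```python
import cx_Oracle

connection = cx_Oracle.connect("scott", "tiger", "localhost/orclpdb")  # placeholder credentials
cursor = connection.cursor()

# placeholder INSERT statement; `data` is assumed to be a list of bind tuples
sql = "insert into my_table (col1, col2) values (:1, :2)"
batch_size = 10000  # tune for your row size and memory

# bind the rows in fixed-size batches instead of all at once,
# which keeps each executemany() call under the array size limit
for start in range(0, len(data), batch_size):
    cursor.executemany(sql, data[start:start + batch_size])

connection.commit()
```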
Choose a batch size that works well for you. Decide whether you wish to have batches committed as they reach the database successfully, or whether you wish to wait until the entire set of data has reached the database successfully. This sort of behavior has been requested within cx_Oracle itself as #200, which may be of interest.
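If you do want each batch committed as it arrives, a variant of the sketch above (same assumed `sql`, `data` and `batch_size`) simply moves the commit into the loop:

```python
# commit per batch: progress is preserved if a later batch fails,
# at the cost of partially loaded data being visible in the meantime
for start in range(0, len(data), batch_size):
    cursor.executemany(sql, data[start:start + batch_size])
    connection.commit()
```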
Thank you @anthony-tuininga for the wonderful code you shared. I can now insert millions of rows using executemany(). Cheers!
@megurikun08 thanks for the feedback. If you haven't already optimized the insert, don't forget to look at whether using things like an APPEND hint will help, and check the constraint and index usage. It would be interesting to know whether writing to a disk file and then using SQL*Loader with a Direct Path load also gives a performance boost - it would be data- and network-dependent. These are all out of scope for the cx_Oracle repo, but there are plenty of resources online that can help.
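As a rough illustration of the hint idea: for an array-bound INSERT ... VALUES statement the direct-path variant is the APPEND_VALUES hint (table and column names below are made up, and whether it helps depends on your data):

```python
# Illustration only: direct-path style array insert via the APPEND_VALUES hint.
# Note it takes an exclusive lock on the table, so benchmark before adopting it.
cursor.executemany(
    "insert /*+ APPEND_VALUES */ into my_table (col1, col2) values (:1, :2)",
    rows,
)
connection.commit()
```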
Can you please share example code for this?
@vsanjairaj see the comments above for the code that was shared. Also read the doc Batch Statement Execution and Bulk Loading.
Question: Is there a pre-defined limit on the array size? Is it the record count or the actual size?
The limit is on actual size: the maximum size in bytes of one row of data multiplied by the number of rows must be less than 2 ** 31 -- for each bind variable. |
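As a back-of-the-envelope example (the 4000-byte row size is assumed, not from this issue): a single bind variable that needs at most 4000 bytes per row could therefore cover roughly half a million rows in one executemany() call:

```python
# rough estimate only: per-bind limit of 2 ** 31 bytes, assuming 4000 bytes per row
max_rows = 2 ** 31 // 4000
print(max_rows)   # 536870 -> roughly half a million rows per call for that bind
```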
Hi All,
This problem has been haunting me for quite some time, and I need to resolve it ASAP. When I try to batch insert data from a DataFrame, I get an error that the array size of ... is too large. This is my current line for the executemany:
I have tried some of the resolutions found on the internet but nothing works. I am quite aware that I can split the DataFrame into chunks, but I cannot find any proper code that works. From what I have read in the cx_Oracle documentation, the data size limit for executemany() is 2 GB. Here are my questions:
--EDIT--
Let me rephrase the question:
How should I chunk my large datasets so I can insert them into my Oracle database table using executemany()?
Win10 - Python v3.7.9 - cx_Oracle v8.1 - Oracle Client v19.10
--END--
Thanks everyone!
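For the DataFrame-specific case asked about above, a minimal sketch of the same chunking approach (the DataFrame `df`, table name, column names and connection details are assumptions for illustration):

```python
import cx_Oracle
import pandas as pd

connection = cx_Oracle.connect("scott", "tiger", "localhost/orclpdb")  # placeholder credentials
cursor = connection.cursor()

# placeholder statement; `df` is assumed to be a two-column pandas DataFrame
sql = "insert into target_table (col1, col2) values (:1, :2)"
batch_size = 50000

# slice the DataFrame into chunks and bind each chunk as a list of plain tuples
for start in range(0, len(df), batch_size):
    chunk = df.iloc[start:start + batch_size]
    cursor.executemany(sql, list(chunk.itertuples(index=False, name=None)))

connection.commit()
```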