I have ~100,000 to 1,000,000 rows to insert into an Oracle 18c database. I'm quite new to Oracle and to this order of magnitude of data. I reckon there must be an optimal way to do it, but for now I've only managed to implement a line-by-line insertion:
def insertLines(connection, tableName, column_names, rows):
    cursor = connection.cursor()
    if tableExists(connection, tableName):
        for row in rows:
            sql = 'INSERT INTO {} ({}) VALUES ({})'.format(tableName, column_names, row)
            cursor.execute(sql)
    cursor.close()
Is there a clean way in cx_Oracle (the Python Oracle library) to batch the rows and achieve better performance?
EDIT: I read the data from a CSV file.
Solution
If your data is already in Python, then use executemany(). With this many rows, you would probably still make multiple executemany() calls, inserting a batch of records each time. See https://blogs.oracle.com/opal/efficient-and-scalable-batch-statement-execution-in-python-cx_oracle
data = [
    (60, "Parent 60"),
    (70, "Parent 70"),
    (80, "Parent 80"),
    (90, "Parent 90"),
    (100, "Parent 100")
]
cursor.executemany("""
    insert into ParentTable (ParentId, Description)
    values (:1, :2)""", data)
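Since your edit says the data comes from a CSV file, one option is to read the file in chunks and call executemany() once per chunk. A minimal sketch follows; the file name data.csv, the batch size, and the connection details are assumptions to adjust for your environment:

import csv
import cx_Oracle

BATCH_SIZE = 10000  # tune this: larger batches mean fewer round trips but more memory

connection = cx_Oracle.connect("user", "password", "host/service")  # assumed credentials
cursor = connection.cursor()
sql = "insert into ParentTable (ParentId, Description) values (:1, :2)"

with open("data.csv", newline="") as f:
    batch = []
    for line in csv.reader(f):
        batch.append((line[0], line[1]))
        if len(batch) >= BATCH_SIZE:
            cursor.executemany(sql, batch)
            batch = []
    if batch:  # flush the final partial batch
        cursor.executemany(sql, batch)

connection.commit()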
As pointed out by others:
Avoid building statements with string interpolation, because it is a security risk (SQL injection).
It is also generally a scalability problem. Use bind variables. Where you do need string interpolation for things like column names, make sure you sanitize the values; see the sketch after this list.
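For illustration, here is one way to combine bind variables for the values with validated table and column names, adapting the insertLines() function from the question. The regex-based identifier check is an assumption (a minimal allow-list), not a complete sanitizer:

import re

VALID_IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_$#]*")  # simple Oracle identifier allow-list

def insert_rows(connection, table_name, column_names, rows):
    # Identifiers cannot be bind variables, so validate them explicitly.
    for name in [table_name] + list(column_names):
        if not VALID_IDENTIFIER.fullmatch(name):
            raise ValueError("unsafe identifier: {}".format(name))
    # The values, on the other hand, are passed as bind variables (:1, :2, ...).
    placeholders = ", ".join(":{}".format(i + 1) for i in range(len(column_names)))
    sql = "insert into {} ({}) values ({})".format(
        table_name, ", ".join(column_names), placeholders)
    cursor = connection.cursor()
    cursor.executemany(sql, rows)
    cursor.close()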
If the data is already on disk, then using something like SQL*Loader or Data Pump will be faster than reading it into cx_Oracle and then sending it to the DB.
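For reference, a minimal SQL*Loader sketch for the CSV case might look like the control file below; the file names and the comma/double-quote conventions are assumptions about your data:

-- load.ctl (SQL*Loader control file; '--' starts a comment)
LOAD DATA
INFILE 'data.csv'
APPEND INTO TABLE ParentTable
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
(ParentId, Description)

You would then run it from the command line with something like:

sqlldr userid=user/password@host/service control=load.ctl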