I want to write a pandas dataframe to a postgres table. I make a connection to the database as follows:
import psycopg2
import pandas as pd
import sqlalchemy
def connect(user, password, db, host='localhost', port=5432):
    '''Returns a connection and a metadata object'''
    url = 'postgresql://{}:{}@{}:{}/{}'
    url = url.format(user, password, host, port, db)
    # The return value of create_engine() is our connection object
    con = sqlalchemy.create_engine(url, client_encoding='utf8')
    # We then bind the connection to MetaData()
    meta = sqlalchemy.MetaData(bind=con, reflect=True)
    return con, meta
con, meta = connect('user_name', 'password', 'db_name', host='host_name')
When I read from a table that is already populated, it works fine:
df = pd.read_sql("SELECT * FROM db.table_name limit 10",con=con)
print df
I would like to be able to write df to a table. To test this, I have a temporary table called 'test' with two fields, name and age.
# create a temp df
table = [['name', 'age'], ['nameA' , 20], ['nameB', 30]]
headers = table.pop(0)
df = pd.DataFrame(table, columns=headers)
# write to db
df.to_sql('db.test', con, if_exists='replace', index=False)
I then check if the temp table is populated:
df = pd.read_sql("SELECT * FROM db.test limit 10",con=con)
print df
I get an empty dataframe! df.to_sql raises no errors, but nothing is written to the database. What am I missing, and how do I fix this?
Versions:
Pandas: 0.19.2
SQLAlchemy: 1.1.10
Postgres: 9.4.9
Solution
I have not figured out why df.to_sql did not write to the table, but writing via pd.io.sql.SQLDatabase worked for my test case:
meta = sqlalchemy.MetaData(con, schema='db_name')
meta.reflect()
pdsql = pd.io.sql.SQLDatabase(con, meta=meta)
pdsql.to_sql(df, 'test', if_exists='replace')
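Reading the table back over the same engine confirms the write (db_name here is just the schema placeholder used in the meta reflection above):
# sanity check: read the freshly written table back through the schema-qualified name
check = pd.read_sql("SELECT * FROM db_name.test limit 10", con=con)
print check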
I would not consider this THE solution -- I'd be happy to accept a better solution, or an answer that brings closure to why df.to_sql() does not behave as expected.
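One possible explanation, which I have not verified here: to_sql treats its whole first argument as the table name, so 'db.test' likely creates a table literally named "db.test" in the default search_path schema rather than a table test inside the db schema. to_sql has a separate schema argument for this. A minimal sketch, assuming the target schema really is called db:
# untested sketch -- pass the schema separately instead of prefixing the table name
df.to_sql('test', con, schema='db', if_exists='replace', index=False)
# read it back through the schema-qualified name
df = pd.read_sql("SELECT * FROM db.test limit 10", con=con)
print df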