I am writing a pandas df to a csv. When I write it to a csv file, some of the elements in one of the columns are being incorrectly converted to scientific notation/numbers. For example, col_1 has strings such as '104D59' in it. The strings are mostly represented as strings in the csv file, as they should be. However, occasional strings, such as '104E59', are being converted into scientific notation (e.g., 1.04 E 61) and represented as integers in the ensuing csv file.
I am trying to export the csv file into a software package (i.e., pandas -> csv -> software_new) and this change in data type is causing problems with that export.
Is there a way to write the df to a csv, ensuring that all elements in df['problem_col'] are represented as string in the resulting csv or not converted to scientific notation?
Here is the code I have used to write the pandas df to a csv:
df.to_csv('df.csv', encoding='utf-8')
I also check the dtype of the problem column:
for df.dtype, df['problem_column'] is an object
解决方案
Use the float_format argument:
In [11]: df = pd.DataFrame(np.random.randn(3, 3) * 10 ** 12)
In [12]: df
Out[12]:
0 1 2
0 1.757189e+12 -1.083016e+12 5.812695e+11
1 7.889034e+11 5.984651e+11 2.138096e+11
2 -8.291878e+11 1.034696e+12 8.640301e+08
In [13]: print(df.to_string(float_format='{:f}'.format))
0 1 2
0 1757188536437.788086 -1083016404775.687134 581269533538.170288
1 788903446803.216797 598465111695.240601 213809584103.112457
2 -829187757358.493286 1034695767987.889160 864030095.691202
Which works similarly for to_csv:
df.to_csv('df.csv', float_format='{:f}'.format, encoding='utf-8')