Writing a nested list to CSV, reading the CSV back, and converting floats in a nested list to int
Writing a list to a CSV file
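import pandas as pd

a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
path = r'C:\Users\Administrator\Desktop\testimage\20200922_1X.csv'
test = pd.DataFrame(a)  # each inner list becomes one row
test.to_csv(path, encoding='gbk')  # the row index is written as the first column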
Reading the data back from the CSV
data = pd.read_csv(path, index_col=0)  # read the CSV file into a DataFrame
b = data.values  # NumPy array
c = b.tolist()  # back to a nested list
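The write and read steps above can be checked as a round trip. This is only a minimal sketch: it writes to a temporary directory instead of the Desktop path, and the file name `nested.csv` is a placeholder chosen for illustration.

```python
import os
import tempfile

import pandas as pd

nested = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Assumption: a temporary directory stands in for the Desktop path used above.
with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "nested.csv")  # hypothetical file name

    # to_csv writes the row index as the first column by default.
    pd.DataFrame(nested).to_csv(csv_path, encoding="gbk")

    # index_col=0 turns that first column back into the index.
    restored = pd.read_csv(csv_path, index_col=0, encoding="gbk")

round_trip = restored.values.tolist()
print(round_trip)  # [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
```

If the printed list matches the original, nothing was lost or added on the way through the file.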
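Reading a CSV with pandas.read_csv() without picking up the first column as data
data=pd.read_csv(path,index_col=0)
Adding index_col=0 is enough: the first column (the index written by to_csv) is then used as the row index instead of being read as data.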
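If the index column is not needed at all, another option is to leave it out when writing. The sketch below compares the variants; it is an illustration rather than what this post does, and it uses in-memory strings with `io.StringIO` in place of a real file.

```python
import io

import pandas as pd

nested = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Default: the row index is written as an unnamed first column.
with_index = pd.DataFrame(nested).to_csv()
# index=False: only the header row and the data are written.
without_index = pd.DataFrame(nested).to_csv(index=False)

# Reading the first variant without index_col keeps the index as data.
extra_column = pd.read_csv(io.StringIO(with_index)).values.tolist()
# index_col=0 moves it back into the index.
fixed = pd.read_csv(io.StringIO(with_index), index_col=0).values.tolist()
# Nothing to skip when the index was never written.
no_index = pd.read_csv(io.StringIO(without_index)).values.tolist()

print(extra_column)  # [[0, 1, 2, 3], [1, 4, 5, 6], [2, 7, 8, 9]]
print(fixed)         # [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(no_index)      # [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
```

Which one to prefer depends on whether the index carries meaning; for a plain nested list it usually does not.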
Converting floats in a nested list to int
```
testx = [[int(value) for value in row] for row in self.testy]  # self.testy: nested list of floats
```
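Since `self.testy` belongs to a class that is not shown here, the following is a standalone sketch of the same comprehension. The sample data and the names `float_rows`, `truncated`, and `rounded` are made up for illustration. Note that `int()` truncates toward zero, so `round()` is included for comparison in case rounding is what you actually want.

```python
float_rows = [[1.0, 2.7, -3.9], [4.5, 5.0, 6.2]]  # made-up sample data

# int() drops the fractional part (truncation toward zero).
truncated = [[int(value) for value in row] for row in float_rows]

# round() goes to the nearest integer instead (ties go to the even neighbour).
rounded = [[round(value) for value in row] for row in float_rows]

print(truncated)  # [[1, 2, -3], [4, 5, 6]]
print(rounded)    # [[1, 3, -4], [4, 5, 6]]
```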
Common errors
- shape mismatch: indexing arrays could not be broadcast together with shapes
This usually means that two lists or NumPy arrays used together have mismatched sizes.
- cannot convert float NaN to integer
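NaN stands for "not a number"; inf (infinity) is a similar special value. Both are floats and cannot be converted to int directly.
dataframe.replace(np.nan, 0, inplace=True)
dataframe.replace([np.inf, -np.inf], 0, inplace=True)
Replace them with 0 or some other value. You can also use dataframe.isnull().sum() to count the NaN values in each column.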
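A small sketch of the NaN/inf case, using a made-up DataFrame (the names `frame`, `nan_counts`, and `cleaned` are illustrative). It counts the NaN values, shows that a direct cast fails, and then replaces NaN and both infinities with 0 before casting.

```python
import numpy as np
import pandas as pd

# Made-up data containing NaN, inf and -inf.
frame = pd.DataFrame({"x": [1.0, np.nan, 3.0], "y": [np.inf, 5.0, -np.inf]})

# Count missing values per column (inf is not counted as missing).
nan_counts = {col: int(n) for col, n in frame.isnull().sum().items()}

# Casting directly fails because NaN and inf have no integer equivalent.
try:
    frame.astype(int)
    cast_failed = False
except (ValueError, TypeError):
    cast_failed = True

# Turn both infinities into NaN, fill every NaN with 0, then cast.
cleaned = frame.replace([np.inf, -np.inf], np.nan).fillna(0).astype(int)

print(nan_counts)               # {'x': 1, 'y': 0}
print(cast_failed)              # True
print(cleaned.values.tolist())  # [[1, 0], [0, 5], [3, 0]]
```

Filling with 0 is only one choice; dropping the affected rows may suit other data better.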
- Initializing from file failed
Check whether the path contains Chinese characters; if it does, switch to a different parser engine.
import pandas as pd
url = "D:/我的文档/Machine Learning/数据集/tra.csv"
chad = pd.read_csv(url)
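Reading the file this way raises OSError: Initializing from file failed. Changing the call to chad = pd.read_csv(url, engine='python') fixes it.
The engine parameter selects the parser engine to use. The C engine is faster, while the python engine is currently more feature-complete.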
This part is copied as-is.
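To try the workaround without the original `D:/...` file, the sketch below creates a small CSV in a temporary directory under a file name containing Chinese characters (`数据.csv`, a placeholder). Whether the default C engine actually fails depends on the pandas version and the operating system, so the code does not assume that it does. It only shows two calls that avoid the problem: `engine='python'`, and handing pandas an already opened file object.

```python
import os
import tempfile

import pandas as pd

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file name containing Chinese characters.
    csv_path = os.path.join(tmp_dir, "数据.csv")
    with open(csv_path, "w", encoding="utf-8") as handle:
        handle.write("a,b\n1,2\n3,4\n")

    # Workaround from the text: use the Python parser engine.
    via_python_engine = pd.read_csv(csv_path, engine="python")

    # Alternative: open the file yourself and pass the file object.
    with open(csv_path, encoding="utf-8") as handle:
        via_file_object = pd.read_csv(handle)

print(via_python_engine.values.tolist())  # [[1, 2], [3, 4]]
print(via_file_object.values.tolist())    # [[1, 2], [3, 4]]
```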
code
import pandas as pd

a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
path = r'C:\Users\Administrator\Desktop\testimage\20200922_1X.csv'
test = pd.DataFrame(a)
test.to_csv(path, encoding='gbk')
data = pd.read_csv(path, index_col=0)  # first column becomes the index
data1 = pd.read_csv(path)  # first column is kept as data ("Unnamed: 0")
print(data)
print(data1)
b = data.values
b1 = data1.values
print(type(b))
c = b.tolist()
c1 = b1.tolist()
print(c)
print(c1)
Output
   0  1  2
0  1  2  3
1  4  5  6
2  7  8  9
   Unnamed: 0  0  1  2
0           0  1  2  3
1           1  4  5  6
2           2  7  8  9
<class 'numpy.ndarray'>
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]
[[0, 1, 2, 3], [1, 4, 5, 6], [2, 7, 8, 9]]