A guide to working with CSV/Excel files in Python: reading with pandas.read_csv, iterating over every value in a table, selecting rows, selecting columns

二又

Last modified: 2024-03-12 20:32:56


Tags: pandas

First published: 2024-02-21 09:20:01

Article link: https://blog.csdn.net/weixin_44003026/article/details/135862272


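1. Reading a CSV file with pandas

pd.read_csv(data_path, header=None, index_col=0, sep=',')

header=None tells pandas that the file has no row of column names. index_col=0 uses the first column as the index. sep sets the delimiter used when parsing; the two common choices are ',' and '\t'.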
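
As a minimal sketch of these three parameters, the snippet below parses the same made-up data once with ',' and once with '\t'. The in-memory strings stand in for data_path, and the sample values are invented for illustration.

```python
import io

import pandas as pd

# Two header-less samples with identical content, differing only in delimiter.
csv_text = "cell_a,1,0.5\ncell_b,2,0.7\n"
tsv_text = csv_text.replace(",", "\t")

# header=None: no row of column names; index_col=0: first column is the index.
df_csv = pd.read_csv(io.StringIO(csv_text), header=None, index_col=0, sep=",")
df_tsv = pd.read_csv(io.StringIO(tsv_text), header=None, index_col=0, sep="\t")

print(df_csv)
print(list(df_csv.columns))  # integer labels of the remaining columns
```

With header=None pandas labels the columns 0, 1, 2, and so on. Since column 0 becomes the index here, the remaining column labels are 1 and 2.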

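2. Slicing with df.iloc
df_cellinfo.iloc[:, 0] selects all the data in column 0. To loop over the value in each row of that column:

for row in df_cellinfo.iloc[:, 0]:
    print(row)  # placeholder body; replace with the real per-row work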
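
A short sketch of a few common iloc patterns, assuming a small made-up frame named df_cellinfo (the column names and values are placeholders, not the original data):

```python
import pandas as pd

# Hypothetical stand-in for the real df_cellinfo.
df_cellinfo = pd.DataFrame({"name": ["a", "b", "c"], "score": [1, 2, 3]})

first_col = df_cellinfo.iloc[:, 0]    # every row of column 0 (a Series)
first_row = df_cellinfo.iloc[0, :]    # every column of row 0 (a Series)
block = df_cellinfo.iloc[1:, 0:2]     # rows 1 onward, columns 0 and 1

# Iterating over a column Series yields its values one by one.
values = []
for value in df_cellinfo.iloc[:, 0]:
    values.append(value)

print(values)
```

iloc is purely positional, so the same indices work regardless of what the row and column labels are.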

2. df.columns and df.index hold the column labels and the row labels, respectively
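
For illustration, a tiny sketch that reads both attributes from a made-up frame (the labels are invented):

```python
import pandas as pd

df = pd.DataFrame(
    {"gene": ["A", "B"], "score": [0.1, 0.9]},
    index=["cell_1", "cell_2"],
)

column_names = list(df.columns)  # column labels
row_names = list(df.index)       # row labels
n_rows, n_cols = len(df.index), len(df.columns)

print(column_names, row_names, n_rows, n_cols)
```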
3. Iterating over every element in the table
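
The source gives no code for this step, so the following is only one possible sketch on a made-up frame: a label-based double loop with iterrows(), plus a positional variant with iat.

```python
import pandas as pd

df = pd.DataFrame({"x": [1, 2], "y": [3, 4]}, index=["r1", "r2"])

# Label-based: iterrows() yields (row label, row as Series);
# Series.items() yields (column label, value).
visited = []
for row_label, row in df.iterrows():
    for col_label, value in row.items():
        visited.append((row_label, col_label, value))

# Position-based: iat reads a single cell by integer row and column position.
flat = [df.iat[i, j] for i in range(df.shape[0]) for j in range(df.shape[1])]

print(visited)
print(flat)
```

For large tables a vectorized operation is usually much faster than a Python loop; the loops above are mainly useful when each cell really needs individual handling.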
4. Converting a DataFrame to a Tensor raises: ValueError: expected sequence of length 10 at dim 0 (got 16000)
The underlying values need to be extracted first:

import pandas as pd
import torch

df = pd.DataFrame(data)
# Convert the DataFrame to a PyTorch tensor
tensor_from_df = torch.tensor(df.values)
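
A hedged sketch of the same route with a check before the conversion. The sample data are invented, and the torch import is optional here so the array check still runs on a machine without PyTorch. The idea is to confirm that the extracted values form one rectangular numeric array before handing them to torch.tensor.

```python
import numpy as np
import pandas as pd

try:
    import torch  # optional in this sketch
except ImportError:
    torch = None

# Made-up data; the real `data` is not shown in the original post.
data = {"f1": [1.0, 2.0, 3.0], "f2": [4, 5, 6]}
df = pd.DataFrame(data)

# Pull out a plain 2-D array with a single numeric dtype.
arr = df.to_numpy(dtype=np.float32)
print(arr.shape, arr.dtype)

# Convert only if PyTorch is installed.
tensor_from_df = torch.tensor(arr) if torch is not None else None
```

Printing the shape and dtype first makes it easier to spot a column that holds lists or strings, which would not convert cleanly.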

5. Turning a column selected with df.iloc into labels
The following cannot be used directly:

sensitive = df_cellinfo.iloc[1:, 4]

Instead, convert it to a list first, then cast each item to int:

sensitive = df_cellinfo.iloc[1:, 4].tolist()
adata.obs['sensitive'] = [int(i) for i in sensitive]
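
The post does not say why the direct assignment fails. One plausible reason (an assumption) is that the slice is a Series of strings whose integer index does not match the index of adata.obs, and pandas aligns a Series on index labels when it is assigned as a column. The sketch below reproduces that effect with a plain DataFrame named obs standing in for adata.obs; all values are made up.

```python
import pandas as pd

# Frame read without a header: row 0 holds the column titles as text,
# so every cell, including the 0/1 labels, is a string.
df_cellinfo = pd.DataFrame([
    ["cell", "a", "b", "c", "sensitive"],
    ["c1", "x", "x", "x", "1"],
    ["c2", "x", "x", "x", "0"],
])

obs = pd.DataFrame(index=["c1", "c2"])  # hypothetical stand-in for adata.obs

raw = df_cellinfo.iloc[1:, 4]  # Series of strings with index labels 1 and 2

# Series assignment aligns on index labels; none match, so the column is all NaN.
obs["direct"] = raw

# A plain list is assigned by position, and int() fixes the dtype.
obs["sensitive"] = [int(i) for i in raw.tolist()]

print(obs)
```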

6. Reading an Excel sheet: set row 2 as the column header and drop the part above it

df = pd.read_excel("./41588_2020_726_MOESM3_ESM-s1-s10.xlsx", sheet_name='Table S9', header=2, skiprows=1, index_col=0)
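
To see how header and skiprows interact without needing the Excel file, the sketch below uses pd.read_csv on an in-memory string as a stand-in. In read_csv, header is zero-indexed and counted after the rows removed by skiprows; that read_excel treats the pair the same way is an assumption here, so it is worth checking the result against the actual sheet. The sample text is invented.

```python
import io

import pandas as pd

raw = (
    "title line\n"      # removed by skiprows=1
    "note a\n"          # position 0 after skipping, discarded (above the header)
    "note b\n"          # position 1 after skipping, discarded (above the header)
    "id,gene,score\n"   # position 2 after skipping, used as the header
    "r1,A,1\n"
    "r2,B,2\n"
)

df = pd.read_csv(io.StringIO(raw), header=2, skiprows=1, index_col=0)
print(df)
```

In this sketch the combination header=2, skiprows=1 picks the fourth physical line as the header, not the second or third.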
