Bioinformatics: processing a species Excel table with openpyxl

Latest recommended article published on 2024-09-13 14:57:32

顺天府的小团圆


Article tags: excel python vscode

Article link: https://blog.csdn.net/2301_77255681/article/details/132888143


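Picking up where the last post left off, I mainly want to correct a few mistakes. The algorithm I used for others in the previous article was wrong, so this time we redo it from scratch, and the task description changes as well:

What the previous post called others is actually what this post calls total. On top of that, the file my teacher gave me contains one column that is not a species, and because of my own inexperience I subtracted one too many from the index, so the total was wrong as well. Enough talk, let's get started.

First, the preparation work.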

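import openpyxl as ol
# Raw string, so the backslash in the Windows path is not treated as an escape
workbook = ol.load_workbook(r'E:\og.xlsx')  # returns a Workbook object
sheet = workbook.active  # the active worksheet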

num_row = sheet.max_row        # number of rows
num_column = sheet.max_column  # number of columns
# Read the table row by row into a two-dimensional list
total_list = []
for row in sheet.rows:  # sheet.rows yields the cells of one row at a time
    row_list = []
    for cell in row:  # take each cell directly from the row
        row_list.append(cell.value)
    total_list.append(row_list)

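# Five per-species accumulators, in order: 1:n, n:1, n:0, total, others.
# They are pre-sized to 25 (one slot per species); starting from empty lists
# raises "IndexError: list index out of range" on the first +=.
list1, list2, list3, list4, list5 = [0]*25, [0]*25, [0]*25, [0]*25, [0]*25
for l in range(1, num_row + 1):
    for n in range(1, num_column + 1):
        cell = sheet.cell(row=l, column=n)  # row and column numbers are 1-based
        list4[n-1] += cell.value  # running total for species n

        if cell.value == 1:
            list1[n-1] += 1  # species n has a value of 1 in this row
        # n:1: species n has >= 2, the other 24 are all 0 or 1, and at least one is 1
        elif cell.value >= 2 and total_list[l-1].count(0) != 24 and total_list[l-1].count(0) + total_list[l-1].count(1) == 24:
            list2[n-1] += cell.value
        # n:0: species n has >= 2 and the other 24 are all 0
        elif cell.value >= 2 and total_list[l-1].count(0) == 24:
            list3[n-1] += cell.value
        else:
            list5[n-1] += cell.value  # everything else counts as others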
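
The row-by-row read at the top of this listing can also be written with iter_rows(values_only=True), which yields plain tuples of values instead of cell objects. The sketch below is only an illustration, assuming openpyxl 2.6 or later: it builds a small in-memory workbook with made-up counts rather than opening E:\og.xlsx, and the names demo_wb, demo_sheet and demo_rows are hypothetical.

```python
import openpyxl as ol

# Hypothetical stand-in for the real file: a 3 x 3 table of made-up gene counts.
demo_wb = ol.Workbook()
demo_sheet = demo_wb.active
for counts in ([1, 0, 2], [0, 0, 3], [1, 1, 1]):
    demo_sheet.append(counts)  # each append writes one row

# values_only=True yields tuples of cell values instead of Cell objects.
demo_rows = [list(row) for row in demo_sheet.iter_rows(values_only=True)]

print(demo_rows)
print(demo_sheet.max_row, demo_sheet.max_column)
```

With the values already in a plain list, the counting loop could index demo_rows directly instead of calling sheet.cell() for every cell.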

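The loop goes row by row and then column by column, reads the value in each cell for the 25 species, and adds it to the matching list. The loop indices start at 1 because sheet.cell() takes 1-based row and column numbers: after cell = sheet.cell(row=1, column=1), printing it shows <Cell '1.Orthogroups.GeneCount'.A1>, which is cell A1, whereas passing 0 raises an error. That is why every list index is written with a minus 1 (n-1 for the five lists, l-1 for total_list). If none of the three conditions above holds, the value counts as others and goes into list5. One reminder: if you create the five lists empty instead of pre-sizing them, you get an IndexError.

Next, write the results to a new worksheet. The data is again written one row at a time: the species abbreviations first, then the five lists, so I did not create any names for the result rows.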
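
The branching in the loop above can be pulled out into a small function that works on plain lists, which makes it easy to check by hand on a toy table. This is a minimal sketch, not the code used for the real file: the function name tally, the dictionary keys and the 3-species toy data are invented for illustration, and rest generalises the hard-coded 24 to "number of species minus one".

```python
def tally(rows):
    """Accumulate the five per-species lists from equal-length rows of counts."""
    n_species = len(rows[0])
    rest = n_species - 1  # plays the role of the hard-coded 24
    keys = ("1:n", "n:1", "n:0", "total", "others")
    result = {key: [0] * n_species for key in keys}  # pre-sized, so += never fails
    for row in rows:
        zeros, ones = row.count(0), row.count(1)
        for i, value in enumerate(row):  # enumerate is 0-based, so no "minus 1"
            result["total"][i] += value
            if value == 1:
                result["1:n"][i] += 1
            elif value >= 2 and zeros != rest and zeros + ones == rest:
                result["n:1"][i] += value
            elif value >= 2 and zeros == rest:
                result["n:0"][i] += value
            else:
                result["others"][i] += value
    return result


toy_rows = [
    [1, 1, 1],  # every species has a value of 1
    [3, 1, 0],  # species 0 has 3, the rest are 0 or 1 with at least one 1
    [0, 4, 0],  # only species 1 has any copies
    [2, 2, 0],  # two multi-copy species, so both values fall through to others
]
toy = tally(toy_rows)
for key, values in toy.items():
    print(key, values)
```

Counting the zeros and ones once per row, rather than once per cell, gives the same result as the original conditions while doing less repeated work.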

sheet2 = workbook.create_sheet("test2")  # create_sheet returns the new worksheet
data = [
    ['Acgr', 'Adca', 'Amtr', 'Anco', 'Aqco', 'Arfi', 'Arth', 'Busi', 'Cede', 'Chse', 'Cika', 'Cypa', 'Elgu', 'Eufe', 'Gibi', 'Ilve', 'Lich', 'Nenu', 'Nyco', 'Orsa', 'Pita', 'Potr', 'Scch', 'Sppo', 'Vivi'],
    list1, list2, list3, list4, list5,  # 1:n, n:1, n:0, total, others
]
for row in data:
    sheet2.append(row)  # each append writes one row

workbook.save(r'E:\og.xlsx')  # raw string again, and this overwrites the input file
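
Since the result rows are written without names, one optional variation is to put a label in the first column of each row. The sketch below shows the idea on an in-memory workbook, assuming openpyxl 2.6 or later; the "category" header, the two labelled rows and their numbers are made up, only the first three species abbreviations are used, and nothing is saved to disk.

```python
import openpyxl as ol

demo_wb = ol.Workbook()
out_sheet = demo_wb.create_sheet("test2")  # returns the new worksheet

species = ["Acgr", "Adca", "Amtr"]  # first three abbreviations only
labelled_rows = {                    # made-up numbers, for illustration
    "1:n": [1, 2, 1],
    "total": [6, 8, 1],
}

out_sheet.append(["category"] + species)  # header row with a hypothetical label column
for label, values in labelled_rows.items():
    out_sheet.append([label] + values)    # label first, then the counts

written = [list(row) for row in out_sheet.iter_rows(values_only=True)]
print(written)
```

The label column shifts every species one column to the right, so any Excel formula that refers to the counts would need to start from column B.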

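If the total number of genes is 70,000, the unclustered count is simply 70000 minus the total for each species: 70000 - list4[0], 70000 - list4[1], 70000 - list4[2], and so on. You can do this with a for loop, or directly in Excel with a formula of the form =70000-cell, so I won't go into more detail here.

Writing this up took some effort, so don't forget to leave a like ~~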
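
The subtraction described above is a one-line list comprehension. In this sketch the constant name GENES_PER_SPECIES and the three totals standing in for list4 are hypothetical; the 70000 is the figure from the text and is assumed here to apply to every species.

```python
GENES_PER_SPECIES = 70000  # assumed: the same gene total for every species

# Made-up totals for three species, standing in for list4.
demo_totals = [65210, 58944, 70000]

# Unclustered genes = all genes minus the genes counted in the total row.
unclustered = [GENES_PER_SPECIES - total for total in demo_totals]
print(unclustered)  # [4790, 11056, 0]
```

If the species differ in gene number, the constant would become a list of per-species totals and the comprehension would zip the two lists together.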