The raw data is shown in the figure below:
Given a set of gene names, output the full rows that match them.
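# Python: select the full rows whose column value matches a given set
import numpy as np
import pandas as pd

df1 = pd.read_excel("cell-summ.xlsx")  # read the data
s1 = df1["基因名"]  # the "基因名" (gene name) column
s2 = np.array(s1.sample(100))  # randomly draw 100 values from this column and convert them to an array
df2 = df1[df1["基因名"].isin(s2)]  # keep the rows whose gene name appears in s2
df2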
The output is as follows (columns: gene name, gene category, cell line, subcellular localization):
基因名 基因类别 细胞系 亚细胞定位
4 SNHG1 lncRNA A549 Nucleus\n
45 TH2LCRR lncRNA HepG2 Cytosol\n
49 LINC01515 lncRNA HepG2 Cytosol\n
99 NIFK-AS1 lncRNA HepG2 Insoluble cytoplasm\n
140 LINC01515 lncRNA HepG2 Membrane\n
147 LRRC75A-AS1 lncRNA HepG2 Membrane\n
159 NORAD lncRNA HepG2 Membrane\n
168 LINC01126 lncRNA HepG2 Nucleus\n
225 LINC01515 lncRNA HepG2 Nucleus\n
240 SGMS1-AS1 lncRNA HepG2 Nucleus\n
290 NORAD lncRNA HepG2 Nucleus\n
300 CCNT2-AS1 lncRNA HepG2 Nucleus\n
343 CCNT2-AS1 lncRNA HeLa.S3 Nucleus\n
391 SNHG1 lncRNA HepG2 Nucleolus\n
393 SNHG1 lncRNA HepG2 Nucleoplasm\n
412 LRRC75A-AS1 lncRNA HepG2 Nucleus\n
414 SNHG1 lncRNA HepG2 Nucleus\n
450 SNHG1 lncRNA K562 Nucleolus\n
452 SNHG1 lncRNA K562 Nucleoplasm\n
471 LRRC75A-AS1 lncRNA K562 Nucleus\n
473 SNHG1 lncRNA K562 Nucleus\n
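
The same filter can also be applied to an explicit list of gene names instead of a random sample. The following is a minimal sketch, assuming a small in-memory table built from a few rows of the output above; the variable names `wanted` and `selected` are illustrative and not part of the original script.

```python
import pandas as pd

# A few rows taken from the output above, rebuilt by hand so the example
# runs without the Excel file.
df = pd.DataFrame({
    "基因名": ["SNHG1", "NORAD", "LINC01515", "SNHG1"],
    "基因类别": ["lncRNA", "lncRNA", "lncRNA", "lncRNA"],
    "细胞系": ["A549", "HepG2", "HepG2", "K562"],
    "亚细胞定位": ["Nucleus", "Membrane", "Cytosol", "Nucleolus"],
})

# Hypothetical list of gene names to look up.
wanted = ["SNHG1", "NORAD"]

# isin() returns a boolean mask (one True/False per row); indexing the
# DataFrame with that mask keeps every column of the matching rows.
selected = df[df["基因名"].isin(wanted)]
print(selected)
```

Because `isin` matches every row that carries one of the names, a gene that occurs in several cell lines or compartments returns several rows. If the random draw in the original script needs to be repeatable, `sample` accepts a `random_state` argument.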