Reference for downloading the dataset from GDC:
TCGA data download tutorial: downloading with the official gdc-client software (Mr番茄蛋's blog, CSDN): https://blog.csdn.net/qq_35203425/article/details/80882988
TCGA official site: The Cancer Genome Atlas Program - National Cancer Institute
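Next, select the files you want and add them to the cart. I chose open-access files in svs format.
How to install the Data Transfer Tool, i.e. the gdc-client command-line tool
Data Transfer Tool URL:
Open cmd and run: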
E:\gdc\gdc-client.exe download -m E:\gdc\gdc_manifest_20220407_022532.txt
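After a long download it helps to confirm that every file listed in the manifest actually arrived. The following is a minimal sketch, not part of gdc-client. It assumes the manifest is a tab-separated file with `id`, `filename` and `md5` columns, and that gdc-client used its default layout of one sub-folder per file id inside the download directory. The function names are my own.

```python
import csv
import hashlib
from pathlib import Path


def read_manifest(manifest_path):
    """Return the rows of a GDC manifest (tab-separated, with a header row) as dicts."""
    with open(manifest_path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_downloads(manifest_path, download_dir):
    """Map each manifest filename to 'ok', 'missing' or 'md5 mismatch'.

    Assumed layout: <download_dir>/<id>/<filename>.
    """
    report = {}
    for row in read_manifest(manifest_path):
        target = Path(download_dir) / row["id"] / row["filename"]
        if not target.is_file():
            report[row["filename"]] = "missing"
        elif md5_of(target) != row["md5"]:
            report[row["filename"]] = "md5 mismatch"
        else:
            report[row["filename"]] = "ok"
    return report
```

Call `check_downloads(manifest_path, download_dir)` with the manifest used above and the directory the command was run in. Anything not reported as `ok` can be downloaded again with the same command.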
Find the downloaded files in that same directory:
Sort them by class:
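The sorting can be done by hand, but a small script works too. This sketch assumes each class (for example LUAD) was downloaded into its own directory, and it copies every .svs file into a numbered layout like the `./dataset/LUAD/9/9.svs` path used later in this post. The function name and the numbering scheme are illustrative assumptions, not necessarily how I organised the files.

```python
import shutil
from pathlib import Path


def collect_slides(download_dir, class_dir):
    """Copy every .svs file under download_dir to class_dir/<n>/<n>.svs, numbering from 1."""
    # Materialise the list first so copies made below are never picked up again.
    slides = sorted(Path(download_dir).rglob("*.svs"))
    copied = []
    for index, slide in enumerate(slides, start=1):
        target = Path(class_dir) / str(index) / f"{index}.svs"
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(slide, target)
        copied.append(target)
    return copied
```

Note that renumbering discards the original TCGA file names, so keep the manifest if you need to trace a slide back to its source.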
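Convert svs to png:
First install the openslide library
Copy the contents of the three items above (the contents only, not the folders themselves) into the root directory of the Anaconda virtual environment, as shown in the figure below
Open the virtual environment terminal
And now it turns green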
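To double-check the installation from the virtual environment terminal, a quick import test is enough. This is a small sketch of my own. It treats both `ImportError` and `OSError` as "not available", on the assumption that a missing native library surfaces as one of those two.

```python
def openslide_available():
    """Return True if the openslide Python package and its native library both load."""
    try:
        import openslide  # noqa: F401  (imported only to test that it loads)
    except (ImportError, OSError):
        return False
    return True


print("openslide available:", openslide_available())
```

If this prints `False`, the files were probably copied to the wrong place or the wrong environment is active.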
Next, convert the svs file to png losslessly:
import os

import numpy as np
import openslide
import PIL.Image

# Open the whole-slide image
test = openslide.open_slide('./dataset/LUAD/9/9.svs')
# Read the entire level-0 (full-resolution) image; read_region returns an RGBA PIL image
img = np.array(test.read_region((0, 0), 0, test.dimensions))

# Create the output directory if it does not exist yet
output_path = r'./output/LUAD'
if not os.path.exists(output_path):
    os.makedirs(output_path)

# Save as PNG, which is a lossless format
PIL.Image.fromarray(img).save(os.path.join(output_path, '9.png'))
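Reading a whole slide at level 0 can need a lot of memory, since an RGBA image takes 4 bytes per pixel. As a rough sketch, the helper below estimates that size and picks the first pyramid level that fits a memory budget. The function names and the budget are my own assumptions. `level_dimensions` is the list of (width, height) tuples that an OpenSlide slide object exposes.

```python
def rgba_bytes(width, height):
    """Bytes needed to hold a width x height RGBA image at 8 bits per channel."""
    return width * height * 4


def pick_level(level_dimensions, max_bytes):
    """Return the index of the highest-resolution level that fits in max_bytes.

    level_dimensions is ordered from level 0 (full resolution) downward.
    Falls back to the last (smallest) level if none fits.
    """
    for level, (width, height) in enumerate(level_dimensions):
        if rgba_bytes(width, height) <= max_bytes:
            return level
    return len(level_dimensions) - 1


# Possible usage with the slide opened above (the 2 GiB budget is only an example):
# level = pick_level(test.level_dimensions, max_bytes=2 * 1024**3)
# img = np.array(test.read_region((0, 0), level, test.level_dimensions[level]))
```

Any level above 0 is a downsampled copy, so the result is only a full-resolution conversion when level 0 fits.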
Crop the large image into small tiles (cropping is covered in the next blog post):
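Until then, here is one simple way the tiling could be done. It is only a sketch and not necessarily the method used in the next post. The generator yields non-overlapping boxes in the (left, upper, right, lower) order that PIL's `Image.crop` expects. The function name and the decision to skip partial edge tiles are my own assumptions.

```python
def tile_boxes(width, height, tile_size, stride=None):
    """Yield (left, upper, right, lower) boxes for full tiles covering the image.

    Partial tiles at the right and bottom edges are skipped.
    stride defaults to tile_size, which gives non-overlapping tiles.
    """
    stride = stride or tile_size
    for upper in range(0, height - tile_size + 1, stride):
        for left in range(0, width - tile_size + 1, stride):
            yield (left, upper, left + tile_size, upper + tile_size)


# Possible usage on a PIL image (a tile size of 224 is only an example):
# for i, box in enumerate(tile_boxes(image.width, image.height, 224)):
#     image.crop(box).save(f"tile_{i}.png")
```

Passing a stride smaller than the tile size produces overlapping tiles, which gives more training samples from the same slide.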
Take 100 images from each of the two classes, LUAD and LUSC, and train a vision transformer classifier on them:
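Before training, the images need to be split into training and validation sets. The sketch below splits each class with the same ratio so both sets stay balanced. The 80/20 ratio, the seed and the function name are assumptions for illustration, not the settings used for the result reported here.

```python
import random


def stratified_split(items_by_class, val_fraction=0.2, seed=0):
    """Split each class's items into train and validation lists with the same ratio.

    Returns two lists of (item, label) pairs.
    """
    rng = random.Random(seed)
    train, val = [], []
    for label, items in sorted(items_by_class.items()):
        shuffled = list(items)
        rng.shuffle(shuffled)
        n_val = round(len(shuffled) * val_fraction)
        val += [(item, label) for item in shuffled[:n_val]]
        train += [(item, label) for item in shuffled[n_val:]]
    return train, val
```

With 100 images per class and these settings, this gives 160 training and 40 validation images.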
Final result: validation accuracy of about 0.975
Prediction:
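For a single image, prediction comes down to turning the model's two output scores into a class name. The sketch below shows that last step in plain Python, assuming the model outputs one logit per class and that the label order is LUAD then LUSC. Both are assumptions, so the order must match whatever was used in training.

```python
import math

CLASS_NAMES = ["LUAD", "LUSC"]  # assumed label order; must match the training labels


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def predict_label(logits, class_names=CLASS_NAMES):
    """Return (class name, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]
```

For example, `predict_label([0.0, 2.0])` picks `LUSC`, because the second logit is the larger one.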