Tianchi Medical AI Competition [Season 1] Rank 22 Solution: An Engineering Guide for Newcomers

Preface:

        I learned a great deal from this medical AI competition. Since I had little relevant experience and expected time to be tight, my plan from the start was: "finish the competition in the most direct, simple way possible, without chasing accuracy." In practice I implemented the two steps, segmentation and classification, with a UNET and a traditional VGG-style network respectively. My best FROC score was 0.54 (preliminary round; in the semifinal I only completed the segmentation part due to limited time).

        Although I did not use more effective techniques such as lung-contour segmentation, ensembled classifiers, or a tuned loss function, the core logic of the project is simple, which makes it easier for beginners to follow. The core code is posted below in the hope that it helps. Given the rush, the code still has some shortcomings; corrections are very welcome, thank you.

1. Data Preprocessing:

        This part converts the CT scans from mhd format to numpy, normalizes them to 0~1, extracts the slices that contain a nodule plus a few neighboring slices, generates the masks, and saves everything as images.

def normalize(image):
    MIN_BOUND = -1000.0
    MAX_BOUND = 400.0
    image = (image - MIN_BOUND) / (MAX_BOUND - MIN_BOUND)
    image[image > 1] = 1.
    image[image < 0] = 0.
    return image
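To make the mapping concrete, here is a small standalone check (the sample HU values are illustrative, not from the dataset): everything below -1000 HU (air) clips to 0 and everything above 400 HU (dense bone, metal) clips to 1, while values in between scale linearly.

```python
import numpy as np

def normalize(image):
    # Same normalize as above: clip HU to [-1000, 400], scale to [0, 1]
    MIN_BOUND = -1000.0
    MAX_BOUND = 400.0
    image = (image - MIN_BOUND) / (MAX_BOUND - MIN_BOUND)
    image[image > 1] = 1.
    image[image < 0] = 0.
    return image

hu = np.array([-2000.0, -1000.0, -300.0, 400.0, 3000.0])  # air, lung, soft tissue, bone, metal
print(normalize(hu))  # [0.  0.  0.5 1.  1. ]
```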

def world_2_voxel(world_coordinates, origin, spacing):
    stretched_voxel_coordinates = np.absolute(world_coordinates - origin)
    voxel_coordinates = stretched_voxel_coordinates / spacing
    return voxel_coordinates

def voxel_2_world(voxel_coordinates, origin, spacing):
    stretched_voxel_coordinates = voxel_coordinates * spacing
    world_coordinates = stretched_voxel_coordinates + origin
    return world_coordinates
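As a quick sanity check (origin and spacing values here are made up, not from the dataset), the two helpers invert each other as long as the point lies on the positive side of the origin, since world_2_voxel takes an absolute value:

```python
import numpy as np

def world_2_voxel(world_coordinates, origin, spacing):
    # mm offset from the scan origin, converted to voxel indices
    stretched = np.absolute(world_coordinates - origin)
    return stretched / spacing

def voxel_2_world(voxel_coordinates, origin, spacing):
    # voxel indices back to mm world coordinates
    return voxel_coordinates * spacing + origin

origin = np.array([-195.5, -195.5, -378.0])  # hypothetical scan origin (mm)
spacing = np.array([0.76, 0.76, 1.25])       # hypothetical voxel spacing (mm)
world = np.array([-120.2, -80.7, -200.5])    # a point inside the scan

voxel = world_2_voxel(world, origin, spacing)
print(voxel_2_world(voxel, origin, spacing))  # recovers the original world point
```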

def make_mask(center, diam, z, width, height, spacing, origin):  # draw only the nodule
    mask = np.zeros([height, width])
    diam = diam / 2
    v_center = np.absolute(center - origin) / spacing
    v_diam = int(diam / spacing[0])
    v_xmin = np.max([0, int(v_center[0] - v_diam) - 1])
    v_xmax = np.min([width - 1, int(v_center[0] + v_diam) + 1])
    v_ymin = np.max([0, int(v_center[1] - v_diam) - 1])
    v_ymax = np.min([height - 1, int(v_center[1] + v_diam) + 1])

    v_xrange = range(v_xmin, v_xmax + 1)
    v_yrange = range(v_ymin, v_ymax + 1)

    x_data = [x * spacing[0] + origin[0] for x in range(width)]
    y_data = [x * spacing[1] + origin[1] for x in range(height)]

    for v_x in v_xrange:
        for v_y in v_yrange:
            p_x = spacing[0] * v_x + origin[0]
            p_y = spacing[1] * v_y + origin[1]
            if np.linalg.norm(center - np.array([p_x, p_y, z])) <= diam:
                # divide by spacing before truncating to an index (int() before the
                # division would truncate millimeters instead of voxel indices)
                mask[int(np.absolute(p_y - origin[1]) / spacing[1]),
                     int(np.absolute(p_x - origin[0]) / spacing[0])] = 1.0
    return mask
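The double loop above effectively rasterizes a filled circle in pixel space. A minimal vectorized sketch of the same idea (hypothetical sizes, my own simplification rather than the competition code):

```python
import numpy as np

def circle_mask(center_xy, radius, width, height):
    # Vectorized version of the double loop: mark every pixel within `radius` of the center
    ys, xs = np.mgrid[0:height, 0:width]
    dist = np.sqrt((xs - center_xy[0]) ** 2 + (ys - center_xy[1]) ** 2)
    return (dist <= radius).astype(np.float32)

mask = circle_mask(center_xy=(8, 8), radius=3, width=16, height=16)
print(int(mask.sum()))  # number of pixels inside the "nodule"
```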

def create_samples(df_node, img_file, pic_path):
    mini_df = df_node[df_node["file"] == img_file]  # get all nodules associated with this file
    if mini_df.shape[0] > 0:  # some files may not have a nodule -- skip those

        # load the data once
        patient_id = img_file.split('/')[-1][:-4]
        itk_img = SimpleITK.ReadImage(data_path + img_file)
        img_array = SimpleITK.GetArrayFromImage(itk_img)  # indexes are z,y,x (notice the ordering)
        num_z, height, width = img_array.shape  # height x width constitute the transverse plane
        origin = np.array(itk_img.GetOrigin())    # x,y,z origin in world coordinates (mm)
        spacing = np.array(itk_img.GetSpacing())  # spacing of voxels in world coordinates (mm)
        # go through all nodules for this file
        for node_idx, cur_row in mini_df.iterrows():
            node_x = cur_row["coordX"]
            node_y = cur_row["coordY"]
            node_z = cur_row["coordZ"]
            diam = cur_row["diameter_mm"]
            center = np.array([node_x, node_y, node_z])  # nodule center
            v_center = np.rint(np.absolute(center - origin) / spacing)  # nodule center in voxel space (still x,y,z ordering)
            for i, i_z in enumerate(np.arange(int(v_center[2]) - 1,
                                              int(v_center[2]) + 2).clip(0, num_z - 1)):  # clip prevents going out of bounds in Z
                img = img_array[i_z]
                seg_img, overlap = helpers.get_segmented_lungs(img.copy())
                img = normalize(img)
                mask = make_mask(center, diam, i_z * spacing[2] + origin[2],
                                 width, height, spacing, origin)
                if img.shape[0] > 512:
                    print(patient_id)
                else:
                    cv2.imwrite(pic_path + str(patient_id) + '_' + str(node_idx) + '_' + str(i) + '_i.png', img * 255)
                    cv2.imwrite(pic_path + str(patient_id) + '_' + str(node_idx) + '_' + str(i) + '_m.png', mask * 255)
                    cv2.imwrite(pic_path + str(patient_id) + '_' + str(node_idx) + '_' + str(i) + '_o.png', overlap * 255)
    return

[Figure: example preprocessing output, http://aliyuntianchipublic.cn-hangzhou.oss-pub.aliyun-inc.com/public/files/image/1095279112958/1508725614302_eyq1N3Jfs1.jpg]

2. Training the Segmentation Model (Keras):

UNET paper: https://arxiv.org/abs/1505.04597

UNET architecture diagram:
[Figure: http://aliyuntianchipublic.cn-hangzhou.oss-pub.aliyun-inc.com/public/files/image/1095279112958/1508726041616_drdVaM2kYD.jpg]

Keras code:

        Dice is defined as the loss function. Each saved 512*512 image (img) + mask pair is fed into training, and Keras applies the same augmentations (rotation, flip, shift, etc.) to an image and its matching mask.

def dice_coef(y_true, y_pred):
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    intersection = K.sum(y_true_f * y_pred_f)
    return (2. * intersection + 1) / (K.sum(y_true_f) + K.sum(y_pred_f) + 1)

def dice_coef_loss(y_true, y_pred):
    return -dice_coef(y_true, y_pred)

def dice_coef_np(y_true, y_pred):
    y_true_f = y_true.flatten()
    y_pred_f = y_pred.flatten()
    intersection = np.sum(y_true_f * y_pred_f)
    return (2. * intersection + 1) / (np.sum(y_true_f) + np.sum(y_pred_f) + 1)
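A quick numeric check of the smoothed Dice (toy arrays, not project data): identical masks score exactly 1, and a completely empty prediction against a 4-pixel mask scores 1/5 because of the +1 smoothing terms.

```python
import numpy as np

def dice_coef_np(y_true, y_pred):
    # Smoothed Dice: (2*|A∩B| + 1) / (|A| + |B| + 1); the +1 avoids division by zero
    y_true_f = y_true.flatten()
    y_pred_f = y_pred.flatten()
    intersection = np.sum(y_true_f * y_pred_f)
    return (2. * intersection + 1) / (np.sum(y_true_f) + np.sum(y_pred_f) + 1)

a = np.zeros((4, 4)); a[1:3, 1:3] = 1.0    # 4-pixel square mask
print(dice_coef_np(a, a))                  # perfect overlap: (2*4+1)/(4+4+1) = 1.0
print(dice_coef_np(a, np.zeros((4, 4))))   # empty prediction: 1/(4+1) = 0.2
```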

def unet_model(dropout_rate, learn_rate, width):
    inputs = Input((1, 512, 512))
    conv1 = Convolution2D(width, (3, 3), padding="same", activation="elu")(inputs)
    conv1 = Convolution2D(width, (3, 3), padding="same", activation="elu")(conv1)
    pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)

    conv2 = Convolution2D(width*2, (3, 3), padding="same", activation="elu")(pool1)
    conv2 = Convolution2D(width*2, (3, 3), padding="same", activation="elu")(conv2)
    pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)

    conv3 = Convolution2D(width*4, (3, 3), padding="same", activation="elu")(pool2)
    conv3 = Convolution2D(width*4, (3, 3), padding="same", activation="elu")(conv3)
    pool3 = MaxPooling2D(pool_size=(2, 2))(conv3)

    conv4 = Convolution2D(width*8, (3, 3), padding="same", activation="elu")(pool3)
    conv4 = Convolution2D(width*8, (3, 3), padding="same", activation="elu")(conv4)
    pool4 = MaxPooling2D(pool_size=(2, 2))(conv4)

    conv5 = Convolution2D(width*16, (3, 3), padding="same", activation="elu")(pool4)
    conv5 = Convolution2D(width*16, (3, 3), padding="same", activation="elu")(conv5)

    up6 = merge([UpSampling2D(size=(2, 2))(conv5), conv4], mode='concat', concat_axis=1)
    conv6 = SpatialDropout2D(dropout_rate)(up6)
    conv6 = Convolution2D(width*8, (3, 3), padding="same", activation="elu")(conv6)
    conv6 = Convolution2D(width*8, (3, 3), padding="same", activation="elu")(conv6)

    up7 = merge([UpSampling2D(size=(2, 2))(conv6), conv3], mode='concat', concat_axis=1)
    conv7 = SpatialDropout2D(dropout_rate)(up7)
    conv7 = Convolution2D(width*4, (3, 3), padding="same", activation="elu")(conv7)
    conv7 = Convolution2D(width*4, (3, 3), padding="same", activation="elu")(conv7)

    up8 = merge([UpSampling2D(size=(2, 2))(conv7), conv2], mode='concat', concat_axis=1)
    conv8 = SpatialDropout2D(dropout_rate)(up8)
    conv8 = Convolution2D(width*2, (3, 3), padding="same", activation="elu")(conv8)
    conv8 = Convolution2D(width*2, (3, 3), padding="same", activation="elu")(conv8)

    up9 = merge([UpSampling2D(size=(2, 2))(conv8), conv1], mode='concat', concat_axis=1)
    conv9 = SpatialDropout2D(dropout_rate)(up9)
    conv9 = Convolution2D(width, (3, 3), padding="same", activation="elu")(conv9)
    conv9 = Convolution2D(width, (3, 3), padding="same", activation="elu")(conv9)
    conv10 = Convolution2D(1, (1, 1), activation="sigmoid")(conv9)

    model = Model(input=inputs, output=conv10)
    #model.summary()
    model.compile(optimizer=Adam(lr=learn_rate), loss=dice_coef_loss, metrics=[dice_coef])
    #model.compile(optimizer=SGD(lr=learn_rate, momentum=0.9, nesterov=True), loss=dice_coef_loss, metrics=[dice_coef])

    #plot_model(model, to_file='model1.png', show_shapes=True)
    return model
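A side note on shapes (plain arithmetic, not project code): with the base width of 32 used below, the four 2x2 poolings shrink the 512x512 input to 32x32 at the bottleneck while the channel count doubles at each level, matching the width*16 = 512 filters of conv5.

```python
# Feature-map bookkeeping for the encoder path of the UNET above
size, channels = 512, 32  # input resolution and base width
for level in range(1, 5):
    size //= 2       # each MaxPooling2D halves the spatial resolution
    channels *= 2    # each encoder level doubles the filter count
    print("level %d: %dx%d, %d channels" % (level, size, size, channels))
# last line printed: level 4: 32x32, 512 channels (the conv5 bottleneck)
```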

def unet_fit(name, check_name=None):
    from keras.preprocessing.image import ImageDataGenerator
    data_gen_args = dict(rotation_range=90.,
                         width_shift_range=0.3,
                         height_shift_range=0.3,
                         horizontal_flip=True,
                         vertical_flip=True)
    # unpack the dict so each augmentation option is passed as a keyword argument
    image_datagen = ImageDataGenerator(**data_gen_args)
    mask_datagen = ImageDataGenerator(**data_gen_args)

    # Provide the same seed and keyword arguments to the fit and flow methods
    seed = 1

    image_generator = image_datagen.flow_from_directory(
        src,
        class_mode=None,
        classes=['lung'],
        seed=seed,
        target_size=(512, 512),
        color_mode="grayscale",
        batch_size=1)

    mask_generator = mask_datagen.flow_from_directory(
        src,
        class_mode=None,
        classes=['nodule'],
        seed=seed,
        target_size=(512, 512),
        color_mode="grayscale",
        batch_size=1)

    # combine generators into one which yields images and masks
    train_generator = itertools.izip(image_generator, mask_generator)
    t = time.time()
    callbacks = [EarlyStopping(monitor='val_loss', patience=40,
                               verbose=1),
                 ModelCheckpoint(model_paths + '{}.h5'.format(name),
                                 monitor='val_loss',
                                 verbose=0, save_best_only=True)]

    if check_name is not None:
        check_model = model_paths + '{}.h5'.format(check_name)
        model = load_model(check_model,
                           custom_objects={'dice_coef_loss': dice_coef_loss, 'dice_coef': dice_coef})
    else:
        model = unet_model(dropout_rate=0.35, learn_rate=1e-5, width=32)
    model.fit_generator(
        train_generator,
        epochs=300,
        verbose=1,
        callbacks=callbacks,
        steps_per_epoch=256,
        validation_data=train_generator,
        nb_val_samples=48)
    return

3. Applying the Trained Model to the Training Set:

        The training set's mhd files have already been converted to npy files. The trained model is run over the training set to obtain candidate nodule coordinates, which are then checked against the training set's annotation (anno) table; falsely detected nodules become the negative samples for the classification model.

def pred_samples(src, img_file, model):
    patient_id = img_file[:-9]
    img_array = np.load(src + img_file)
    pos_annos = pd.read_csv(src + img_file[:-9] + '_annos_pos.csv')
    origin = np.array([pos_annos.loc[0]['origin_x'], pos_annos.loc[0]['origin_y'], pos_annos.loc[0]['origin_z']])
    spacing = np.array([pos_annos.loc[0]['spacing_x'], pos_annos.loc[0]['spacing_y'], pos_annos.loc[0]['spacing_z']])
    img_array_new = np.zeros_like(img_array)
    for i in range(img_array.shape[0]):
        img = img_array[i]
        #img = skimage.morphology.binary_closing(np.squeeze(img), np.ones([3,3]))
        seg_img, overlap = helpers.get_segmented_lungs(img.copy())
        overlap = skimage.morphology.binary_opening(np.squeeze(overlap), np.ones([5, 5]))
        img = normalize(img) * 255
        img = np.expand_dims(img, 0)
        img = np.expand_dims(img, 0)
        p = model.predict(img)
        img_array_new[i] = p * overlap
    img_array_new[img_array_new < 0.5] = 0
    np.save('{}{}{}.npy'.format(src, patient_id, str('_pred')), img_array_new)
    return

Test results:

        A 2D UNET cannot match a 3D one, and extracting candidate coordinates with a morphological opening plus a probability threshold introduces further error. So in practice the UNET output was compared against the annotation table: for each candidate I computed the ratio of the distance between the candidate position and the true nodule position to the nodule diameter; candidates within 0.5 can generally be considered high quality. Results on the test set:

ratio < 0.125:  67.68%
ratio < 0.25:   82.15%
ratio < 0.5:    86.50%
ratio < 1:      88.42%
ratio < 2:      91.24%
ratio < 4:      97.19%
ratio < 8:      98.71%

With 0.5 as the threshold:

Recall:     86.5%
Precision:  2.64%

4. Training the Classification Model:

        At the negative-sample coordinates produced by the segmentation model and at the original positive nodule coordinates, 32*32*32 patches are extracted and fed to the classification model for training, which outputs the final probability that a patch is a nodule.

def get_net(input_shape, load_weight_path=None, features=False, mal=False):
    width = 64
    inputs = Input(shape=(1, 32, 32, 32), name="input_1")
    x = inputs
    x = AveragePooling3D(pool_size=(2, 1, 1), strides=(2, 1, 1), border_mode="same")(x)
    #x = BatchNormalization(axis=1)(x)
    x = Convolution3D(width, 3, 3, 3, activation='relu', border_mode='same', name='conv1', subsample=(1, 1, 1))(x)
    x = MaxPooling3D(pool_size=(1, 2, 2), strides=(1, 2, 2), border_mode='valid', name='pool1')(x)

    # 2nd layer group
    #x = BatchNormalization(axis=1)(x)
    x = Convolution3D(width*2, 3, 3, 3, activation='relu', border_mode='same', name='conv2', subsample=(1, 1, 1))(x)
    x = MaxPooling3D(pool_size=(2, 2, 2), strides=(2, 2, 2), border_mode='valid', name='pool2')(x)
    x = Dropout(p=0.3)(x)

    # 3rd layer group
    #x = BatchNormalization(axis=1)(x)
    x = Convolution3D(width*4, 3, 3, 3, activation='relu', border_mode='same', name='conv3a', subsample=(1, 1, 1))(x)
    #x = BatchNormalization(axis=1)(x)
    x = Convolution3D(width*4, 3, 3, 3, activation='relu', border_mode='same', name='conv3b', subsample=(1, 1, 1))(x)
    x = MaxPooling3D(pool_size=(2, 2, 2), strides=(2, 2, 2), border_mode='valid', name='pool3')(x)
    x = Dropout(p=0.4)(x)

    # 4th layer group
    #x = BatchNormalization(axis=1)(x)
    x = Convolution3D(width*8, 3, 3, 3, activation='relu', border_mode='same', name='conv4a', subsample=(1, 1, 1))(x)
    #x = BatchNormalization(axis=1)(x)
    x = Convolution3D(width*8, 3, 3, 3, activation='relu', border_mode='same', name='conv4b', subsample=(1, 1, 1))(x)
    x = MaxPooling3D(pool_size=(2, 2, 2), strides=(2, 2, 2), border_mode='valid', name='pool4')(x)
    x = Dropout(p=0.5)(x)

    last64 = Convolution3D(64, 2, 2, 2, activation="relu", name="last_64")(x)
    out_class = Convolution3D(1, 1, 1, 1, activation="softmax", name="out_class_last")(last64)
    # note: the line below overwrites the convolutional head above; the pooled features are flattened instead
    out_class = Flatten(name="out_class")(x)

    out_class = Dense(2)(out_class)
    #out_class = BatchNormalization(axis=1)(out_class)
    out_class = Activation('softmax')(out_class)

    model = Model(input=inputs, output=out_class)
    model.compile(optimizer=SGD(lr=1e-3, momentum=0.9, nesterov=True), loss='categorical_crossentropy', metrics=['accuracy'])
    #model.compile(optimizer=Adam(lr=1e-3), loss='categorical_crossentropy', metrics=['accuracy'])
    return model

Since the model is quite simple, training is short; after convergence the classification model reaches roughly 80% accuracy on the test set, and the final preliminary-round FROC score was 0.54.

5. Summary:

        The code above finished the competition in a simple, brute-force way and is not elegant, but it should be a good starting point for newcomers to this competition: its strengths are simplicity and reproducibility. On top of it, you can try lung-region segmentation, 3D segmentation models, better classification models, XGBoost, and so on to improve the results.

        I had never worked on deep learning for medicine before. Thanks to Alibaba, Intel, and 零氪科技 (LinkDoc) for providing a large number of high-quality annotated samples and hosting this competition, which gave me the chance to try solving medical problems with deep learning. Auditing the final also taught me a lot from the top teams' solutions. I am now rebuilding the project code and hope to keep iterating toward better results.

        My email: 185309642@qq.com

(End)

Appendix:

        The code below was posted early in the competition to help newcomers visualize the data. It is kept here at the end to give everyone an intuitive feel for the whole dataset.

1. Load common libraries

from __future__ import print_function, division

import SimpleITK as sitk
import numpy as np
import csv
from glob import glob
import pandas as pd
import os
import scipy.ndimage
from tqdm import tqdm  # progress bar
from skimage import measure, morphology
from mpl_toolkits.mplot3d.art3d import Poly3DCollection

import matplotlib.pyplot as plt
%matplotlib inline

import matplotlib.animation as animation
from IPython.display import HTML

import warnings  # suppress noisy warnings
warnings.filterwarnings("ignore")

2. Define folders

2.1 Locations of the data and the csv files; modify as needed

luna_path = './'
luna_subset_path = luna_path + 'sample_patients/'
file_list = glob(luna_subset_path + "*.mhd")
df_node = pd.read_csv(luna_path + 'csv_files/' + 'annotations.csv')
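annotations.csv references scans by seriesuid while file_list holds full paths, so the two usually need to be joined before use. A small helper in the style of the LUNA tutorials (the name get_filename and the toy paths are my own illustration, not from the original post):

```python
def get_filename(file_list, seriesuid):
    # Return the first path whose filename contains the seriesuid, else None
    for f in file_list:
        if seriesuid in f:
            return f
    return None

# toy example with made-up paths
paths = ['sample_patients/LKDS-00001.mhd', 'sample_patients/LKDS-00002.mhd']
print(get_filename(paths, 'LKDS-00002'))  # sample_patients/LKDS-00002.mhd
```

It would typically be applied as df_node['file'] = df_node['seriesuid'].map(lambda s: get_filename(file_list, s)), assuming the csv has a seriesuid column.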

2.2 Locations of temporary and output files

output_path = luna_path + 'npy/'
working_path = luna_path + 'output/'

2.3 Check that the folders exist and create them if not

if os.path.isdir(luna_path + '/npy'):
    pass
else:
    os.mkdir(luna_path + '/npy')

if os.path.isdir(luna_path + '/output'):
    pass
else:
    os.mkdir(luna_path + '/output')
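As a side note, on Python 3 each if/else block above can be collapsed into a single os.makedirs call with exist_ok=True (a suggestion of mine, not the original code); the temporary directory here just stands in for luna_path:

```python
import os
import tempfile

base = tempfile.mkdtemp()  # stand-in for luna_path in this sketch
for sub in ('npy', 'output'):
    # exist_ok=True makes the call a no-op when the folder already exists (Python 3.2+)
    os.makedirs(os.path.join(base, sub), exist_ok=True)
print(sorted(os.listdir(base)))  # ['npy', 'output']
```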

3. Define some functions

To read the full article, visit the official Tianchi tech forum post: 天池医疗AI大赛[第一季] Rank22解决方案:适合新人的工程指南
