Image Segmentation Code Analysis

Image Segmentation Overview
Image segmentation can be done with traditional methods or with deep learning. This post gives a brief overview of deep-learning-based segmentation methods and then walks through one segmentation codebase; I will keep updating it as I learn more.
Most mainstream segmentation models are fully convolutional networks based on U-Net, and much of the research builds improvements on top of this framework. The U-Net architecture itself is widely documented online and easy to look up.
Image segmentation resembles classification, with one difference: classification assigns a single numeric label to the features the network extracts, while segmentation outputs a feature map whose label is a mask. The mask is usually a binary image in which the target region is 255 and the background is 0, so every pixel carries its own label, and segmentation is effectively per-pixel binary classification.
The loss is usually computed between each pair of corresponding pixels in the predicted map and the mask, then summed and averaged. The most common choice is the cross-entropy loss (derived from KL divergence), and some researchers propose improvements on top of it.
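As a concrete illustration of the per-pixel averaging described above, here is a toy NumPy sketch (the Keras code later simply passes loss='binary_crossentropy'; the `pixelwise_bce` helper and the 2x2 arrays are made up for this example):

```python
import numpy as np

def pixelwise_bce(pred_map, mask, eps=1e-7):
    """Mean binary cross-entropy over every pixel.

    pred_map: predicted probabilities in [0, 1], shape (H, W)
    mask: binary ground truth, shape (H, W), values 0 or 1
    """
    p = np.clip(pred_map, eps, 1 - eps)  # avoid log(0)
    losses = -(mask * np.log(p) + (1 - mask) * np.log(1 - p))
    return losses.mean()                 # sum over pixels, then average

# Toy 2x2 example: a confident, mostly-correct prediction
pred = np.array([[0.9, 0.1], [0.8, 0.2]])
mask = np.array([[1.0, 0.0], [1.0, 0.0]])
loss = pixelwise_bce(pred, mask)
```

Each pixel contributes its own cross-entropy term, which is why the mask can be treated as a per-pixel binary label.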

Now for the code walkthrough. After a day of reading and analysing the code, my takeaway is: do not be intimidated by long blocks of code. Split them into small pieces, step through them one at a time, look up any unfamiliar syntax, and write a small snippet to test it until it makes sense.
The dataset for this experiment comes from Kaggle - 2018 Data Science Bowl.
The first part imports the modules, and a few of them are worth noting. First, the warnings module can suppress warnings that pop up even though the program runs fine, which makes for a more pleasant session. Second, the tqdm module adds a progress bar so you can see how much longer the program will run. Finally, chain (from itertools) links several iterables together so they can be processed as one.
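A quick stand-alone demonstration of two of these modules (warnings and chain; tqdm only appears in a comment, since it just wraps any iterable):

```python
import warnings
from itertools import chain

# Silence a specific warning category rather than all warnings
warnings.filterwarnings('ignore', category=UserWarning)

# chain joins several iterables into one stream without copying up front
merged = list(chain([1, 2], (3, 4), range(5, 7)))

# tqdm wraps any iterable to show a progress bar, e.g.:
# for x in tqdm(items, total=len(items)): ...
```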

import os
import sys
import random
import warnings
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf
from keras.backend.tensorflow_backend import set_session
from tqdm import tqdm
from itertools import chain
from skimage.io import imread, imshow, imread_collection, concatenate_images
from skimage.transform import resize
from skimage.morphology import label

from keras.models import Model, load_model
from keras.layers import Input
from keras.layers.core import Lambda
from keras.layers.convolutional import Conv2D, Conv2DTranspose
from keras.layers.pooling import MaxPooling2D
from keras.layers.merge import concatenate
from keras.callbacks import EarlyStopping, ModelCheckpoint
from keras import backend as K

This part reads the data. Note the use of next, which reads the sub-directory names under a folder in a single call; the actual traversal is done by os.walk.

# Set some parameters
IMG_WIDTH = 128
IMG_HEIGHT = 128
IMG_CHANNELS = 3
TRAIN_PATH = 'F:/segment/stage1_train/'
TEST_PATH = 'F:/segment/stage1_test/'

warnings.filterwarnings('ignore', category=UserWarning, module='skimage')
seed = 42
random.seed(seed)     # note: `random.seed = seed` only rebinds the name; call the function
np.random.seed(seed)

# Get train and test IDs
train_ids = next(os.walk(TRAIN_PATH))[1]  # 670
test_ids = next(os.walk(TEST_PATH))[1]    # 65
# os.walk generates the file names in a directory tree, walking it top-down or bottom-up.
# For each directory in the tree rooted at the top (including the top itself), it yields a
# 3-tuple (dirpath, dirnames, filenames); filenames lists the non-directory files.
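To see exactly what next(os.walk(...)) returns, here is a self-contained sketch that builds a throwaway directory tree (the id_001 name is made up, mimicking the dataset layout):

```python
import os
import tempfile

# Build a tiny directory tree resembling one training sample
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, 'id_001', 'images'))
os.makedirs(os.path.join(root, 'id_001', 'masks'))
open(os.path.join(root, 'id_001', 'note.txt'), 'w').close()

# os.walk yields (dirpath, dirnames, filenames); next() takes only the top level
dirpath, dirnames, filenames = next(os.walk(root))
subdirs = next(os.walk(os.path.join(root, 'id_001')))
```

So index [1] gives the sub-directory names (the sample IDs) and index [2] gives the file names, which is exactly how the training loop enumerates mask files.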

This part preprocesses the data. The training images are stored in a single array of shape (len(train_ids), IMG_HEIGHT, IMG_WIDTH, IMG_CHANNELS). Each training label is given as a folder of masks, and each mask covers only a small part of the region to segment, so all the masks in a label folder must be overlaid to produce the actual label for that training image. The test images are resized the same way, with their original sizes recorded so predictions can later be mapped back.

def data_generator(train_ids,test_ids):
    # Get and resize train images and masks
    X_train = np.zeros((len(train_ids), IMG_HEIGHT, IMG_WIDTH, IMG_CHANNELS), dtype=np.uint8)
    Y_train = np.zeros((len(train_ids), IMG_HEIGHT, IMG_WIDTH, 1), dtype=bool)  # np.bool is removed in modern NumPy
    print('Getting and resizing train images and masks ... ')
 #   sys.stdout.flush()
    for n, id_ in tqdm(enumerate(train_ids), total=len(train_ids)):
        path = TRAIN_PATH + id_
        img = imread(path + '/images/' + id_ + '.png')[:,:,:IMG_CHANNELS]
        img = resize(img, (IMG_HEIGHT, IMG_WIDTH), mode='constant', preserve_range=True)  # resize to 128x128
        X_train[n] = img
        mask = np.zeros((IMG_HEIGHT, IMG_WIDTH, 1), dtype=bool)
        for mask_file in next(os.walk(path + '/masks/'))[2]:
            mask_ = imread(path + '/masks/' + mask_file)
            mask_ = np.expand_dims(resize(mask_, (IMG_HEIGHT, IMG_WIDTH), mode='constant', 
                                          preserve_range=True), axis=-1)
            mask = np.maximum(mask, mask_)  # overlay all masks under this path by taking the per-pixel maximum
        Y_train[n] = mask

    # Get and resize test images
    X_test = np.zeros((len(test_ids), IMG_HEIGHT, IMG_WIDTH, IMG_CHANNELS), dtype=np.uint8)
    sizes_test = []
    print('Getting and resizing test images ... ')
    sys.stdout.flush()
    for n, id_ in tqdm(enumerate(test_ids), total=len(test_ids)):
        path = TEST_PATH + id_
        img = imread(path + '/images/' + id_ + '.png')[:,:,:IMG_CHANNELS]
        sizes_test.append([img.shape[0], img.shape[1]])
        img = resize(img, (IMG_HEIGHT, IMG_WIDTH), mode='constant', preserve_range=True)
        X_test[n] = img

    print('Data Generate Done!')
    return X_train,Y_train,X_test,sizes_test

X_train,Y_train,X_test,sizes_test = data_generator(train_ids,test_ids)
# X_train (670, 128, 128, 3)
# Y_train (670, 128, 128, 1)
# X_test  (65, 128, 128, 3)
# sizes_test (65,2)
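The np.maximum overlay in the loop above is a per-pixel union of binary masks. A minimal sketch with two 2x2 toy masks (made up for illustration):

```python
import numpy as np

# Each mask file covers only some of the nuclei; overlaying them with the
# element-wise maximum unions the binary masks pixel by pixel.
mask_a = np.array([[1, 0], [0, 0]], dtype=np.uint8)
mask_b = np.array([[0, 0], [0, 1]], dtype=np.uint8)

combined = np.zeros((2, 2), dtype=np.uint8)
for m in (mask_a, mask_b):
    combined = np.maximum(combined, m)
```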

This part defines the evaluation metric. The post uses IoU (intersection over union), averaged over binarisation thresholds from 0.5 to 0.95 in steps of 0.05. Note the TF1-style APIs (tf.to_int32, tf.metrics.mean_iou): this code requires TensorFlow 1.x.

# Define IoU metric
def mean_iou(y_true, y_pred):
    prec = []
    for t in np.arange(0.5, 1.0, 0.05):
        y_pred_ = tf.to_int32(y_pred > t)
        score, up_opt = tf.metrics.mean_iou(y_true, y_pred_, 2)
        K.get_session().run(tf.local_variables_initializer())
        with tf.control_dependencies([up_opt]):
            score = tf.identity(score)
        prec.append(score)
    return K.mean(K.stack(prec), axis=0)
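To make the metric concrete outside a TF1 session, here is a single-threshold IoU in plain NumPy (the `iou_at_threshold` helper is made up for illustration; the Keras metric above averages this over thresholds 0.5 to 0.95):

```python
import numpy as np

def iou_at_threshold(y_true, y_prob, t=0.5):
    """IoU of the foreground class after binarising y_prob at threshold t."""
    y_pred = y_prob > t
    y_true = y_true.astype(bool)
    inter = np.logical_and(y_true, y_pred).sum()
    union = np.logical_or(y_true, y_pred).sum()
    return inter / union if union else 1.0  # two empty masks agree perfectly

y_true = np.array([[1, 1], [0, 0]])
y_prob = np.array([[0.9, 0.4], [0.6, 0.1]])
# binarised prediction: [[1, 0], [1, 0]] -> intersection 1, union 3
```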

This part builds the U-Net architecture.

def UNetModel(IMG_HEIGHT,IMG_WIDTH,IMG_CHANNELS):
    inputs = Input((IMG_HEIGHT, IMG_WIDTH, IMG_CHANNELS))
    s = Lambda(lambda x: x / 255) (inputs)

    c1 = Conv2D(8, (3, 3), activation='relu', padding='same') (s)
    c1 = Conv2D(8, (3, 3), activation='relu', padding='same') (c1)
    p1 = MaxPooling2D((2, 2)) (c1)

    c2 = Conv2D(16, (3, 3), activation='relu', padding='same') (p1)
    c2 = Conv2D(16, (3, 3), activation='relu', padding='same') (c2)
    p2 = MaxPooling2D((2, 2)) (c2)

    c3 = Conv2D(32, (3, 3), activation='relu', padding='same') (p2)
    c3 = Conv2D(32, (3, 3), activation='relu', padding='same') (c3)
    p3 = MaxPooling2D((2, 2)) (c3)

    c4 = Conv2D(64, (3, 3), activation='relu', padding='same') (p3)
    c4 = Conv2D(64, (3, 3), activation='relu', padding='same') (c4)
    p4 = MaxPooling2D(pool_size=(2, 2)) (c4)

    c5 = Conv2D(128, (3, 3), activation='relu', padding='same') (p4)
    c5 = Conv2D(128, (3, 3), activation='relu', padding='same') (c5)

    u6 = Conv2DTranspose(64, (2, 2), strides=(2, 2), padding='same') (c5)
    u6 = concatenate([u6, c4])
    c6 = Conv2D(64, (3, 3), activation='relu', padding='same') (u6)
    c6 = Conv2D(64, (3, 3), activation='relu', padding='same') (c6)

    u7 = Conv2DTranspose(32, (2, 2), strides=(2, 2), padding='same') (c6)
    u7 = concatenate([u7, c3])
    c7 = Conv2D(32, (3, 3), activation='relu', padding='same') (u7)
    c7 = Conv2D(32, (3, 3), activation='relu', padding='same') (c7)

    u8 = Conv2DTranspose(16, (2, 2), strides=(2, 2), padding='same') (c7)
    u8 = concatenate([u8, c2])
    c8 = Conv2D(16, (3, 3), activation='relu', padding='same') (u8)
    c8 = Conv2D(16, (3, 3), activation='relu', padding='same') (c8)

    u9 = Conv2DTranspose(8, (2, 2), strides=(2, 2), padding='same') (c8)
    u9 = concatenate([u9, c1], axis=3)
    c9 = Conv2D(8, (3, 3), activation='relu', padding='same') (u9)
    c9 = Conv2D(8, (3, 3), activation='relu', padding='same') (c9)

    outputs = Conv2D(1, (1, 1), activation='sigmoid') (c9)

    model = Model(inputs=[inputs], outputs=[outputs])
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=[mean_iou])
    return model

model = UNetModel(IMG_HEIGHT,IMG_WIDTH,IMG_CHANNELS)
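A quick sanity check on the spatial sizes: with a 128x128 input, the four 2x2 poolings give 64, 32, 16 and 8 at the bottleneck, and each stride-2 Conv2DTranspose doubles the map back up so the skip connections can be concatenated. The `unet_sizes` helper below is just illustrative arithmetic:

```python
def unet_sizes(side, depth=4):
    """Spatial side length at each encoder level, ending at the bottleneck."""
    sizes = [side]
    for _ in range(depth):
        side //= 2          # each 2x2 max-pool halves height and width
        sizes.append(side)
    return sizes

encoder = unet_sizes(128)                      # sizes going down
decoder = [s * 2 for s in encoder[::-1][:-1]]  # stride-2 transpose convs double back
```

This is why u6 can be concatenated with c4, u7 with c3, and so on: each upsampled map exactly matches the size of its encoder counterpart.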

Next the model is trained. One thing I noticed in this read-through: with Keras as the main framework, the labels passed to fit can be a full matrix (here, a stack of masks) rather than a vector of class indices.

def model_fit(X_train,Y_train,model_name,epochs,batch_size,validation_split):
    earlystopper = EarlyStopping(patience=5, verbose=1)
    checkpointer = ModelCheckpoint(model_name, verbose=1, save_best_only=True)
    results = model.fit(X_train, Y_train, validation_split=validation_split, 
                        batch_size=batch_size, epochs=epochs, 
                        callbacks=[earlystopper, checkpointer])
    return results  # the History object, so training curves can be inspected

model_name = 'model-dsbowl2018-1.h5'
epochs = 30
batch_size = 8
validation_split = 0.1
model_fit(X_train,Y_train,model_name,epochs,batch_size,validation_split)
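After training, the original repo predicts on X_test and resizes each 128x128 probability map back to the original image size recorded in sizes_test (using skimage.transform.resize). A simplified sketch of that step, using a nearest-neighbour resize as a stand-in for skimage (the `upsample_nearest` and `restore_masks` helpers are made up for illustration):

```python
import numpy as np

def upsample_nearest(pred, out_h, out_w):
    """Nearest-neighbour resize of a 2-D probability map (a simple stand-in
    for the skimage.transform.resize call used in the original code)."""
    in_h, in_w = pred.shape
    rows = np.arange(out_h) * in_h // out_h   # source row for each output row
    cols = np.arange(out_w) * in_w // out_w   # source column for each output column
    return pred[rows][:, cols]

def restore_masks(preds, sizes, threshold=0.5):
    """Map each prediction back to its recorded original size, then binarise."""
    return [upsample_nearest(np.squeeze(p), h, w) > threshold
            for p, (h, w) in zip(preds, sizes)]

preds = np.array([[[[0.9], [0.2]], [[0.1], [0.8]]]])  # (1, 2, 2, 1) toy batch
masks = restore_masks(preds, [(4, 4)])
```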

The repository also covers the experimental results, the dataset, and visualisation of the outputs; see the link below. The material and code in this post come from:
https://github.com/mattzheng/U-Net-Demo
