U-Net: Convolutional Networks for Biomedical Image Segmentation (Understanding, Part 1)

A Translation of "U-Net: Convolutional Networks for Biomedical Image Segmentation"

This post is a plain translation of the paper U-Net: Convolutional Networks for Biomedical Image Segmentation. For a more detailed interpretation, please see:

https://blog.csdn.net/aliyanah_/article/details/90113304

U-Net: Convolutional Networks for Biomedical Image Segmentation
Abstract. There is large consent that successful training of deep networks requires many thousand annotated training samples. In this paper, we present a network and training strategy that relies on the strong use of data augmentation to use the available annotated samples more efficiently. The architecture consists of a contracting path to capture context and a symmetric expanding path that enables precise localization. We show that such a network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks. Using the same network trained on transmitted light microscopy images (phase contrast and DIC) we won the ISBI cell tracking challenge 2015 in these categories by a large margin. Moreover, the network is fast. Segmentation of a 512x512 image takes less than a second on a recent GPU. The full implementation (based on Caffe) and the trained networks are available at http://lmb.informatik.uni-freiburg.de/people/ronneber/u-net.
Abstract (translation): It is widely accepted that successful training of deep networks requires many thousands of annotated training samples. In this paper, we present a network and a training strategy that relies on heavy data augmentation (i.e., increasing the number of images through rotation, random cropping, scaling, etc.) to use the available annotated samples more efficiently. The architecture consists of a contracting path that captures context and a symmetric expanding path that enables precise localization. Our results show that such a network can be trained end to end from very few images and outperforms the previous best method (a sliding-window convolutional network) in the ISBI challenge for segmentation of neuronal structures in electron microscopy stacks. Using the same network (i.e., our network, not the sliding-window one) trained on transmitted light microscopy images (phase contrast and DIC), we won the 2015 ISBI cell tracking challenge in these categories by a large margin. Moreover, the network is fast: segmenting a 512x512 image takes less than a second on a recent GPU. The full implementation (based on Caffe) and the trained networks are available at http://lmb.informatik.uni-freiburg.de/people/ronneber/u-net.
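The "strong use of data augmentation" stressed in the abstract can be illustrated with a minimal sketch. The helper below is hypothetical, not part of the paper's Caffe implementation (which relies mainly on elastic deformations); it only shows how a handful of rigid transforms turns one annotated sample into eight training variants:

```python
import numpy as np

def augment(image: np.ndarray) -> list[np.ndarray]:
    """Generate simple augmented variants of a (H, W) image:
    the four 90-degree rotations and their horizontal flips.
    (The paper additionally uses elastic deformations, omitted here.)"""
    variants = []
    for k in range(4):
        rotated = np.rot90(image, k)   # rotate by k * 90 degrees
        variants.append(rotated)
        variants.append(np.fliplr(rotated))  # mirrored copy
    return variants

sample = np.arange(16).reshape(4, 4)
augmented = augment(sample)
# 8 variants obtained from a single labelled sample
```

For segmentation, the same transform must of course be applied to the label map as well, so that each augmented image stays aligned with its per-pixel annotation.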
1 Introduction
In the last two years, deep convolutional networks have outperformed the state of the art in many visual recognition tasks, e.g. [7,3]. While convolutional networks have already existed for a long time [8], their success was limited due to the size of the available training sets and the size of the considered networks. The breakthrough by Krizhevsky et al. [7] was due to supervised training of a large network with 8 layers and millions of parameters on the ImageNet dataset with 1 million training images. Since then, even larger and deeper networks have been trained [12].
Translation: In the last two years, deep convolutional networks have shown outstanding performance in many visual recognition tasks, e.g. [7,3]. Although convolutional networks have existed for a long time [8], their success was limited by the size of the available training sets and the size of the networks that could be considered. The breakthrough of Krizhevsky et al. [7] came from supervised training of an 8-layer network with millions of parameters on the ImageNet dataset of 1 million training images. Since then, even larger and deeper networks have been trained [12].
The typical use of convolutional networks is on classification tasks, where the output to an image is a single class label. However, in many visual tasks, especially in biomedical image processing, the desired output should include localization, i.e., a class label is supposed to be assigned to each pixel. Moreover, thousands of training images are usually beyond reach in biomedical tasks. Hence, Ciresan et al. [1] trained a network in a sliding-window setup to predict the class label of each pixel by providing a local region (patch) around that pixel as input. First, this network can localize. Secondly, the training data in terms of patches is much larger than the number of training images. The resulting network won the EM segmentation challenge at ISBI 2012 by a large margin.
Translation: The typical use of convolutional networks is on classification tasks, where the output for an image is a single class label. However, in many visual tasks, especially in biomedical image processing, the desired output should include localization, i.e., a class label assigned to each pixel. Moreover, for medical imaging tasks, collecting thousands of training images is usually out of reach. Hence, Ciresan et al. [1] trained a sliding-window network that takes the local region (a patch) around each pixel as input to predict that pixel's class label. First, this network can localize; second, the training data in terms of patches is far larger than the number of training images. (My understanding: the image is split into many patches, which are then fed into the network for training.) This network won the EM segmentation challenge at ISBI 2012 by a large margin.
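The sliding-window setup of Ciresan et al. can be sketched as follows. This is a hypothetical illustration, not their code: one patch is extracted around every pixel (with mirrored borders), so a single image yields as many training samples as it has pixels, at the cost of heavy overlap between neighbouring patches:

```python
import numpy as np

def sliding_window_patches(image: np.ndarray, patch_size: int) -> np.ndarray:
    """Extract one (patch_size x patch_size) patch centred on every pixel.
    Borders are mirror-padded; patch_size is assumed odd.
    Hypothetical helper for illustration, not the authors' code."""
    half = patch_size // 2
    padded = np.pad(image, half, mode="reflect")  # mirror the border
    h, w = image.shape
    patches = np.empty((h * w, patch_size, patch_size), dtype=image.dtype)
    idx = 0
    for y in range(h):
        for x in range(w):
            # the patch whose centre pixel is image[y, x]
            patches[idx] = padded[y:y + patch_size, x:x + patch_size]
            idx += 1
    return patches

img = np.arange(25, dtype=float).reshape(5, 5)
patches = sliding_window_patches(img, 3)
# one patch per pixel: 25 patches of shape (3, 3)
```

Neighbouring patches share almost all of their pixels, which is exactly the redundancy (and slowness) the paper criticizes in the next paragraph.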
[Figure 1] U-net architecture (lowest image resolution: 32x32). Each blue box corresponds to a multi-channel feature map; the number of channels is written on top of the box, the x-y size is given at the lower-left edge of the box, and the different arrows denote the different operations.

Obviously, the strategy in Ciresan et al. [1] has two drawbacks. First, it is quite slow because the network must be run separately for each patch, and there is a lot of redundancy due to overlapping patches.
