Analysis of image dataset checking results (image segmentation experiment)

Why we should check a dataset before running experiments on it

1. Although datasets from different organizations are protected by copyright, part of their data may come from open-source datasets. So we cannot guarantee that datasets from different organizations contain no shared data until we check and prove it.
2. Extra
So what counts as a pair of identical data? In medical imaging, are images of the same person taken at different time points the same or not? And why shouldn't we use machine learning methods to analyze and check this?
Condition:
If we have trained a 3D end-to-end segmentation network, then we have in effect already extracted point-wise foreground/background classification features.


Using SPP (Spatial Pyramid Pooling) to obtain position-related features

A brief introduction to the SPP feature extraction method:
A layer of SPP looks like this:


Concatenating the five one-dimensional feature vectors, each reshaped from the feature maps at one of five semantic levels of the VNet, gives the final feature vector, which has 36208 dimensions.
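The pooling and concatenation step can be sketched as follows. This is only a minimal NumPy illustration, not the code used in the experiment: the pyramid levels (1, 2, 4), the use of max pooling, the (channels, depth, height, width) layout and the function names `spp_3d` and `concat_spp_features` are all assumptions, and the sketch does not try to reproduce the 36208-dimensional vector.

```python
import numpy as np


def spp_3d(feature_map, levels=(1, 2, 4)):
    """Max-pool a (C, D, H, W) feature map into a fixed-length vector.

    `levels` is an assumed pyramid; each level n splits the volume
    into n x n x n bins and keeps the per-channel maximum of each bin.
    """
    channels = feature_map.shape[0]
    sizes = feature_map.shape[1:]
    pooled = []
    for n in levels:
        # Bin edges use floor for the start and ceil for the end,
        # so no bin is empty even when an axis is shorter than n.
        starts = [[(i * s) // n for i in range(n)] for s in sizes]
        ends = [[-(-((i + 1) * s) // n) for i in range(n)] for s in sizes]
        for i in range(n):
            for j in range(n):
                for k in range(n):
                    cell = feature_map[
                        :,
                        starts[0][i]:ends[0][i],
                        starts[1][j]:ends[1][j],
                        starts[2][k]:ends[2][k],
                    ]
                    pooled.append(cell.reshape(channels, -1).max(axis=1))
    return np.concatenate(pooled)


def concat_spp_features(feature_maps, levels=(1, 2, 4)):
    """Concatenate the SPP vectors of several feature maps (one per level)."""
    return np.concatenate([spp_3d(f, levels) for f in feature_maps])
```

The output length depends only on the channel count and the pyramid levels, not on the spatial size of the input, which is why CT volumes of different sizes end up with feature vectors of the same length.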

Because the SPP feature extraction method is robust to transformations of the feature map, it lets us measure the distance between two samples or analyze the distribution of a given dataset.


Using dimensionality reduction to analyze the data distribution

In total, 221 CT scans ((131+70 CodaLab) + (20 sliver07)) were used in this experiment.
We use PCA to decorrelate the linearly correlated features and reduce the dimension.
After this operation, 220 dimensions are left. The first 20 eigenvalues and a Matlab visualization of the first 3 principal components are shown below:
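For readers who want to reproduce the PCA step outside Matlab, here is a minimal NumPy sketch based on the SVD of the centered feature matrix. The function name `pca` and its arguments are assumptions made for illustration. Centering n samples leaves at most n-1 directions with non-zero variance, which is consistent with 220 dimensions remaining for 221 scans.

```python
import numpy as np


def pca(features, n_components=None):
    """PCA via SVD on a (n_samples, n_features) matrix.

    Returns the projected samples (scores) and the eigenvalues of the
    covariance matrix, sorted in descending order.
    """
    n_samples, n_features = features.shape
    centered = features - features.mean(axis=0)
    # Thin SVD: cheap even when n_features is much larger than n_samples.
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    eigenvalues = s ** 2 / (n_samples - 1)
    if n_components is None:
        # Centered data has rank at most n_samples - 1.
        n_components = min(n_samples - 1, n_features)
    scores = u[:, :n_components] * s[:n_components]
    return scores, eigenvalues[:n_components]


# Stand-in data; the experiment would use the 221 x 36208 SPP features.
rng = np.random.default_rng(0)
demo_features = rng.normal(size=(12, 50))
demo_scores, demo_eigenvalues = pca(demo_features)
first_three = demo_scores[:, :3]  # coordinates for a 3D scatter plot
```

The first three columns of the scores are the coordinates one would plot to inspect the distribution of the scans.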

Evaluation:
Knowing the distribution of the samples helps when generating or discriminating unknown samples. So maybe we can use it to generate new samples, and then we could train a stronger CNN and get more accurate segmentation results.


Using a distance matrix to find similar data

1. Calculate the distance matrix (221x221).
2. Reshape it to one dimension and sort it (221x(221-1)/2 entries) in ascending order.
3. The result is below (first 50):
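The three steps above can be sketched in NumPy as shown here. The post does not say which distance metric was used, so Euclidean distance is an assumption, and the function name `ranked_pairs` is hypothetical. Only the upper triangle of the matrix is kept, which gives the n(n-1)/2 unique pairs (24310 for 221 scans).

```python
import numpy as np


def ranked_pairs(features, top_k=50):
    """Return the distance matrix and the top_k closest sample pairs.

    Assumes Euclidean distance between rows of `features`.
    """
    n_samples = features.shape[0]
    # Squared distances via the Gram matrix, which avoids building an
    # (n, n, d) array when the feature dimension is large.
    squared_norms = np.sum(features ** 2, axis=1)
    squared = squared_norms[:, None] + squared_norms[None, :] - 2.0 * features @ features.T
    dist = np.sqrt(np.maximum(squared, 0.0))  # clip tiny negative round-off
    # Keep each unordered pair once: the strict upper triangle.
    rows, cols = np.triu_indices(n_samples, k=1)
    pair_dist = dist[rows, cols]
    order = np.argsort(pair_dist)[:top_k]  # ascending order
    pairs = [(int(rows[o]), int(cols[o]), float(pair_dist[o])) for o in order]
    return dist, pairs


# Tiny stand-in example: samples 1 and 3 are near-duplicates.
points = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 0.1], [5.0, 5.0]])
distance_matrix, closest = ranked_pairs(points, top_k=3)
```

Pairs at the head of the list are the candidates worth opening and comparing by eye, as described in the conclusion.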

Conclusion:
After reading these volumes, I found that some CT scans are very similar to each other. They come from the same person but were scanned at different times (1-33).

