Human Protein Image Classification

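This project uses deep learning to classify protein patterns in microscope images, specifically to predict protein organelle localization labels. The article covers data pre-processing (data loading, normalization, and augmentation), model selection (MobileNet), and the training and prediction process, and explains the key code and the theoretical basis of each step.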

Table of Contents

Introduction

1. Data Pre-processing

1.1 Data Loading and Preparation

1.2 Mean and Variance Normalization

1.3 Data Augmentation

2. Model Selection and construction

2.1 MobileNet

2.2 Model Construction

3. Model training and prediction

3.1 Loss function

3.2 Optimizer[5]

3.3 Performance evaluation function

3.4 Model compile

3.5 Training

3.6 Prediction

References


Introduction

The goal of this project is to classify mixed patterns of proteins in microscope images; more specifically, to predict the protein organelle localization labels of each sample. The dataset is provided by The Human Protein Atlas and contains 28 different labels, listed in Table 1. Each sample is represented by four filter images: the protein of interest (green) and three cellular landmarks, namely nucleus (blue), microtubules (red), and endoplasmic reticulum (yellow). The green filter is therefore the one used to predict the labels, while the other filters serve as references [1].

Table 1. Categories and corresponding labels

Label Category
0 Nucleoplasm
1 Nuclear membrane
2 Nucleoli
3 Nucleoli fibrillar center
4 Nuclear speckles
5 Nuclear bodies
6 Endoplasmic reticulum
7 Golgi apparatus
8 Peroxisomes
9 Endosomes
10 Lysosomes
11 Intermediate filaments
12 Actin filaments
13 Focal adhesion sites
14 Microtubules
15 Microtubule ends
16 Cytokinetic bridge
17 Mitotic spindle
18 Microtubule organizing center
19 Centrosome
20 Lipid droplets
21 Plasma membrane
22 Cell junctions
23 Mitochondria
24 Aggresome
25 Cytosol
26 Cytoplasmic bodies
27 Rods & rings
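
Because a sample can show a mixed pattern, it may carry several of the 28 labels at once. The sketch below shows one common way to represent such a target as a multi-hot vector. It assumes the labels of a sample arrive as a space-separated string of label indices (an assumption about the CSV layout, which is not shown in this article); the function name `encode_labels` and the example string are hypothetical.

```python
NUM_CLASSES = 28  # labels 0..27, as listed in Table 1


def encode_labels(target, num_classes=NUM_CLASSES):
    """Turn a space-separated label string such as "16 0" into a multi-hot list.

    Assumption: labels are given as integer indices separated by spaces.
    """
    vector = [0] * num_classes
    for token in target.split():
        label = int(token)
        if not 0 <= label < num_classes:
            raise ValueError(f"label {label} is outside 0..{num_classes - 1}")
        vector[label] = 1
    return vector


# Hypothetical sample with two labels: Cytokinetic bridge (16) and Nucleoplasm (0).
example = encode_labels("16 0")
print(sum(example), example[0], example[16])  # prints: 2 1 1
```

A vector of this form pairs naturally with one sigmoid output per class, since each label is predicted independently of the others.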

The code below loads one filter image of a sample; Figure 1 shows the four filter images of the same sample. Each image is read with shape (512, 512, 3).

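import cv2
import matplotlib.pyplot as plt
# Load the blue (nucleus) filter image of one training sample
test_img_sample = cv2.imread("../input/human-protein-atlas-image-classification/train/7030d472-bb9a-11e8-b2b9-ac1f6b6435d0_blue.png")
plt.imshow(test_img_sample)
test_img_sample.shape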

Figure 1. (a) Yellow filter, (b) Blue filter, (c) Green filter, (d) Red filter
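
Since every sample consists of four filter images, it can help to build all four file paths from a single sample ID. The following is a minimal sketch, not the author's code: the `_<colour>.png` suffix is inferred from the `_blue.png` file used above and is assumed to hold for the other three filters, and the helper name `filter_paths` is hypothetical.

```python
import os

# Filter colours named in the Introduction. The "_<colour>.png" suffix is an
# assumption generalized from the "_blue.png" file loaded earlier.
FILTERS = ("red", "green", "blue", "yellow")


def filter_paths(base_dir, sample_id, filters=FILTERS):
    """Return a dict mapping each filter colour to the image path of one sample."""
    return {
        color: os.path.join(base_dir, f"{sample_id}_{color}.png")
        for color in filters
    }


paths = filter_paths(
    "../input/human-protein-atlas-image-classification/train",
    "7030d472-bb9a-11e8-b2b9-ac1f6b6435d0",
)
for color, path in paths.items():
    print(color, os.path.basename(path))
```

Each path could then be passed to `cv2.imread`, and the resulting arrays stacked along a new last axis (for example with `np.stack`) if a single multi-channel input is wanted.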

1. Data Pre-processing

In machine learning, the data and features determine the upper limit of the achievable results; model selection and optimization only approach that limit. Feature engineering removes impurities and redundancy from the raw data so that the model can fit quickly. Unlike traditional machine learning, deep learning does not require complex high-level features to be crafted by hand. Human prior knowledge is needed only to process first-order features, and the network then learns the relevant complex features by itself.

1.1 Data Loading and Preparation

The images and their corresponding targets are stored in the following format:

import pandas as pd
# Directory with the training images and the CSV file holding their targets
path_to_train = '../input/human-protein-atlas-image-classification/train'
data = pd.read_csv('../input/human-protein-atlas-image-classification/train.csv')
print(data)

Convert the data into an array using the following code:

train_dataset_info = []
for name, la