Sample labels in caffe/tensorflow must start from index 0 -- caffe study notes (15)

code_Rocker


Article link: https://blog.csdn.net/u014381600/article/details/54319030


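While running experiments over the past couple of days, I kept wondering why other posts, and my own earlier experiments, all label the samples starting from 0 by default. In an earlier fine-tuning run my samples were labeled 3, 4, 5, 6, 7 and the loss never started to converge. I later read in another post that labels must start from 0, and once I relabeled the samples as 0, 1, 2, 3, 4 the loss did converge. This post records that experience and tries to verify it.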
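The relabeling step described above (3, 4, 5, 6, 7 becoming 0, 1, 2, 3, 4) can be sketched as follows. This is only an illustration, assuming the usual "image path, then integer label" line format of a list file; the helper name `remap_labels` and the file names are hypothetical and do not come from the original experiment.

```python
def remap_labels(lines):
    """Remap the integer labels of 'path label' lines to 0..C-1.

    Labels are ranked in ascending order, so 3,4,5,6,7 becomes 0,1,2,3,4.
    Returns the rewritten lines and the old-to-new label mapping.
    """
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        path, label = line.rsplit(None, 1)  # split on the last run of whitespace
        records.append((path, int(label)))

    old_labels = sorted({label for _, label in records})
    mapping = {old: new for new, old in enumerate(old_labels)}
    remapped = ["%s %d" % (path, mapping[label]) for path, label in records]
    return remapped, mapping


# Hypothetical file names, for illustration only.
example = ["img/a.jpg 3", "img/b.jpg 7", "img/c.jpg 5", "img/d.jpg 4", "img/e.jpg 6"]
new_lines, mapping = remap_labels(example)
print(mapping)
print(new_lines)
```

Keeping the returned mapping around makes it possible to translate predicted indices back to the original label values later.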

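This is the content of the test.txt file with labels 0, 1, 2, 3, 4:
(image not available)
First, the training result for the samples labeled from 0:

This is the content of the test.txt file with labels 3, 4, 5, 6, 7:
(image not available)
This is the training result for the test.txt with labels 3, 4, 5, 6, 7:

Judging from these results, the outcome does not seem to be affected when fine-tuning.
Because the fine-tuning started from a BN model pre-trained on ImageNet, performance is very good, with a recognition rate of 1.
Now drop the pre-trained model and train the network from scratch, to see whether the result is affected.
Training again starts with the samples labeled 0, 1, 2, 3, 4; the only difference from the earlier setup is that the network is trained from scratch instead of being fine-tuned.
The result is as follows:
(image not available)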

Update: I found the corresponding rationale in the source code.
In the caffe.proto file:


message AccuracyParameter {
  // When computing accuracy, count as correct by comparing the true label to
  // the top k scoring classes.  By default, only compare to the top scoring
  // class (i.e. argmax).
  optional uint32 top_k = 1 [default = 1];

  // The "label" axis of the prediction blob, whose argmax corresponds to the
  // predicted label -- may be negative to index from the end (e.g., -1 for the
  // last axis).  For example, if axis == 1 and the predictions are
  // (N x C x H x W), the label blob is expected to contain N*H*W ground truth
  // labels with integer values in {0, 1, ..., C-1}.
  optional int32 axis = 2 [default = 1];

  // If specified, ignore instances with the given label.
  optional int32 ignore_label = 3;
}

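The labels the prediction expects by default:
labels with integer values in {0, 1, ..., C-1}.
In other words, with C classes the labels run from 0 to C-1, so when labeling your own samples remember to start from 0!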
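To make the {0, 1, ..., C-1} requirement concrete, here is a small pure-Python sketch of a softmax cross-entropy that uses the label as an index into the C class scores. It illustrates the idea only and is not Caffe's actual implementation; the function names and the scores are made up for the example. Under this rule, labels 3 to 7 are only valid indices when the last layer has at least 8 outputs.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(scores, label):
    """Negative log-probability of the true class.

    The label is used directly as an index, so it must lie in 0..C-1.
    """
    num_classes = len(scores)
    if not 0 <= label < num_classes:
        raise ValueError("label %d is outside 0..%d" % (label, num_classes - 1))
    return -math.log(softmax(scores)[label])


scores = [0.1, 0.2, 3.0, 0.3, 0.4]  # C = 5 classes, class 2 scores highest

print(cross_entropy(scores, 2))  # valid label: a small loss
try:
    cross_entropy(scores, 7)  # 7 is not a valid index when C = 5
except ValueError as err:
    print(err)
```

The explicit range check is a design choice of this sketch: it turns a silent mislabeling into an immediate error.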

Continued in the next post: Analysis of the loss function code in caffe -- caffe study notes (16)

A similar discussion

UPDATE, July 6, 2017:

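I have recently been looking at how tensorflow is used and found that it is actually similar to caffe. However, if the labels have all been one-hot encoded, the number the labels start from does not really matter, because the loss is computed the same way afterwards: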

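# TensorFlow 1.x API. n_input, n_classes, learning_rate and pred (the network's
# logits, shape [None, n_classes]) are defined elsewhere in the script.
x = tf.placeholder(tf.float32, [None, n_input])
y = tf.placeholder(tf.float32, [None, n_classes])  # one-hot labels
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

# Evaluate model: compare the predicted class index with the position of the 1
# in each one-hot label
correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))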

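Note the line correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1)) here.
y is your own label after one-hot encoding. Suppose there are three classes: whether they were labeled 0, 1, 2 or 2, 3, 4 makes no difference, because the first class becomes 100 after one-hot encoding either way (this assumes the encoding is built from the position of each class, not from its raw value), and the class index returned by tf.argmax also starts at index 0.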
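The claim that labels 0, 1, 2 and 2, 3, 4 lead to the same one-hot vectors can be checked with a small pure-Python sketch. It assumes the one-hot encoding is built from the rank of each class among the distinct labels; the two helper names are hypothetical. For contrast, the second helper encodes by raw label value with a fixed depth, which as far as I know is how tf.one_hot behaves: out-of-range values become all-zero rows.

```python
def one_hot_by_rank(labels):
    """One-hot encode labels by their rank among the distinct label values."""
    classes = sorted(set(labels))
    index = {c: i for i, c in enumerate(classes)}
    return [[1 if index[label] == j else 0 for j in range(len(classes))]
            for label in labels]


def one_hot_by_value(labels, depth):
    """One-hot encode labels by raw value; values outside 0..depth-1 give all zeros."""
    return [[1 if label == j else 0 for j in range(depth)] for label in labels]


from_zero = one_hot_by_rank([0, 1, 2, 1])
from_two = one_hot_by_rank([2, 3, 4, 3])
print(from_zero == from_two)             # True: the starting number does not matter
print(one_hot_by_value([2, 3, 4], 3))    # [[0, 0, 1], [0, 0, 0], [0, 0, 0]]
```

So the "any starting number works" statement holds for rank-based encoding only; with value-based encoding, labels 2, 3, 4 and three classes would lose two of the classes.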
But if your labels have not been one-hot encoded while the network output has shape [None, n_classes], then the comparison becomes:

correct_pred = tf.equal(tf.argmax(pred, 1),y)

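Here tf.argmax again returns class indices that start at index 0. Since the labels are not one-hot encoded but plain integers, they must also start from 0; otherwise the comparison gives wrong results.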
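The effect on the accuracy computation can be reproduced without tensorflow. The following pure-Python sketch mimics the argmax comparison on made-up scores for three classes; the numbers and helper names are illustrative only.

```python
def argmax(row):
    """Index of the largest score in a row (indices start at 0)."""
    return row.index(max(row))


def accuracy(pred, labels):
    """Fraction of rows whose argmax equals the integer label."""
    hits = [argmax(row) == label for row, label in zip(pred, labels)]
    return sum(hits) / len(hits)


# Made-up scores: each row clearly favours class 0, 1 and 2 respectively.
pred = [
    [2.0, 0.1, 0.3],
    [0.2, 1.5, 0.1],
    [0.0, 0.3, 2.2],
]

acc_from_zero = accuracy(pred, [0, 1, 2])  # labels start at 0
acc_from_two = accuracy(pred, [2, 3, 4])   # same classes, labeled from 2
print(acc_from_zero, acc_from_two)         # 1.0 0.0
```

The predictions are identical in both cases; only the label offset changes, and it is enough to drive the measured accuracy from 1.0 to 0.0.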

Conclusion:

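Whether you use caffe or tensorflow, make sure your labels start from 0 when doing classification, so that you do not have to think about this problem when you extend your work later. With tensorflow, as long as the labels are one-hot encoded you can get away with not starting from 0, or even with arbitrary label values, but labeling from 0 is still the safest approach!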
