give up
competition questions and data
guangdong_defect_instruction_20180916.xlsx
guangdong_round1_submit_sample_20180916.csv
guangdong_round1_test_a_20180916.zip
guangdong_round1_train1_20180903.zip
Solutions
Using Kaggle cat-and-dog classification code,
even with deeper networks (ResNet50, Inception V3,
Xception) to extract image features,
followed by a DNN for classification,
the validation set showed over-fitting. Kaggle cat and dog classification
ResNet50
resnetv2-50
tensorflow.Keras use Resnet50 to realize CatDogDistinguish
pretrained-models
Competition approach
Direct image classification: select a network to extract features, followed by a fully connected layer for classification, plus regularization to reduce over-fitting. Then unfreeze all layers for training. The final accuracy is about 0.92; in fact, as long as the default parameters are not wildly off, tuning them does not have much impact on the results.
select a network to extract features
Competition solution 2: standard DenseNet with a 12-way softmax classifier;
applied data augmentation;
tried tuning learning_rate,
batch_size, and num_layers.
SHARED BASELINE
WIN10+anaconda+Pytorch 0.4+CUDA 8.0
PyTorch GPU: activate the pytorch conda environment (`activate pytorch`)
pip install http://download.pytorch.org/whl/cu80/torch-0.4.0-cp36-cp36m-win_amd64.whl
pip install torchvision
test:
import torch
print(torch.__version__)
torch.cuda.is_available()
ResNet
Residual Network
Inception
ResNeXt
Xception
inception
beginning; also the title of the film "Inception"
Xception
Extreme Inception
verify CUDA
nvcc -V
Round 2
No. | Topic |
---|---|
1 | Train an object detector with TensorFlow |
2 | Object detection |
3 | single shot multibox detector |
Term | Meaning |
---|---|
re | regular expression |
anchor | SPP(spatial pyramid pooling) |
spatial pyramid pooling | a pooling layer that produces fixed-length features from variable-size inputs |
VGG | 224*224 |
SPP | spatial pyramid pooling |
mlp | multilayer perceptron |
ffm | Field-aware Factorization Machine |
CTR | click-through rate |
CVR | conversion rate |
conversion | transformation |
SGD | Stochastic Gradient Descent |
ICPR | International Conference on Pattern Recognition |
Stochastic Gradient Descent | updates parameters from one (or a few) randomly sampled examples per step |
GD | gradient descent (full-batch GD is slow and memory-hungry on large datasets) |
click rate prediction algorithm | click-through-rate (CTR) prediction |
CVR | post-click conversion rate |
LR | Logistic Regression |
GBDT | Gradient Boosting Decision Tree |
FM | Factorization Machine |
factorization | 因式分解 |
time complexity | cost as a function of input size |
Bagging | bootstrap aggregating |
AdaBoost | a boosting method; contrast bagging with boosting |
bagging | bagging mainly reduces variance |
boosting | boosting mainly reduces bias |
bias and variance | N-fold Cross Validation |
N-fold Cross Validation | split the training set into N folds, e.g. N = 3 |
FFM | Field-aware Factorization Machine |
TP | True Positive |
FP | False positive |
TN | True negative |
FN | False negative |
TPR | $\frac{TP}{TP+FN}$ |
FPR | false positive rate, $\frac{FP}{FP+TN}$ |
recall | the fraction of actual positive samples that are retrieved |
precision | \(P = \frac{TP}{TP+FP}\) |
Precision and recall
In machine-learning model evaluation, precision and recall are a pair of mutually constraining performance metrics. For a binary classification problem, the samples themselves are either positive or negative, and the classifier's predictions are likewise positive or negative. Because of the data, the algorithm, and other factors, the classifier's predictions rarely match the true labels of the test samples exactly, so metrics are needed to characterize its performance; the most common are precision and recall.
Precision and recall are defined very clearly, but their names are confusing (especially "recall"), and many people mix the two concepts up. In Zhou Zhihua's "watermelon book", the two terms are rendered so that their names explain themselves: precision asks, of everything the classifier judged positive, what proportion is truly positive, written as P = TP / (TP + FP), where TP and FP are true positives and false positives. Recall asks, of the actual positive samples, what proportion was detected, written as R = TP / (TP + FN), where FN is false negatives. (Note that in true/false positive/negative, positive/negative refers to the classifier's prediction, while true/false indicates whether that prediction matches reality.) Although those renderings are clever, the field generally still uses the names precision and recall. To make the two easy to remember and distinguish, consider the following scenario:
Suppose a car company suddenly discovers that a certain batch of cars already sold is defective. The factory publishes a test method and asks all owners (assume no owner knows their car's batch) to check whether their own car is affected, and to return the defective cars, i.e. a recall. Because the test is too crude, some owners of normal cars conclude their car is defective, while some owners of defective cars believe theirs is fine. Thus the cars returned to the factory include both TP and FP, and the cars still on the road include both TN and FN. Now the factory wants to know how many of the defective cars in this batch were actually recalled: R = TP / (TP + FN), where FN is the defective cars that were never returned; hence this ratio is naturally called the recall. Meanwhile, the engineers who designed the test want to know how precise it is, so they compute how many of the returned cars are truly defective: P = TP / (TP + FP).
With this scenario, the difference between the two metrics and how to apply them can be understood intuitively, from the names precision and recall, rather than only from the mathematical definitions.
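The car-recall scenario can be checked numerically; the counts below are made up for illustration.

```python
def precision_recall(tp, fp, fn):
    """P = TP / (TP + FP); R = TP / (TP + FN)."""
    return tp / (tp + fp), tp / (tp + fn)

# Illustration: 90 defective cars returned (TP), 10 healthy cars returned
# by mistake (FP), 30 defective cars never returned (FN).
p, r = precision_recall(tp=90, fp=10, fn=30)
# p = 90/100 = 0.90  (of the returned cars, how many are truly defective)
# r = 90/120 = 0.75  (of the defective cars, how many were recalled)
```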
cross entropy
Term | Meaning |
---|---|
F1 | \(F1 = \frac{2 \times precision \times recall}{precision + recall}\) |
JSON | JavaScript Object Notation |
notation | a system of written symbols |
curve | a curved line or arc |
AP | Average Precision |
mAP | mean Average Precision |
confidence score | how confident the detector is in a prediction |
ground truth label | |
Cross Entropy | \(H(p,q) = -\sum_i p_i \log q_i\) |
BoundingBox | bounding box |
RectClass | |
TestEvaluator | test evaluator |
Three levels of image understanding | classification |
Three levels of image understanding | Detection |
2-stage | Region-based |
R-CNN | 1.Region Proposal 2.AlexNet |
region proposal | candidate region |
ground truth | the IoU with ground truth |
SSD | Single Shot MultiBox Detector |
SPP | spatial pyramid pooling |
threshold | 0.5 |
RPN | Region Proposal Network |
anchor box | anchor |
bbox | Bounding Boxes |
Ground Truth | the labeled (true) box |
IoU | (R∩G)/(R∪G) |
Bounding-Box | Bounding-Box regression |
k anchor boxes | about 20k anchor boxes (e.g. a 60×40 feature map × 9 anchors) |
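The IoU row above, (R∩G)/(R∪G), for axis-aligned boxes in (x1, y1, x2, y2) form:

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# Two 2x2 boxes overlapping in a 1x1 square: IoU = 1 / (4 + 4 - 1) = 1/7
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
```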
Term | Meaning |
---|---|
ESS | ESS measures the amount of variance (information) in the model |
object mask | per-object segmentation mask |
mask | a map that marks the pixels of interest |
SSD | Single Shot MultiBox Detector |
one-stage | RefineDet |
RefineDet | Single-Shot Refinement Neural Network for Object Detection |
RefineDet | two-step cascaded regression |
deep_learning_object_detection | RefineDet |
binary mask | a mask with only 0/1 values |
bounding box | bounding box |
Non-Maximum Suppression | keep the highest-scoring box, suppress overlapping ones |
R-CNN | Region-based Convolutional Neural Networks |
Ground Truth | the annotated object window |
Faster R-CNN | input image → feature map → region proposals → per-region features → classification |
Non-Maximum Suppression | NMS |
mean shift | algorithm |
common IoU threshold | 0.3~0.5 |
RoI pooling | enables end-to-end training |
region proposal | ROI Pooling |
R-FCN | Region-based Fully Convolutional Networks |
ROI | regions of interest |
end-to-end training | train the whole pipeline jointly with a single loss |
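The NMS rows above describe greedy non-maximum suppression: keep the highest-scoring detection, discard remaining boxes whose IoU with it exceeds a threshold (commonly 0.3–0.5), and repeat. A minimal self-contained sketch:

```python
def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression over (x1, y1, x2, y2) boxes.

    Returns the indices of the kept boxes, highest score first.
    """
    def iou(a, b):
        iw = max(0, min(a[2], b[2]) - max(a[0], b[0]))
        ih = max(0, min(a[3], b[3]) - max(a[1], b[1]))
        inter = iw * ih
        union = ((a[2] - a[0]) * (a[3] - a[1])
                 + (b[2] - b[0]) * (b[3] - b[1]) - inter)
        return inter / union

    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best, order = order[0], order[1:]
        keep.append(best)
        # Drop any remaining box that overlaps the kept box too much.
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_thresh]
    return keep

# Three detections of the same object plus one far away:
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (0, 0, 9, 9), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7, 0.6]
print(nms(boxes, scores, iou_thresh=0.5))
```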
Term | Meaning |
---|---|
compat | compatibility |
COCO dataset | object instances |
PASCAL VOC | pascal |
JSON | info_licenses_images_annotations_categories |
Object Instance | info_licenses_images_categories_annotations |
annotation | a label attached to data |
RPN | Region Proposal Network |
Caffe | Convolutional Architecture for Fast Feature Embedding |
train | training data |
val | validation data |
val | the validation set can be used for evaluation because it is labeled |
RetinaNet | Focal Loss paper, ICCV 2017 best student paper |
RetinaNet | one-stage |
CE | cross-entropy error |
FL | Focal Loss |
Installing Detectron | pytorch |
labelme to coco | Object detection_data_interface |
.. | parent directory |
Absolute path | absolute path: a path starting from the drive root |
Absolute path | C:\windows\system32 |
vcs.xml | Do you want to add the following file to Git? |
background tasks | tasks running in the background |
update indices | update the index |
process finished with exit code 1 | the program exited with an error |
annotation | a note attached to code or data |
external file changes sync may be slow | syncing external file changes may be slow |
append() | append(2018) adds 2018 at the end: [..., 2018] |
Detectron's backbone | ResNeXt{50,101,152}, ResNet, FPN, VGG16 |
indexing | building an index |
compilation | |
val2017 | the 2017 validation split |
() | tuple datatype |
{} | dict |
[] | list,variable sequence |
range() | used with for ... in to iterate over a range of integers |
os.path.basename(imglist[i]) | gets the file name under the corresponding path |
imglist[i] | corresponding path |
list = [] | defines an empty list (note: this shadows the built-in list) |
list.sort() | sorts the list in ascending order, in place |
cv2.imread() | returns a numpy.ndarray (BGR); returns None if the file cannot be read |
rect = {} | a dict: rect = {key1: value1, key2: ...} |
background | counted in NUM_CLASSES |
Object Keypoint Annotations | annotation |
len(boxes) | if boxes[b][2]<boxes[b][0] |
result["filename"] = filename | result = {} |
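The FL row above refers to the focal loss from RetinaNet, FL(p_t) = -α_t (1 - p_t)^γ log(p_t), which down-weights easy examples relative to plain cross-entropy. A binary-classification sketch in PyTorch (γ = 2, α = 0.25 are the paper's defaults; the sample logits are made up):

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0, alpha=0.25):
    """Binary focal loss: FL(p_t) = -alpha_t * (1 - p_t)**gamma * log(p_t)."""
    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p = torch.sigmoid(logits)
    p_t = p * targets + (1 - p) * (1 - targets)            # prob. of true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    return (alpha_t * (1 - p_t) ** gamma * ce).mean()

logits = torch.tensor([2.0, -1.0, 0.5, -3.0])
targets = torch.tensor([1.0, 0.0, 1.0, 0.0])
loss = focal_loss(logits, targets)
```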
Adjusting parameters
No. | Note |
---|---|
1 | test function parameter |
2 | score (the confidence value in the JSON output) |
FPN | Feature Pyramid Network |
Loss Function | the error on a single sample |
MSE | mean squared error |
mean squared error | \(\frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2\) |
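The MSE rows above: mean squared error averages the squared per-sample errors. A plain-Python sketch:

```python
def mse(y_true, y_pred):
    """MSE = (1/n) * sum((y_i - yhat_i)**2)."""
    assert len(y_true) == len(y_pred)
    n = len(y_true)
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / n

print(mse([1, 2, 3], [1, 2, 5]))  # (0 + 0 + 4) / 3
```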