Fast Real Estate Image Classification Using Machine Learning (with Code)

Latest recommended article published on 2023-04-01 21:26:50

weixin_26746401

Tags: machine learning, Python, artificial intelligence, computer vision, deep learning

Original article: https://towardsdatascience.com/fast-real-estate-image-classification-using-machine-learning-with-code-32e0539eab96

RoomNet is a very lightweight (700 KB) and fast convolutional neural net that classifies pictures of the different rooms of a house or apartment with 88.9% validation accuracy over 1839 images. I wrote it in Python and TensorFlow.

Btw, I made the artwork and yes, I’m actually not in 3rd grade despite tempting popular opinions stemming from it.

This is a custom architecture I designed to classify an input image into one of the following 6 classes (in order of their class IDs):

Backyard-0, Bathroom-1, Bedroom-2, Frontyard-3, Kitchen-4, LivingRoom-5
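
As a minimal sketch of how the class IDs above map back to names, the snippet below picks the highest of six per-class scores. The function name `label_from_scores` and the score values are hypothetical; the article does not show how RoomNet exposes its outputs.

```python
# Class names in order of their class IDs, as listed in the article.
CLASS_NAMES = ["Backyard", "Bathroom", "Bedroom", "Frontyard", "Kitchen", "LivingRoom"]


def label_from_scores(scores):
    """Return (class_id, class_name) for the highest of six per-class scores.

    `scores` is assumed to be a sequence with one number per class,
    for example softmax probabilities (hypothetical interface).
    """
    if len(scores) != len(CLASS_NAMES):
        raise ValueError("expected one score per class")
    class_id = max(range(len(scores)), key=lambda i: scores[i])
    return class_id, CLASS_NAMES[class_id]


if __name__ == "__main__":
    # Made-up scores for illustration only.
    print(label_from_scores([0.10, 0.05, 0.60, 0.05, 0.10, 0.10]))
```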

Photo credits, clockwise from left: Chastity Cortijo on Unsplash (Bathroom), Roberto Nickson on Unsplash (Bedroom), Jason Briscoe on Unsplash (Kitchen), Shaun Montero on Unsplash (Backyard), Roberto Nickson on Unsplash (Living Room), Roberto Nickson on Unsplash (Frontyard)

Architecture Building Blocks

These blocks are used to construct the final neural net. They are composed of basic elemental neural layers such as convolution, average pooling, and batch normalization.
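
To make two of these elemental layers concrete, here is a rough pure-Python sketch of average pooling and batch normalization on small lists. It is not the author's TensorFlow code: the function names, the 2x2 window, and the `eps` default are assumptions chosen for illustration.

```python
import math


def average_pool_2x2(feature_map):
    """Average-pool a 2-D list with a 2x2 window and stride 2.

    Assumes the height and width of `feature_map` are even.
    """
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            (feature_map[r][c] + feature_map[r][c + 1]
             + feature_map[r + 1][c] + feature_map[r + 1][c + 1]) / 4.0
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]


def batch_norm(values, gamma=1.0, beta=0.0, eps=1e-5,
               moving_mean=None, moving_var=None):
    """Normalize a batch of scalars belonging to one channel.

    When `moving_mean` and `moving_var` are both given, those fixed
    statistics are used; otherwise the mean and variance are computed
    from the batch itself.
    """
    if moving_mean is None or moving_var is None:
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
    else:
        mean, var = moving_mean, moving_var
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in values]
```

The two modes of `batch_norm` correspond to normalizing with in-batch statistics versus normalizing with stored statistics.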

Full Network Architecture

The full neural net architecture, built from the building blocks described above, is presented below with the relevant arguments.

The code to train and deploy this can be found on GitHub at —

Out-of-box Inference

Optimized inference code is in infer.py. Refer to the short code in the main method, which calls the classify_im_dir method.

Training

Input image size = 224 x 224 (also tried 300 x 300 and 600 x 600).
Softmax cross-entropy loss used with L2 weight normalization.
Dropout varied from 0 (initially) to 0.3 (intermittently near the end of training). Dropout layers were placed after every block.
Batch Normalization moving means and variances were frozen when training with dropout.
Adam optimizer used with exponential learning rate decay.
Initially trained with in-batch computation of the BatchNorm moving means/variances. This was followed by further training with that computation disabled and the frozen means/variances used instead, which resulted in an immediate ~20% jump in validation accuracy (noticeable around training step 150,000). I’ll be publishing another article delving into this phenomenon shortly.
Batch size was increased over the course of training, from 8 at the beginning to 45 towards the end: 8 -> 32 -> 40 -> 45.
The asynchronous data reader uses a queue-based architecture that allows quick data I/O during training, even with large batch sizes.
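
The loss and learning-rate items in the list above can be sketched in plain Python as follows. This is an illustration under assumptions, not the training code: "L2 weight normalization" is read here as an L2 penalty on the weights, and every hyperparameter value (penalty scale, initial rate, decay steps, decay rate) is made up, since the article gives none. The decay formula is the standard non-staircase exponential schedule.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_cross_entropy(logits, true_class):
    """Cross-entropy loss for one example with an integer class label."""
    return -math.log(softmax(logits)[true_class])


def l2_penalty(weights, scale):
    """L2 penalty on a flat list of weights (assumed reading of the article)."""
    return scale * sum(w * w for w in weights)


def exponential_decay(initial_lr, step, decay_steps, decay_rate):
    """Learning rate after `step` steps: initial_lr * decay_rate ** (step / decay_steps)."""
    return initial_lr * decay_rate ** (step / decay_steps)


if __name__ == "__main__":
    logits = [0.2, -1.0, 2.5, 0.0, 0.3, -0.4]  # made-up logits for 6 classes
    loss = softmax_cross_entropy(logits, true_class=2) + l2_penalty([0.5, -0.25], scale=1e-4)
    print("loss:", loss)
    print("lr at step 150000:", exponential_decay(1e-3, 150000, 50000, 0.5))
```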

Conversion to Inference Optimized Version

Discarded all backpropagation/training-related compute nodes from the TensorFlow graph.
Model size reduced from ~2 MB to ~800 KB.
network.py contains the class defining the model, called “RoomNet”.
The output is an Excel file mapping each image path to its label. There is also a provision to split an input directory into directories corresponding to the class names and automatically place each image in its respective directory.
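
The directory-splitting provision described above might look roughly like the sketch below. It is not the code from infer.py: `split_by_label` and its parameters are hypothetical, the predictions are assumed to be already available as a mapping from image path to class ID, and a CSV report stands in for the Excel file so that only the standard library is needed.

```python
import csv
import shutil
from pathlib import Path

# Class names in order of their class IDs, as listed in the article.
CLASS_NAMES = ["Backyard", "Bathroom", "Bedroom", "Frontyard", "Kitchen", "LivingRoom"]


def split_by_label(predictions, output_dir, report_name="labels.csv"):
    """Copy each image into output_dir/<class name>/ and write a path-to-label report.

    `predictions` maps an image path (str) to a predicted class ID (int).
    Returns the path of the CSV report.
    """
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    report_path = output_dir / report_name
    with open(report_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["image_path", "label"])
        for image_path, class_id in sorted(predictions.items()):
            label = CLASS_NAMES[class_id]
            class_dir = output_dir / label
            class_dir.mkdir(exist_ok=True)
            # Copy rather than move, so the input directory stays untouched.
            shutil.copy2(image_path, class_dir / Path(image_path).name)
            writer.writerow([str(image_path), label])
    return report_path
```

Copying instead of moving is a deliberate choice in this sketch: a misclassified image can still be found in its original location.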

Training Environment

Training was done using TensorFlow + CUDA 10.0 + cuDNN on an NVIDIA GTX 1070 laptop-grade GPU with 8 GB of GPU memory.
The compute system used is an Alienware m17 r4 laptop.
The CPU is an Intel Core i7-6700HQ with 8 logical cores at a 2.6 GHz base speed (turbo boost to ~3.3 GHz).
The number of training steps from scratch to reach the best model is 157,700.
Time spent on training: ~48 hours.

Previous Approaches Tried

Tried training the final dense layers of NASNet Mobile, but accuracy never crossed 60%.
Tried the same with InceptionV3, but convergence took too damn long.

Performance Plots

Validation Class-wise F-Score

F-Score is the harmonic mean of precision and recall.
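
As a small worked example of that definition, the sketch below computes precision, recall, and F-score for each class from lists of true and predicted class IDs. The function names and the toy labels are illustrative only and are not the article's validation data.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def class_wise_metrics(true_ids, predicted_ids, num_classes):
    """Return a list of (precision, recall, f_score) tuples, one per class ID."""
    metrics = []
    pairs = list(zip(true_ids, predicted_ids))
    for c in range(num_classes):
        tp = sum(1 for t, p in pairs if t == c and p == c)  # true positives
        fp = sum(1 for t, p in pairs if t != c and p == c)  # false positives
        fn = sum(1 for t, p in pairs if t == c and p != c)  # false negatives
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        metrics.append((precision, recall, f_score(precision, recall)))
    return metrics


if __name__ == "__main__":
    # Toy labels for two classes.
    for class_id, row in enumerate(class_wise_metrics([0, 0, 1], [0, 1, 1], 2)):
        print(class_id, row)
```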

Validation Class-wise Precision

Validation Class-wise Recall

If you found this helpful, feel free to follow me for more upcoming articles :)

I’m the editor of the following publication which publishes Tech articles related to the usage of AI & ML in digital mapping of the Earth. Feel free to follow to stay updated :)