Synthetic data: Simulating myriad possibilities to train robust machine learning models

culiao6493

Tags: machine learning, artificial intelligence, deep learning, python, computer vision

Original link: https://blogs.unity3d.com/2020/05/01/synthetic-data-simulating-myriad-possibilities-to-train-robust-machine-learning-models/

Synthetic data helps many organizations overcome the challenge of acquiring labeled data needed for training machine learning models. This blog kicks off our series on synthetic data for training perception systems. In this first post, we will provide a brief overview of synthetic data and the breadth of use cases it enables.

Almost every industry has been touched by the promise of Machine Learning (ML) over the last few years. However, gathering high-quality labeled data to train ML models continues to be a major challenge. A recent survey found 96% of enterprises encounter training data quality and labeling challenges in machine learning projects.

Factors that make it difficult for an organization to collect sufficient labeled data necessary for robust ML models include:

Privacy and regulatory concerns. For example, video footage within a retail store may contain facial information, which is considered personally identifiable and is governed by various regulations.
Non-exhaustive examples of real-world scenarios, leading to selection bias in ML models. For example, a dataset containing examples of an object in a single pose may lead to an ML model performing poorly on the same object seen in a very different pose.
Corner cases that are rare, expensive, or dangerous to recreate in real life. For example, unusual obstacles or extreme weather conditions on a road that an autonomous vehicle may not have seen during training.

Synthetic data is emerging as an answer to many of these challenges. Researchers at OpenAI (Tobin et al., 2017) and Google (Hinterstoisser et al., 2019) have successfully demonstrated the efficacy of synthetic data for real-world tasks such as object detection. With advances in graphics processing and the falling cost of scalable computing, it has become possible to use the same tools and systems that power immersive video games and movies to simulate photorealistic synthetic images of the real world. Here we illustrate an example of creating highly realistic images of household products from 3D objects using the Unity engine.
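One way to picture how simulation addresses the pose and corner-case gaps listed earlier is to sample scene parameters at random for every rendered image, in the spirit of the domain randomization work by Tobin et al. (2017). The following is a minimal sketch in plain Python, not Unity's API: the `SceneParams` fields, the value ranges, and the weather names are all assumptions chosen for illustration, and an actual renderer would consume these parameters to produce images.

```python
import random
from dataclasses import dataclass

# Hypothetical weather conditions a simulator might support.
WEATHER = ("clear", "rain", "fog", "snow")


@dataclass(frozen=True)
class SceneParams:
    """Illustrative per-image scene settings (assumed, not a Unity type)."""
    yaw_deg: float          # object rotation around the vertical axis
    pitch_deg: float        # object tilt
    light_intensity: float  # relative brightness of the main light
    weather: str


def sample_scene(rng: random.Random) -> SceneParams:
    """Draw one random scene so that no single pose or condition dominates."""
    return SceneParams(
        yaw_deg=rng.uniform(0.0, 360.0),
        pitch_deg=rng.uniform(-90.0, 90.0),
        light_intensity=rng.uniform(0.2, 1.5),
        weather=rng.choice(WEATHER),
    )


def sample_dataset(n: int, seed: int = 0) -> list:
    """Sample n scenes reproducibly from a fixed seed."""
    rng = random.Random(seed)
    return [sample_scene(rng) for _ in range(n)]


scenes = sample_dataset(1000)
```

Because rare conditions such as fog or snow are drawn as often as clear weather here, the resulting dataset covers cases that would be expensive or dangerous to capture with a real camera. The sampling distribution is a design choice and could be skewed toward whichever conditions a model handles worst.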

3D assets in Unity*

A photorealistic synthetic image generated from 3D models*

Labeled synthetic images*

Images generated this way come labeled at no extra cost and can be used as training data for computer vision algorithms. These algorithms can be applied to real-world applications spanning multiple industry verticals, some of which are illustrated below.
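The labels come for free because the simulator already knows where every object is. As a rough illustration, the sketch below projects the corners of a 3D box through a simple pinhole camera and takes the extremes as a 2D bounding-box label. It is an assumption-laden toy, not how Unity computes labels: the box is axis-aligned in camera coordinates, and `focal_px`, `image_size`, and all function names are hypothetical.

```python
from itertools import product


def project_point(point, focal_px, cx, cy):
    """Project a camera-space 3D point to pixel coordinates (pinhole model)."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    return (focal_px * x / z + cx, focal_px * y / z + cy)


def box_corners(center, size):
    """Return the 8 corners of an axis-aligned box given center and size."""
    cx, cy, cz = center
    sx, sy, sz = size
    return [
        (cx + dx * sx / 2, cy + dy * sy / 2, cz + dz * sz / 2)
        for dx, dy, dz in product((-1, 1), repeat=3)
    ]


def bounding_box_label(name, center, size, focal_px=500.0, image_size=(640, 480)):
    """Derive a 2D bounding-box label from the object's known 3D placement."""
    width, height = image_size
    pixels = [
        project_point(p, focal_px, width / 2, height / 2)
        for p in box_corners(center, size)
    ]
    xs = [p[0] for p in pixels]
    ys = [p[1] for p in pixels]
    # Clamp to the image so partially visible objects still get a valid box.
    bbox = (
        max(0.0, min(xs)),
        max(0.0, min(ys)),
        min(float(width), max(xs)),
        min(float(height), max(ys)),
    )
    return {"label": name, "bbox": bbox}


# A 1 x 1 x 1 box centered 2.5 units in front of the camera.
annotation = bounding_box_label("cereal_box", (0.0, 0.0, 2.5), (1.0, 1.0, 1.0))
```

No human annotator is involved at any point: the annotation is computed from the same scene description that produced the image, which is why it adds no labeling cost.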

Robotic automation in warehouse & logistics

Tasks such as pick-and-place and depalletization of heavy objects in environments such as e-commerce distribution centers are not only repetitive but also hazardous. The introduction of robots in such environments has made our supply chains safer and more efficient. However, warehouse robots continue to face the ongoing challenge of recognizing diverse products that vary in size, shape, and weight. By creating synthetic images of new product variants, the cost and turnaround time to gather labeled training data can be reduced significantly. Research from OpenAI demonstrates the efficacy of synthetic data for training robots on complex manipulations and performing them in the real world.

Cashierless checkout in retail

Another interesting example is cashierless checkout in retail stores. Through the use of overhead cameras and weight sensors on shelves, retail chains can track people and products in a privacy-compliant way. By using realistic images of product SKUs and simulating complex in-store shopper behavior, such as picking items from a shelf or removing or swapping an item in a cart, a computer vision model can more accurately detect nuanced real-world scenarios and create a positive experience for the shopper and the retailer. There is concrete evidence that computer vision models trained purely on synthetic data perform well on grocery products.

Ceiling-mounted cameras make cashierless checkout possible in Amazon Go stores (Image credit: SounderBruce)

Visual quality inspection in manufacturing

Synthetic data can also benefit other valuable areas, such as quality inspection on a product assembly line. Manual inspection is a tedious and error-prone exercise to which modern computer-vision-based automation has brought tangible efficiency gains. However, achieving a high level of precision during the automated inspection process requires significant amounts of data to represent a wide variety of scenarios. Imagine the number of variables a model has to learn for an assembly line that is packing bottles of hot sauce: possible distortions in labels, missing information, the level of sauce in the bottle, and any discoloration of the sauce, among others. These anomalies occur infrequently in the real world. By simulating a large number of such anomalies and generating a dataset large enough to encompass plausible real-world situations, an ML model can be made more robust.
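The hot sauce example can be sketched as a sampler in which the anomaly rate is a parameter rather than an accident of what the production line happens to produce. Everything below is illustrative: the anomaly names, the value ranges, and the 50% default `anomaly_rate` are assumptions, and a real pipeline would pass each sampled description to a renderer to produce the corresponding image.

```python
import random

# Hypothetical defect types taken from the hot sauce example in the text.
ANOMALIES = ("label_distorted", "label_text_missing", "low_fill", "discolored")


def sample_bottle(rng: random.Random, anomaly_rate: float = 0.5) -> dict:
    """Describe one simulated bottle, injecting a defect at a chosen rate."""
    bottle = {
        "fill_level": rng.uniform(0.95, 1.0),       # fraction of nominal fill
        "label_skew_deg": rng.uniform(-1.0, 1.0),   # small normal variation
        "label_text_missing": False,
        "discolored": False,
        "anomaly": None,                            # ground-truth class label
    }
    if rng.random() < anomaly_rate:
        kind = rng.choice(ANOMALIES)
        bottle["anomaly"] = kind
        if kind == "label_distorted":
            bottle["label_skew_deg"] = rng.choice((-1, 1)) * rng.uniform(5.0, 30.0)
        elif kind == "label_text_missing":
            bottle["label_text_missing"] = True
        elif kind == "low_fill":
            bottle["fill_level"] = rng.uniform(0.3, 0.9)
        else:  # discolored
            bottle["discolored"] = True
    return bottle


rng = random.Random(42)
bottles = [sample_bottle(rng) for _ in range(2000)]
defect_fraction = sum(b["anomaly"] is not None for b in bottles) / len(bottles)
```

On a real line, defects might appear in a tiny fraction of bottles. Here roughly half of the simulated bottles are defective, so a classifier sees each defect type often enough to learn it, and the `anomaly` field doubles as the ground-truth label.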

Image credit: OAL

In the next installment of this blog series, we will share more information on how you can use tools from Unity to generate large-scale synthetic datasets and train a machine learning model for tasks such as object detection.

Interested in learning more?

Start generating your own synthetic data with our new service, Unity Simulation. You can also watch this GTC 2020 session to learn more about how to use synthetic data to efficiently train Perception.

We would love to hear from you – leave a comment below, or contact us at perception@unity3d.com for questions/feedback.

* All trademarks are the property of their respective owners

Header image created in collaboration with CVC Barcelona and the City of Bellevue
