How to Restore Old Photos with AI: AI-Based Photo Restoration

Hi everybody! I’m a research engineer on the computer vision team at Mail.ru Group. In this article, I’m going to tell the story of how we created an AI-based photo restoration project for old military photos. What is «photo restoration»? It consists of three steps:

  • we find all the image defects: fractures, scuffs, holes;

  • we inpaint the discovered defects, based on the pixel values around them;

  • we colorize the image.

Below, I’ll describe every step of photo restoration and tell you how we got our data, what networks we trained, what we accomplished, and what mistakes we made.

Looking for defects

We want to find all the pixels related to defects in an uploaded photo. First, we needed to figure out what kind of pictures people would upload. We talked to the founders of the «Immortal Regiment» project, a non-commercial organization that preserves legacy WW2 photos, and they shared their data with us. Upon analyzing it, we noticed that people mostly upload individual or group portraits with a moderate to large number of defects.

Then we had to collect a training set. The training set for a segmentation task consists of images paired with masks in which all the defects are marked. The easiest way to get them is to have assessors create the segmentation masks by hand. Of course, people know very well how to find defects, but manual labeling takes too long.

Marking the defect pixels in a single photo can take anywhere from an hour to a whole workday, so it’s hard to collect a training set of more than 100 images in a few weeks. That’s why we augmented our data with synthetic defects: we’d take a good photo, draw defects on it using random walks over the image, and end up with a mask marking the damaged parts. Without augmentations, we had 68 manually labeled photos in the training set and 11 in the validation set.
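The article doesn’t spell out the exact random-walk procedure, so the following is only a minimal NumPy sketch of this kind of synthetic-defect augmentation; the number of walks, the walk length, and the white fill value are made-up parameters for illustration.

```python
import numpy as np

def add_random_walk_defects(image, n_walks=8, walk_len=300, seed=None):
    """Scratch a clean photo with random-walk "cracks" and return the
    damaged image together with the binary defect mask.

    Sketch only: the real augmentation parameters used in the project
    (walk count, length, thickness, fill value) are not given in the text.
    """
    rng = np.random.default_rng(seed)
    h, w = image.shape[:2]
    damaged = image.copy()
    mask = np.zeros((h, w), dtype=np.uint8)

    for _ in range(n_walks):
        # Start each walk at a random pixel.
        y, x = int(rng.integers(0, h)), int(rng.integers(0, w))
        for _ in range(walk_len):
            # Take one random step in any of the 8 neighbouring directions.
            y = int(np.clip(y + rng.integers(-1, 2), 0, h - 1))
            x = int(np.clip(x + rng.integers(-1, 2), 0, w - 1))
            mask[y, x] = 1
            damaged[y, x] = 255  # paint the "scratch" white

    return damaged, mask
```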

The most popular segmentation approach is to take a Unet with a pre-trained encoder and minimize the sum of the BCE (binary cross-entropy) and DICE (Sørensen–Dice coefficient) losses.
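The article doesn’t include code, but as a rough sketch the combined objective can be written in PyTorch as below; the soft-Dice formulation and the smoothing constant `eps` are my assumptions. The pre-trained-encoder Unet itself can come from a library such as segmentation_models_pytorch (e.g. `smp.Unet("resnet34", encoder_weights="imagenet", classes=1)`), although the article doesn’t say which encoder was used.

```python
import torch
import torch.nn.functional as F

def bce_dice_loss(logits, target, eps=1.0):
    """Sum of binary cross-entropy and soft Dice loss (sketch).

    `logits` are raw, unnormalized model outputs and `target` is the
    float binary defect mask; both are expected to have shape (N, 1, H, W).
    """
    bce = F.binary_cross_entropy_with_logits(logits, target)

    probs = torch.sigmoid(logits)
    intersection = (probs * target).sum(dim=(1, 2, 3))
    union = probs.sum(dim=(1, 2, 3)) + target.sum(dim=(1, 2, 3))
    dice = (2.0 * intersection + eps) / (union + eps)

    # Dice is a similarity in [0, 1], so we minimize 1 - dice.
    return bce + (1.0 - dice).mean()
```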

What problems arise when we use this segmentation approach for our task?

  • Even if a photo looks very old and shabby, with tons of defects, the area covered by defects is still much smaller than the undamaged area. To address this, we can increase the positive-class weight in the BCE loss; an optimal weight would be the ratio of clean pixels to defective ones (see the sketch below).
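A minimal sketch of that re-weighting in PyTorch, assuming the clean-to-defective ratio is estimated per batch (a fixed ratio computed once over the whole training set would work just as well):

```python
import torch
import torch.nn.functional as F

def weighted_bce_loss(logits, target):
    """BCE with the positive (defect) class up-weighted by the ratio of
    clean to defective pixels, as described above (sketch, per-batch
    estimate). `target` is the float binary defect mask."""
    n_defect = target.sum()
    n_clean = target.numel() - n_defect
    # Guard against batches with no defect pixels at all.
    pos_weight = n_clean / n_defect.clamp(min=1.0)
    return F.binary_cross_entropy_with_logits(
        logits, target, pos_weight=pos_weight
    )
```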
