DeepStyle (Part 1): Generating Realistic High Fashion Clothing and Style Using State-of-the-Art Deep Learning


This is a project I did for the Deep Learning in Computer Vision course at my home university, the Hong Kong University of Science and Technology, during Spring 2019. Back then, I was slowly getting to know the world of fashion and style: how to dress better and keep up with modern trends. I started watching the catwalk shows of luxury high-fashion brands that even a random person on the street would know: Dior, Gucci, Louis Vuitton, Chanel, Hermes, Giorgio Armani, Cartier, Burberry, and many more. The more I watched, the more immersed in the fashion world I became. When the time came to pick a final project topic for my Computer Vision course, I figured, why not create a deep learning system that can generate good-looking and creative high-fashion clothing? That would be interesting to visualize, right?

And so my teammate and I created DeepStyle.


What is DeepStyle?

In short, DeepStyle is a custom deep learning framework that can generate high fashion clothing items. It can serve as inspiration for fashion designers and also help predict the next trendy items in the fashion industry. DeepStyle takes in trendy fashion images and creates new items from them as a way of predicting future trends.

Our research consists of two parts: building the high luxury fashion database and using AI to generate similar fashion items. For the first part, we need a reliable source from which we can gather high luxury fashion images from runways. Beyond that, we also want a model that can identify the clothing and crop out the rest of the image, because ultimately it would be weird if we were generating fake models and audience members in the background 😂. Once the images are cropped to contain only the clothing item itself, we can feed them into another model that generates new clothing items from scratch. Cropping is essential for removing as much noise as possible.
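The cropping step described above can be sketched in a few lines. This is only an illustration, not the project's code: it assumes the image is a plain list of pixel rows and that the detector returns a box as (x1, y1, x2, y2) in pixel coordinates with exclusive right and bottom edges. The function name `crop_to_box` is hypothetical.

```python
def crop_to_box(image, box):
    """Return the part of `image` that lies inside `box`.

    `image` is a list of rows (each row a list of pixels) and `box` is
    (x1, y1, x2, y2) with the right and bottom edges exclusive.
    Both conventions are assumptions made for this sketch.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    x1, y1, x2, y2 = box
    # Clamp the box to the image so a slightly-off prediction cannot
    # produce negative indices or run past the border.
    x1, x2 = max(0, x1), min(width, x2)
    y1, y2 = max(0, y1), min(height, y2)
    if x1 >= x2 or y1 >= y2:
        raise ValueError("box does not overlap the image")
    return [row[x1:x2] for row in image[y1:y2]]


# A 4x5 dummy "image" whose pixels record their own (row, col) position.
image = [[(r, c) for c in range(5)] for r in range(4)]
crop = crop_to_box(image, (1, 1, 4, 3))
print(len(crop), len(crop[0]))  # number of rows and columns in the crop
```

In a real pipeline the same slicing would be applied to an image array loaded with an imaging library, with the box taken from the detector's highest-scoring prediction.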

The Framework

After a brief analysis of the thing we are trying to build, here is a rough framework.


The first part of DeepStyle is a Faster R-CNN, a near-real-time object detection model whose Region Proposal Network helped it reach state-of-the-art accuracy when it was introduced. You can read the official paper here for more details. We will train our Faster R-CNN on the DeepFashion Database, which was released by the Chinese University of Hong Kong.

A quick intro to the DeepFashion Database: at the time, it was the largest fashion dataset available, consisting of around 800k diverse fashion images with various backgrounds, angles, lighting conditions, etc. The dataset contains four benchmarks used for different purposes, and the one we use for our project is the Category and Attribute Prediction benchmark. This benchmark has 289,222 clothing images, and each image is annotated with bounding box coordinates and the corresponding clothing category.

After training the Faster R-CNN on the DeepFashion database, the network will be able to predict where the clothing piece is in any test image. This is where the Pinterest database comes in. We can build a scraper to collect high fashion runway images of several large luxury brands from Pinterest and use those as test images for our Faster R-CNN. We chose Pinterest because it provides a lot of clean, high-quality images and is easy to scrape.

After inference, we have a predicted bounding box for each Pinterest image, and the rest of the image can be cropped out since we only need the specific item. We then finally pass the crops to our Fashion GAN, which will be implemented as a DCGAN, or Deep Convolutional Generative Adversarial Network. Another quick tutorial, this time on GANs: a Generative Adversarial Network basically contains two main components, the generator and the discriminator. The generator works hard to create images that look real, while the discriminator tries to distinguish real images from fake ones. Over the course of training, the generator becomes better at generating realistic images while the discriminator becomes better at figuring out what is real and what is fake. The final equilibrium is reached when the discriminator can no longer tell whether the images produced by the generator are real or fake.
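The tug-of-war between the two networks is usually written as the minimax objective from the original GAN formulation. Here $D(x)$ is the discriminator's estimated probability that $x$ is a real image, $G(z)$ is the image the generator produces from a noise vector $z$, and $p_{\text{data}}$ and $p_z$ are the distributions of real images and of noise:

```latex
\min_G \max_D V(D, G)
  = \mathbb{E}_{x \sim p_{\text{data}}}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_z}\left[\log\left(1 - D(G(z))\right)\right]
```

At the equilibrium described above, the generator's distribution matches the data distribution and the best the discriminator can do is output $D(x) = 1/2$ everywhere. A DCGAN keeps this objective and implements $G$ and $D$ as convolutional networks.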

The final result is the set of images produced by the DCGAN. And hopefully, they will look high fashion!


Implementation

Step 1: Install Detectron and the DeepFashion Dataset

To implement the Faster R-CNN, we are going to use the Detectron library provided by Facebook AI. The Detectron library contains code for state-of-the-art object detection algorithms such as Faster R-CNN, Mask R-CNN, RetinaNet, etc. The official library can be installed via the following steps:

Install Caffe2, depending on your CUDA version:

1. For Caffe2 with CUDA 9 and cuDNN 7 support:

conda install pytorch-nightly -c pytorch

2. For Caffe2 with CUDA 8 and cuDNN 7 support:

conda install pytorch-nightly cuda80 -c pytorch

After installing Caffe2, now go ahead and install the COCO API.


# COCOAPI=/path/to/clone/cocoapi
git clone https://github.com/cocodataset/cocoapi.git $COCOAPI
cd $COCOAPI/PythonAPI
# Install into global site-packages
make install
# Alternatively, if you do not have permissions or prefer
# not to install the COCO API into global site-packages
python setup.py install --user

Now, you can download the official repo and install Detectron.


git clone https://github.com/facebookresearch/Detectron.git

You can install Python dependencies by:


cd Detectron
pip install -r requirements.txt

Set up the Python modules by:

make

You can verify that Detectron has been successfully installed by:


python detectron/tests/test_spatial_narrow_as_op.py

For more installation details, you can refer to the official instructions here.

Note: Facebook AI recently released Detectron2, the updated version of Detectron. I used the original Detectron when I was working on this project, but you can look into Detectron2 if you'd like.

Next, we want to download the DeepFashion Dataset. You can read more about the dataset specifics on its main page, and you can download the data from the maintainers' Google Drive. The one we want is the Category and Attribute Prediction Benchmark, which can be downloaded from here.

Step 2: Convert the DeepFashion Dataset to COCO Format

To train a model on a custom dataset with Detectron, we first have to convert the dataset into COCO format. COCO is a large-scale object detection, segmentation, and captioning dataset, but its annotation layout has also become a standard format for object detection datasets. The DeepFashion Dataset can be converted into COCO format with a short script.

COCO basically saves the outline of a dataset in a .json file that includes basic information about the object detection dataset. The most notable entries are coco_dict['images'] and coco_dict['annotations'], which hold information about the images and their corresponding annotations, respectively.
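To make the structure concrete, here is a minimal sketch of such a conversion. It is not the original conversion script: the record layout, the function name `deepfashion_to_coco`, the example file name, and the two category names are all assumptions for illustration. The one format detail that matters is that DeepFashion boxes come as corner coordinates (x1, y1, x2, y2), while COCO expects [x, y, width, height].

```python
import json


def deepfashion_to_coco(records, category_names):
    """Build a COCO-style dict from DeepFashion-like records.

    Each record is (file_name, width, height, category_id, x1, y1, x2, y2).
    This record layout is an assumption made for the sketch.
    """
    coco_dict = {
        "images": [],
        "annotations": [],
        "categories": [
            {"id": cat_id, "name": name}
            for cat_id, name in enumerate(category_names, start=1)
        ],
    }
    for image_id, rec in enumerate(records, start=1):
        file_name, width, height, category_id, x1, y1, x2, y2 = rec
        box_w, box_h = x2 - x1, y2 - y1
        coco_dict["images"].append(
            {"id": image_id, "file_name": file_name,
             "width": width, "height": height}
        )
        coco_dict["annotations"].append({
            "id": image_id,  # assumes one box per image, so ids can coincide
            "image_id": image_id,
            "category_id": category_id,
            "bbox": [x1, y1, box_w, box_h],  # COCO uses [x, y, width, height]
            "area": box_w * box_h,
            "iscrowd": 0,
        })
    return coco_dict


# One made-up record; the file name is a placeholder, not a real dataset path.
records = [("img/example_dress/img_00000001.jpg", 300, 300, 2, 72, 79, 232, 273)]
coco_dict = deepfashion_to_coco(records, ["Blouse", "Dress"])
print(json.dumps(coco_dict["annotations"][0]))
```

Writing the result to disk with `json.dump(coco_dict, f)` would give the kind of annotation file a COCO-style data loader reads.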

Step 3: Train the Faster R-CNN Model on DeepFashion

After successfully converting the dataset into COCO format, we can finally train our Faster R-CNN model! Before that, we first need to choose the specific variant of Faster R-CNN we want to use. Detectron offers plenty of end-to-end Faster R-CNN variants to choose from.

Back then, I trained three different variants: R-50-FPN_1x, R-101-FPN_1x, and X-101-32x8d-FPN_1x. However, for illustration purposes, I will just show you how to train R-50-FPN, as the steps are the same for the others. You can find the list of models Detectron supports here.

Having settled on Faster R-CNN R-50-FPN_1x, we can head over to configs/12_2017_baselines/ to view the existing model configurations. The one we want is named e2e_faster_rcnn_R-50-FPN_1x.yaml. There we can inspect and modify the model and training configuration as we wish.

The most important thing is to change the training and testing datasets to our DeepFashion dataset, which is already in COCO format. We can also change the solver parameters to the ones we want.
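One thing to keep in mind when touching the solver parameters: the stock 1x baseline configs are written for 8 GPUs (assumed here to be BASE_LR 0.02, MAX_ITER 90000, and STEPS [0, 60000, 80000]). A common way to adapt such a schedule to a different number of GPUs is the linear scaling rule: scale the learning rate in proportion to the GPU count and stretch the iteration counts by the inverse factor. The sketch below only does that arithmetic. `scale_schedule` is a hypothetical helper, not part of Detectron, and the resulting values are not the settings used in this project.

```python
def scale_schedule(num_gpus, ref_gpus=8, ref_lr=0.02,
                   ref_max_iter=90000, ref_steps=(0, 60000, 80000)):
    """Rescale a reference solver schedule to `num_gpus` GPUs.

    The reference values are assumed defaults for an 8-GPU 1x schedule;
    replace them with whatever your own config file contains.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    factor = ref_gpus / num_gpus  # how much smaller the total batch becomes
    return {
        "NUM_GPUS": num_gpus,
        "SOLVER.BASE_LR": ref_lr / factor,
        "SOLVER.MAX_ITER": int(round(ref_max_iter * factor)),
        "SOLVER.STEPS": [int(round(step * factor)) for step in ref_steps],
    }


# Print the overrides one would consider for a single-GPU run.
for key, value in scale_schedule(num_gpus=1).items():
    print(key, value)
```

The dataset entries (the TRAIN and TEST dataset names) still have to be pointed at the converted DeepFashion annotations separately; this helper covers only the solver side.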

With the model and its configuration selected, we are finally ready to start training! Depending on how many GPUs you have, you can start training by executing one of the following:

1. Single-GPU training

python tools/train_net.py \
    --cfg configs/12_2017_baselines/e2e_faster_rcnn_R-50-FPN_1x.yaml \
    OUTPUT_DIR /tmp/detectron-output

2. Multi-GPU support

python tools/train_net.py \
    --multi-gpu-testing \
    --cfg configs/12_2017_baselines/e2e_faster_rcnn_R-50-FPN_1x.yaml \
    OUTPUT_DIR /tmp/detectron-output

After running either of the commands above, the model specified by --cfg will start training. The output, which includes the model parameters, validation set detections, etc., will be saved under /tmp/detectron-output. For more training details, you can refer to the documentation here.

Step 4: Build the Pinterest Database of High Fashion Clothing

Note: Because of copyright issues, the images above are not actually items in the Pinterest database but just for reference of what the items generally look like.


Before we test how good our Faster R-CNN is at locating clothing items, we first need our own test database. We can build a scraper that collects fashion runway images from Pinterest. Back then, we searched for the Fall and Spring fashion shows of 2017, 2018, and 2019. The brands we scraped were Burberry, Chanel, Chloe, Dior, Givenchy, Gucci, Hermes, Jimmy Choo, Louis Vuitton, Michael Kors, Prada, Versace, and Yves Saint Laurent. We ultimately scraped a total of 10,095 fashion images spread across all those brands.

As much as I'd like to share the scraper code for Pinterest, I don't have it anymore :( since I no longer have access to the virtual machine we used for this project. However, the scraper can be built easily with BeautifulSoup, as the Pinterest images are all static and not dynamically generated using JavaScript.
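Since the original scraper is gone, here is a rough sketch of just the parsing half of such a tool. To keep it self-contained it uses Python's built-in `html.parser` rather than BeautifulSoup, makes no network requests, and runs on a made-up HTML snippet with placeholder example.com URLs. The real Pinterest markup is not known here and will almost certainly look different, so treat `ImageSrcCollector` and `extract_image_urls` as hypothetical names for illustration only.

```python
from html.parser import HTMLParser


class ImageSrcCollector(HTMLParser):
    """Collect the `src` attribute of every <img> tag, without duplicates."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src")
        # Skip images without a source and ones we have already seen.
        if src and src not in self.urls:
            self.urls.append(src)


def extract_image_urls(html):
    """Return the image URLs found in an HTML string, in page order."""
    collector = ImageSrcCollector()
    collector.feed(html)
    return collector.urls


# Made-up markup standing in for a downloaded search-results page.
sample_html = """
<div class="results">
  <img src="https://example.com/runway/look-01.jpg" alt="look 1">
  <img src="https://example.com/runway/look-02.jpg" alt="look 2">
  <img src="https://example.com/runway/look-01.jpg" alt="duplicate">
  <img alt="no source">
</div>
"""
print(extract_image_urls(sample_html))
```

A full scraper would wrap this in a loop that downloads each results page and saves every collected URL to disk, while respecting the site's terms of use.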

Congrats on making it this far! The next step is to run our Faster R-CNN on the Pinterest database. After that, we can start building a DCGAN that takes the clothing images and generates something nice and similar to them. Because this article is already pretty long, we will leave that for Part 2.