Fashion-Gen: The Generative Fashion Dataset and Challenge: paper walkthrough and dataset introduction


曹家小圆宝


Article link: https://blog.csdn.net/fighting_Kitty/article/details/120005988


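This post walks through the Fashion-Gen paper and introduces its fashion image dataset, which has 48 main categories and 121 subcategories. It covers the category distribution of the training and test sets and statistics on the images and text descriptions. It then covers the challenge tasks (high-resolution image generation with P-GANs and text-to-image synthesis) and the evaluation methods (Inception Score and human evaluation). It also gives a download link for the dataset and some initial analysis code, and shows the dataset structure and the contents of some keys. Further content analysis and visualization are left for follow-up posts.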


Paper walkthrough

Paper: https://arxiv.org/abs/1806.08317

Dataset splits

Total	train	val	test
293,008	260,480	32,528	32,528
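
As a quick sanity check on the table above, the sketch below recomputes the split sizes. The numbers are copied from the table and the variable names are only illustrative. It suggests that the listed total corresponds to train plus val, with the test split counted separately.

```python
# Split sizes copied from the table above.
splits = {"train": 260_480, "val": 32_528, "test": 32_528}
listed_total = 293_008

# Compare the listed total against train + val.
train_val = splits["train"] + splits["val"]
print(train_val == listed_total)

# Share of each split relative to all three splits combined.
all_images = sum(splits.values())
shares = {name: round(n / all_images, 3) for name, n in splits.items()}
print(shares)
```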

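Categories

The dataset has 48 main categories and 121 subcategories.
The figure below shows the category proportions in the training and test sets.
[Figure: category proportions in the training and test sets; image not included]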

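Image statistics

The figure below shows the number of images per main category and subcategory in the training set.
[Figure: image counts per main category and subcategory in the training set; image not included]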

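Text descriptions

The figure below shows the distribution of text description lengths.
[Figure: distribution of text description lengths; image not included]
Below is the distribution of colors extracted from the text.
[Figure: distribution of colors extracted from the text; image not included]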

Challenge

Generating high-resolution images using P-GANs
Text-to-Image synthesis

Evaluation methods

Inception Score
Human Evaluation (because Inception Score does not account for the relevance between the text and the image)
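
For reference, the Inception Score is commonly defined as exp(E_x[KL(p(y|x) || p(y))]), where p(y|x) is a classifier's class distribution for a generated image and p(y) is the average of those distributions over all images. The sketch below is a minimal pure-Python version of that formula. It assumes the per-image probabilities have already been computed (in practice by a pretrained Inception network); the function name and the `eps` parameter are illustrative, and this is not the challenge's official evaluation code.

```python
import math


def inception_score(probs, eps=1e-12):
    """Compute exp(mean KL(p(y|x) || p(y))) from per-image class probabilities.

    probs: list of lists, one probability vector p(y|x) per image.
    eps: small constant (an assumption of this sketch) to avoid log(0).
    """
    n = len(probs)
    num_classes = len(probs[0])
    # Marginal class distribution p(y), averaged over all images.
    marginal = [sum(p[k] for p in probs) / n for k in range(num_classes)]
    # Mean KL divergence between each p(y|x) and the marginal p(y).
    mean_kl = 0.0
    for p in probs:
        mean_kl += sum(
            pk * (math.log(pk + eps) - math.log(marginal[k] + eps))
            for k, pk in enumerate(p)
            if pk > 0
        )
    mean_kl /= n
    return math.exp(mean_kl)
```

The score only looks at the predicted class distributions of the images, which is why it says nothing about whether an image matches its text description.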

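Dataset download

I came across the FashionGEN dataset in the FashionBERT paper and wanted to learn more about it, but after filling in the form on the official site (https://fashion-gen.com/) I never heard back. Searching further turned up this repository: https://github.com/menardai/FashionGenAttnGAN

It provides three files (note: the test set is not included; the paper says the test set will not be released and is instead integrated into the paper's Docker setup):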

fashiongen_256_256_train.h5
fashiongen_256_256_validation.h5
fashiongen_consume_data_example.pdf

Analysis code

Following https://docs.h5py.org/en/stable/quick.html, the dataset can be analyzed with the code below:

import h5py
import numpy as np

BATCH_SIZE = 32


def get_batch(file_h5, features, batch_number, batch_size=32):
    """Get a batch of the dataset.

    Args:
        file_h5 (h5py.File): the opened dataset file
        features (list(str)): names of the features present in the dataset
            that should be returned
        batch_number (int): the id of the batch to be returned
        batch_size (int): the mini-batch size
    Returns:
        A list of numpy arrays of the requested features
    """
    lb, ub = batch_number * batch_size, (batch_number + 1) * batch_size
    return [file_h5[feature][lb:ub] for feature in features]


# open the file (switch the commented line to read the training set instead)
# file_h5 = h5py.File('fashiongen_256_256_train.h5', mode='r')
file_h5 = h5py.File('fashiongen_256_256_validation.h5', mode='r')
# define the features to be retrieved
list_of_features = ['input_image', 'input_description']
# number of samples and number of full batches in the file
dataset_len = len(file_h5['input_image'])
nb_batches = dataset_len // BATCH_SIZE
# pick a random batch id
batch_nb = np.random.randint(0, nb_batches)
# get that batch of the data
list_of_arrays = get_batch(file_h5, list_of_features, batch_nb, BATCH_SIZE)
# close the file
file_h5.close()
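
Note that the script above counts only full batches, so any samples left over after the last full batch are never selected. The following is a minimal sketch of one way to cover every sample, including a shorter final batch. `batch_bounds` is a hypothetical helper, not part of the example shipped with the dataset, and the length 32528 is the validation-set size reported in this post.

```python
def batch_bounds(dataset_len, batch_size):
    """Yield (lower, upper) index pairs covering every sample exactly once."""
    for lb in range(0, dataset_len, batch_size):
        # The last batch may be shorter than batch_size.
        yield lb, min(lb + batch_size, dataset_len)


# Validation-set length as reported in this post, with the batch size used above.
bounds = list(batch_bounds(32528, 32))
print(len(bounds), bounds[-1])
```

Each pair can then be used as a slice on an open file, for example `file_h5['input_image'][lb:ub]`.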

This gives 260490 samples for the training set (the split table above lists 260,480) and 32528 for the validation set.
The dataset is a dict-like structure whose keys are:
['index', 'index_2', 'input_brand', 'input_category', 'input_composition', 'input_concat_description', 'input_department', 'input_description', 'input_gender', 'input_image', 'input_msrpUSD', 'input_name', 'input_pose', 'input_productID', 'input_season', 'input_subcategory']

Each image has shape:

(256, 256, 3)
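
To see the shape and dtype of every key at once, something like the sketch below may help. `summarize` is a hypothetical helper that works on any mapping whose values expose `.shape` and `.dtype`, which is the case for an open `h5py.File` as long as all top-level entries are datasets (an assumption based on the key list above).

```python
def summarize(group):
    """Return {key: (shape, dtype)} for a mapping of array-like datasets."""
    return {key: (tuple(ds.shape), str(ds.dtype)) for key, ds in group.items()}


# Usage with the real file (requires h5py and the downloaded data):
# import h5py
# with h5py.File('fashiongen_256_256_validation.h5', mode='r') as f:
#     for key, (shape, dtype) in summarize(f).items():
#         print(key, shape, dtype)
```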

Content analysis

Taking the validation set as an example, the following goes through the keys one by one.

index

file_h5['index'].shape
# (32528, 1)
file_h5['index'][:]
# Output:
[[    24]
 [    25]
 [    26]
 ...
 [342153]
 [342154]
 [342155]]

index_2

file_h5['index_2'].shape
# (32528,)
file_h5['index_2'][:]
# Output:
[    0     1     2 ... 32525 32526 32527]
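
Judging from the printed values, `index_2` looks like a plain 0-based position within the file, while `index` holds larger values in an (N, 1) column; the post does not say what `index` encodes. A minimal sketch for checking the first observation is shown below. Both helpers are hypothetical and operate on ordinary Python sequences or NumPy arrays.

```python
def flatten_column(rows):
    """Turn an (N, 1)-style nested sequence, or a flat one, into a list of ints."""
    return [int(r[0]) if hasattr(r, "__len__") else int(r) for r in rows]


def is_arange(values):
    """True if values are exactly 0, 1, ..., len(values) - 1 in order."""
    return all(v == i for i, v in enumerate(values))


# Usage with the open file from the script above:
# is_arange(flatten_column(file_h5['index_2'][:]))
# is_arange(flatten_column(file_h5['index'][:]))
```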

input_brand

file_h5[
