Commonly Used Public Datasets for Deep Learning

高启强668

Last modified: 2022-01-27 23:45:44

Category: Deep Learning. Tags: deep learning, artificial intelligence

First published: 2022-01-26 21:37:06

Article link: https://blog.csdn.net/zhognsc08/article/details/122708751

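I. Inpainting

1. Places2 dataset. Places365-Standard provides about 1.8 million training images across 365 scene categories; 50 images per category are used for validation and 900 for testing. The images come in a high-resolution version and a low-resolution 256x256 version.

Places2: A Large-Scale Database for Scene Understanding: http://places2.csail.mit.edu/download.html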

2. CelebA dataset

CelebFaces Attributes Dataset (CelebA) is a large-scale face attributes dataset with more than 200K celebrity images, each with 40 attribute annotations. The images in this dataset cover large pose variations and background clutter. CelebA has large diversities, large quantities, and rich annotations, including

10,177 number of identities,
202,599 number of face images, and
5 landmark locations, 40 binary attributes annotations per image.

The dataset can be employed as the training and test sets for the following computer vision tasks: face attribute recognition, face recognition, face detection, landmark (or facial part) localization, and face editing & synthesis.

http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html
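
The 40 binary attributes can be read with a few lines of standard-library Python. The sketch below assumes the layout of the attribute file that ships with CelebA (commonly named list_attr_celeba.txt): a first line with the image count, a second line with the attribute names, then one row per image with values 1 or -1. The sample text is made up and shortened to three attribute columns, so check the layout against your own download.

```python
# Sketch: parse CelebA-style attribute annotations. The file layout is an
# assumption; verify it against the file in your own download.

SAMPLE = """\
2
Eyeglasses Male Smiling
000001.jpg -1 -1  1
000002.jpg  1  1 -1
"""


def parse_attributes(text):
    """Return (attribute_names, {filename: {attribute: bool}})."""
    lines = [line for line in text.splitlines() if line.strip()]
    count = int(lines[0])            # line 1: number of images
    names = lines[1].split()         # line 2: attribute names
    table = {}
    for row in lines[2:2 + count]:
        filename, *values = row.split()
        if len(values) != len(names):
            raise ValueError(f"{filename}: expected {len(names)} values")
        # 1 means the attribute is present, -1 means it is absent.
        table[filename] = {n: v == "1" for n, v in zip(names, values)}
    return names, table


names, table = parse_attributes(SAMPLE)
smiling = [f for f, attrs in table.items() if attrs["Smiling"]]
print(smiling)
```

Turning the 1/-1 values into booleans keeps later filtering readable, for example when selecting a subset of faces for an inpainting experiment.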

II. Super-resolution (SR)

1. DIV2K dataset

We are making available a large newly collected dataset -DIV2K- of RGB images with a large diversity of contents.

The DIV2K dataset is divided into:

train data: starting from 800 high definition high resolution images we obtain corresponding low resolution images and provide both high and low resolution images for 2, 3, and 4 downscaling factors
validation data: 100 high definition high resolution images are used for generating low resolution corresponding images, the low res are provided from the beginning of the challenge and are meant for the participants to get online feedback from the validation server; the high resolution images will be released when the final phase of the challenge starts.
test data: 100 diverse images are used to generate low resolution corresponding images; the participants will receive the low resolution images when the final evaluation phase starts and the results will be announced after the challenge is over and the winners are decided.

DIV2K Dataset: https://data.vision.ee.ethz.ch/cvl/DIV2K/
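
For the 2, 3, and 4 downscaling factors, each low-resolution image has 1/2, 1/3, or 1/4 of the side length of its high-resolution source. The sketch below only does the size arithmetic. Cropping the high-resolution image so that both sides are divisible by the factor is a common convention in super-resolution code and is an assumption here, not something the DIV2K description states; the image size is hypothetical.

```python
# Sketch: sizes of HR/LR pairs for the 2, 3 and 4 downscaling factors.
# Cropping the HR image to a multiple of the factor is an assumed convention.


def paired_sizes(hr_height, hr_width, scale):
    """Return ((cropped HR h, w), (LR h, w)) for an integer scale factor."""
    if scale < 1:
        raise ValueError("scale must be a positive integer")
    lr_height, lr_width = hr_height // scale, hr_width // scale
    return (lr_height * scale, lr_width * scale), (lr_height, lr_width)


# Hypothetical HR image of 1357 x 2041 pixels (height x width).
for scale in (2, 3, 4):
    hr, lr = paired_sizes(1357, 2041, scale)
    print(f"x{scale}: HR {hr} -> LR {lr}")
```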

2. Flickr2K

https://cv.snu.ac.kr/research/EDSR/Flickr2k.tar

3. Manga109

This data set (hereafter referred to as Manga109) has been compiled by the Aizawa Yamasaki Matsui Laboratory, Department of Information and Communication Engineering, the Graduate School of Information Science and Technology, the University of Tokyo. The compilation is intended for use in academic research on the media processing of Japanese manga. Manga109 is composed of 109 manga volumes drawn by professional manga artists in Japan. These manga were commercially made available to the public between the 1970s and 2010s, and encompass a wide range of target readerships and genres (see the table in Explore for further details.) Most of the manga in the compilation are available at the manga library “Manga Library Z” (formerly the “Zeppan Manga Toshokan” library of out-of-print manga).

Manga109: http://www.manga109.org/en/

III. Classification, recognition, detection, segmentation, and related tasks

1. MNIST

The MNIST database of handwritten digits, available from this page, has a training set of 60,000 examples, and a test set of 10,000 examples. It is a subset of a larger set available from NIST. The digits have been size-normalized and centered in a fixed-size image.

It is a good database for people who want to try learning techniques and pattern recognition methods on real-world data while spending minimal efforts on preprocessing and formatting.

MNIST handwritten digit database, Yann LeCun, Corinna Cortes and Chris Burges: http://yann.lecun.com/exdb/mnist/
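
The files on that page use the IDX format, which is simple enough to read by hand: two zero bytes, a type code, the number of dimensions, then one big-endian 32-bit size per dimension, followed by the raw data. The sketch below handles only the unsigned-byte case used by MNIST and runs on a tiny synthetic buffer rather than a real download. The function name is my own, and real files are gzip-compressed, so they would need to be decompressed first (for example with gzip.open).

```python
import struct


def parse_idx_ubyte(data):
    """Parse an IDX buffer of unsigned bytes into (dims, payload)."""
    zero, type_code, ndim = struct.unpack(">HBB", data[:4])
    if zero != 0 or type_code != 0x08:
        raise ValueError("not an unsigned-byte IDX buffer")
    dims = struct.unpack(f">{ndim}I", data[4:4 + 4 * ndim])
    count = 1
    for d in dims:
        count *= d
    offset = 4 + 4 * ndim
    payload = data[offset:offset + count]
    if len(payload) != count:
        raise ValueError("buffer is shorter than its header claims")
    return dims, payload


# Synthetic stand-in for an image file: two 2x2 "images"
# instead of 60,000 images of 28x28 pixels.
fake = struct.pack(">HBB3I", 0, 0x08, 3, 2, 2, 2) + bytes(range(8))
dims, pixels = parse_idx_ubyte(fake)
per_image = dims[1] * dims[2]
images = [pixels[i * per_image:(i + 1) * per_image] for i in range(dims[0])]
print(dims, [list(img) for img in images])
```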

2. The CIFAR-10/100 datasets

The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images.
The dataset is divided into five training batches and one test batch, each with 10000 images. The test batch contains exactly 1000 randomly-selected images from each class. The training batches contain the remaining images in random order, but some training batches may contain more images from one class than another. Between them, the training batches contain exactly 5000 images from each class.
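
A minimal sketch of reading one batch, assuming the "python version" of the download: each batch is a pickled dict with the keys b"data" and b"labels", and each image is a row of 3072 values holding the 1024 red values first, then green, then blue, in row-major order. The batch below is synthetic so the example runs without the dataset. In the real files the data is stored as a NumPy array, so NumPy must be installed to unpickle them.

```python
import io
import pickle


def load_batch(fileobj):
    """Unpickle one batch and return (data rows, labels)."""
    batch = pickle.load(fileobj, encoding="bytes")
    return batch[b"data"], batch[b"labels"]


def pixel_rgb(row, y, x):
    """Return (R, G, B) at pixel (y, x) of a 3072-value image row."""
    i = y * 32 + x                      # row-major index inside one channel
    return row[i], row[1024 + i], row[2048 + i]


# Synthetic one-image batch standing in for a real file such as data_batch_1.
row = [0] * 1024 + [1] * 1024 + [2] * 1024
row[1 * 32 + 2] = 200                   # red value at y=1, x=2
fake_file = io.BytesIO(pickle.dumps({b"data": [row], b"labels": [3]}))

data, labels = load_batch(fake_file)
print(pixel_rgb(data[0], 1, 2), labels[0])
```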

The CIFAR-100 dataset

This dataset is just like the CIFAR-10, except it has 100 classes containing 600 images each. There are 500 training images and 100 testing images per class. The 100 classes in the CIFAR-100 are grouped into 20 superclasses. Each image comes with a "fine" label (the class to which it belongs) and a "coarse" label (the superclass to which it belongs).

CIFAR-10 and CIFAR-100 datasets: http://www.cs.toronto.edu/~kriz/cifar.html

3. COCO

COCO is a large-scale object detection, segmentation, and captioning dataset. COCO has several features:

Object segmentation

Recognition in context

Superpixel stuff segmentation

330K images (>200K labeled)

1.5 million object instances

80 object categories

91 stuff categories

5 captions per image

250,000 people with keypoints

COCO - Common Objects in Context: https://cocodataset.org/#home
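
Object instances and categories live in JSON annotation files. The sketch below assumes the usual instance-annotation layout, with top-level "images", "categories", and "annotations" lists and boxes stored as [x, y, width, height]. The sample record is made up and far smaller than a real file; the helper names are my own.

```python
import json
from collections import Counter

# Made-up miniature annotation file in the assumed COCO layout.
SAMPLE = """
{
  "images": [{"id": 1, "file_name": "a.jpg", "width": 640, "height": 480}],
  "categories": [{"id": 1, "name": "person"}, {"id": 18, "name": "dog"}],
  "annotations": [
    {"id": 10, "image_id": 1, "category_id": 1, "bbox": [10, 20, 100, 200]},
    {"id": 11, "image_id": 1, "category_id": 1, "bbox": [300, 40, 80, 160]},
    {"id": 12, "image_id": 1, "category_id": 18, "bbox": [200, 300, 120, 90]}
  ]
}
"""


def instances_per_category(coco):
    """Count annotated object instances per category name."""
    names = {c["id"]: c["name"] for c in coco["categories"]}
    return Counter(names[a["category_id"]] for a in coco["annotations"])


def bbox_to_corners(bbox):
    """Convert [x, y, width, height] to (x_min, y_min, x_max, y_max)."""
    x, y, w, h = bbox
    return x, y, x + w, y + h


coco = json.loads(SAMPLE)
counts = instances_per_category(coco)
print(dict(counts))
print(bbox_to_corners(coco["annotations"][0]["bbox"]))
```

Many detection libraries expect corner coordinates rather than width and height, which is why the conversion helper is included.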

4. ImageNet

The most highly-used subset of ImageNet is the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) 2012-2017 image classification and localization dataset. This dataset spans 1000 object classes and contains 1,281,167 training images, 50,000 validation images and 100,000 test images. This subset is available on Kaggle.

ImageNet: https://image-net.org/download.php

5. Pascal VOC

The Pascal VOC challenge is a very popular dataset for building and evaluating algorithms for image classification, object detection, and segmentation. However, the website goes down like all the time.

Pascal VOC Dataset Mirror: Here is a mirror for the Pascal VOC files in case, you know, you want to download them at a somewhat decent rate. https://pjreddie.com/projects/pascal-voc-dataset-mirror/

6. PASCAL-Context dataset

This dataset is a set of additional annotations for PASCAL VOC 2010. It goes beyond the original PASCAL semantic segmentation task by providing annotations for the whole scene. The statistics section has a full list of 400+ labels.

PASCAL-Context Dataset: https://cs.stanford.edu/~roozbeh/pascal-context/

7. Fashion MNIST

Fashion-MNIST is a dataset of Zalando's article images—consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image, associated with a label from 10 classes. Zalando intends Fashion-MNIST to serve as a direct drop-in replacement for the original MNIST dataset for benchmarking machine learning algorithms. It shares the same image size and structure of training and testing splits.

The original MNIST dataset contains a lot of handwritten digits. Members of the AI/ML/Data Science community love this dataset and use it as a benchmark to validate their algorithms. In fact, MNIST is often the first dataset researchers try. "If it doesn't work on MNIST, it won't work at all", they said. "Well, if it does work on MNIST, it may still fail on others."

Zalando seeks to replace the original MNIST dataset

Fashion MNIST | Kaggle: An MNIST-like dataset of 70,000 28x28 labeled fashion images. https://www.kaggle.com/zalando-research/fashionmnist
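
A minimal sketch of decoding one example, assuming the CSV layout of the Kaggle release: a "label" column followed by 784 pixel columns, one row per 28x28 image. The class names follow the order given in the Fashion-MNIST README. The CSV here is generated in memory so the example runs without the download; the function name is my own.

```python
import csv
import io

CLASS_NAMES = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]


def parse_row(row):
    """Turn one CSV row into (class name, 28x28 nested list of ints)."""
    label = int(row[0])
    flat = [int(v) for v in row[1:]]
    if len(flat) != 28 * 28:
        raise ValueError("expected 784 pixel values")
    image = [flat[r * 28:(r + 1) * 28] for r in range(28)]
    return CLASS_NAMES[label], image


# Synthetic CSV: a header plus one black image with a single bright pixel.
pixels = [0] * 784
pixels[3 * 28 + 5] = 255                # row 3, column 5
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["label"] + [f"pixel{i}" for i in range(1, 785)])
writer.writerow([7] + pixels)
buffer.seek(0)

reader = csv.reader(buffer)
next(reader)                            # skip the header row
name, image = parse_row(next(reader))
print(name, image[3][5])
```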

8. Caltech 101

Pictures of objects belonging to 101 categories. About 40 to 800 images per category. Most categories have about 50 images. Collected in September 2003 by Fei-Fei Li, Marco Andreetto, and Marc'Aurelio Ranzato. The size of each image is roughly 300 x 200 pixels.
We have carefully clicked outlines of each object in these pictures; these are included under 'Annotations.tar'. There is also a MATLAB script to view the annotations, 'show_annotations.m'.

Caltech101: http://www.vision.caltech.edu/Image_Datasets/Caltech101/

9. Helen dataset (facial feature localization)

In our effort of building a facial feature localization algorithm that can operate reliably and accurately under a broad range of appearance variation, including pose, lighting, expression, occlusion, and individual differences, we realize that it is necessary that the training set include high resolution examples so that, at test time, a high resolution test image can be fit accurately. Although a number of face databases exist, we found none that meet our requirements, particularly the resolution requirement.

http://www.ifp.illinois.edu/~vuongle2/helen/

10. LFW (Labeled Faces in the Wild): face recognition

Labeled Faces in the Wild is a public benchmark for face verification, also known as pair matching. No matter what the performance of an algorithm on LFW, it should not be used to conclude that an algorithm is suitable for any commercial purpose. There are many reasons for this. Here is a non-exhaustive list:

Face verification and other forms of face recognition are very different problems. For example, it is very difficult to extrapolate from performance on verification to performance on 1:N recognition.
Many groups are not well represented in LFW. For example, there are very few children, no babies, very few people over the age of 80, and a relatively small proportion of women. In addition, many ethnicities have very minor representation or none at all.
While theoretically LFW could be used to assess performance for certain subgroups, the database was not designed to have enough data for strong statistical conclusions about subgroups. Simply put, LFW is not large enough to provide evidence that a particular piece of software has been thoroughly tested.
Additional conditions, such as poor lighting, extreme pose, strong occlusions, low resolution, and other important factors do not constitute a major part of LFW. These are important areas of evaluation, especially for algorithms designed to recognize images “in the wild”.

For all of these reasons, we would like to emphasize that LFW was published to help the research community make advances in face verification, not to provide a thorough vetting of commercial algorithms before deployment.

http://vis-www.cs.umass.edu/lfw/
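
Face verification, or pair matching, means deciding for each pair of images whether both show the same person. A rough sketch of how accuracy on such pairs could be computed from similarity scores is shown below. The scores, the threshold, and the function name are made up for illustration; this is not the official LFW evaluation protocol.

```python
def verification_accuracy(scores, same_person, threshold):
    """Fraction of pairs classified correctly at a given threshold.

    A pair is predicted "same person" when its score reaches the threshold.
    """
    if not scores or len(scores) != len(same_person):
        raise ValueError("need one label per score and at least one pair")
    correct = sum(
        (score >= threshold) == same
        for score, same in zip(scores, same_person)
    )
    return correct / len(scores)


# Made-up similarity scores for four image pairs and their true labels.
scores = [0.91, 0.35, 0.62, 0.48]
same_person = [True, False, True, True]
accuracy = verification_accuracy(scores, same_person, threshold=0.5)
print(accuracy)
```

A good score here still says nothing about 1:N recognition, which is exactly the caveat the LFW authors raise above.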

11. The Cityscapes Dataset

We present a new large-scale dataset that contains a diverse set of stereo video sequences recorded in street scenes from 50 different cities, with high quality pixel-level annotations of 5,000 frames in addition to a larger set of 20,000 weakly annotated frames. The dataset is thus an order of magnitude larger than similar previous attempts.

Cityscapes Dataset – Semantic Understanding of Urban Street Scenes: https://www.cityscapes-dataset.com

12. ADE20K

The annotated images cover the scene categories from the SUN and Places databases. The annotations include object segmentations and part segmentations.

ADE20K dataset: http://groups.csail.mit.edu/vision/datasets/ADE20K/

13. BSDS500

This new dataset is an extension of the BSDS300, where the original 300 images are used for training / validation and 200 fresh images, together with human annotations, are added for testing. Each image was segmented by five different subjects on average. Performance is evaluated by measuring Precision / Recall on detected boundaries and three additional region-based metrics.

UC Berkeley Computer Vision Group - Contour Detection and Image Segmentation - Resources
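
To make the precision / recall idea concrete, here is a simplified sketch that compares two sets of boundary pixels. It counts a predicted pixel as correct only when the exact same pixel is marked in the ground truth. That is a deliberate simplification: a real boundary benchmark is more lenient about small localization errors, which this sketch does not model. The pixel coordinates and the function name are made up.

```python
def boundary_prf(predicted, ground_truth):
    """Return (precision, recall, F-measure) for two sets of (row, col) pixels."""
    predicted, ground_truth = set(predicted), set(ground_truth)
    true_positives = len(predicted & ground_truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Made-up boundary pixels: 4 predicted, 3 annotated, 2 in common.
predicted = {(0, 0), (0, 1), (0, 2), (1, 2)}
ground_truth = {(0, 1), (0, 2), (0, 3)}
precision, recall, f_measure = boundary_prf(predicted, ground_truth)
print(precision, recall, f_measure)
```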

IV. Photo effects and style

1. MIT-Adobe FiveK Dataset

We collected 5,000 photographs taken with SLR cameras by a set of different photographers. They are all in RAW format; that is, all the information recorded by the camera sensor is preserved. We made sure that these photographs cover a broad range of scenes, subjects, and lighting conditions. We then hired five photography students in an art school to adjust the tone of the photos. Each of them retouched all the 5,000 photos using a software dedicated to photo adjustment (Adobe Lightroom) on which they were extensively trained. We asked the retouchers to achieve visually pleasing renditions, akin to a postcard. The retouchers were compensated for their work.

MIT-Adobe FiveK dataset: https://data.csail.mit.edu/graphics/fivek/