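[Notes] Introduction to the CIFAR-100 dataset: it has 100 classes, each containing 600 images (500 training images and 100 test images); the 100 classes are grouped into 20 superclasses of 5 classes each (5 × 20 = 100)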

程序猿的探索之路

Last modified 2022-07-30 20:23:21

First published 2022-07-30 20:09:50

Original link: https://blog.csdn.net/hxxjxw/article/details/115529194?ops_request_misc=%257B%2522request%255Fid%2522%253A%2522165918277416780357232168%2522%252C%2522scm%2522%253A%252220140713.130102334.pc%255Fblog.%2522%257D&request_id=165918277416780357232168&biz_id=0&

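This note shows how to convert the CIFAR-100 dataset from its original format into images saved as JPG files. It also provides code that loads the CIFAR-100 training and test sets for image processing and analysis. The dataset contains 100 classes of 600 images each (500 for training, 100 for testing), and the classes are grouped into 20 superclasses.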

Note 1:

Download page for CIFAR-10 and CIFAR-100: CIFAR-10 and CIFAR-100 datasets (https://www.cs.toronto.edu/~kriz/cifar.html)
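For anyone who prefers to script the download, here is a minimal sketch using only the standard library. It assumes the "CIFAR-100 python version" archive from that page is the one wanted; the helper names `fetch_archive` and `extract_archive` are made up for this note and are not part of any library.

```python
import tarfile
import urllib.request
from pathlib import Path

# Assumption: the "CIFAR-100 python version" archive linked from the download page.
CIFAR100_URL = "https://www.cs.toronto.edu/~kriz/cifar-100-python.tar.gz"


def fetch_archive(url, archive_path):
    """Download url to archive_path unless that file already exists."""
    archive_path = Path(archive_path)
    if not archive_path.exists():
        archive_path.parent.mkdir(parents=True, exist_ok=True)
        urllib.request.urlretrieve(url, str(archive_path))
    return archive_path


def extract_archive(archive_path, dest_dir):
    """Unpack a .tar.gz archive into dest_dir and return its top-level entry names."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        names = tar.getnames()
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions: refuse unsafe members such as absolute paths.
            tar.extractall(dest_dir, filter="data")
        else:
            tar.extractall(dest_dir)
    return sorted({name.split("/")[0] for name in names})
```

Calling `extract_archive(fetch_archive(CIFAR100_URL, "../data/cifar-100-python.tar.gz"), "../data")` should leave a `../data/cifar-100-python/` directory holding the `train`, `test` and `meta` files that the code in this note reads.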

Converting CIFAR-100 to images:

import os
import pickle

import cv2
import numpy as np


def unpickle(path):
    # encoding='latin1' decodes the Python 2 pickle so that the dictionary keys are plain str
    with open(path, 'rb') as f:
        return pickle.load(f, encoding='latin1')


def save_images(batch, root_dir, fine_label_names, coarse_label_names):
    for i in range(batch['data'].shape[0]):
        img = np.reshape(batch['data'][i], (3, 32, 32))  # each row of batch['data'] holds 3072 raw pixel values
        img = img.transpose(1, 2, 0)  # (channel, height, width) -> (height, width, channel), RGB order
        img = cv2.cvtColor(np.ascontiguousarray(img), cv2.COLOR_RGB2BGR)  # cv2.imwrite expects BGR
        fine = batch['fine_labels'][i]
        coarse = batch['coarse_labels'][i]
        # img_name: fine_label + coarse_label + fine_class + coarse_class + index
        pic_name = (root_dir + str(fine) + '_' + str(coarse) + '_&' + fine_label_names[fine]
                    + '&_' + coarse_label_names[coarse] + '_' + str(i) + '.jpg')
        cv2.imwrite(pic_name, img)


def cifar100_to_images():
    tar_dir = '../data/cifar-100-python/'  # directory of the extracted original dataset
    train_root_dir = '../data/cifar100/train/'  # output directories for the images
    test_root_dir = '../data/cifar100/test/'
    os.makedirs(train_root_dir, exist_ok=True)
    os.makedirs(test_root_dir, exist_ok=True)

    # Look up the class name for each label: 20 coarse classes, 100 fine classes in total
    meta_dic = unpickle(tar_dir + "meta")
    coarse_label_names = meta_dic['coarse_label_names']
    fine_label_names = meta_dic['fine_label_names']
    print(fine_label_names)

    # Write the training images; for PNG output, just change the file extension.
    data_name = tar_dir + "train"
    print(data_name + " is loading...")
    save_images(unpickle(data_name), train_root_dir, fine_label_names, coarse_label_names)
    print(data_name + " loaded.")

    # Write the test images
    print("test_batch is loading...")
    save_images(unpickle(tar_dir + "test"), test_root_dir, fine_label_names, coarse_label_names)
    print("test_batch loaded.")
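The reshape to (3, 32, 32) followed by transpose(1, 2, 0) works because each data row stores the red plane first, then green, then blue, 1024 values each, row by row. The sketch below checks that on a synthetic row (the values are just 0..3071, not real pixels) and shows the channel flip needed for OpenCV, which writes images in BGR order.

```python
import numpy as np

# Synthetic stand-in for one row of the 'data' array: 3072 values, not real pixels.
row = np.arange(3072, dtype=np.int32)

# First 1024 entries: red plane, next 1024: green, last 1024: blue,
# each plane stored row by row.
chw = row.reshape(3, 32, 32)   # (channel, height, width)
hwc = chw.transpose(1, 2, 0)   # (height, width, channel), RGB order

r, c = 5, 7                    # arbitrary pixel to inspect
offset = r * 32 + c            # position of that pixel inside each plane
print(hwc[r, c])               # values offset, 1024 + offset, 2048 + offset

# OpenCV expects BGR, so reverse the channel axis before calling cv2.imwrite.
bgr = hwc[:, :, ::-1]
print(bgr[r, c])               # same three values in reverse order
```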

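Result: each file is named fine_label+coarse_label+fine_class+coarse_class+index.jpg, with the parts joined by underscores and the fine class name wrapped in & markers.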

Here fine refers to the class and coarse to the superclass.
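Since the labels are encoded in the file name, they can be recovered later without the pickle files. The following is a rough sketch of a parser for that naming scheme; `Cifar100Name` and `parse_name` are hypothetical helpers written for this note, and the sample file name is illustrative rather than copied from an actual run.

```python
import re
from typing import NamedTuple


class Cifar100Name(NamedTuple):
    fine_label: int
    coarse_label: int
    fine_class: str
    coarse_class: str
    index: int


# The '&' markers delimit the fine class name, which may itself contain
# underscores (for example "maple_tree").
_PATTERN = re.compile(r"^(\d+)_(\d+)_&(.+)&_(.+)_(\d+)\.jpg$")


def parse_name(file_name):
    """Split a file name produced by cifar100_to_images() into its fields."""
    match = _PATTERN.match(file_name)
    if match is None:
        raise ValueError(f"unexpected file name: {file_name!r}")
    fine, coarse, fine_class, coarse_class, index = match.groups()
    return Cifar100Name(int(fine), int(coarse), fine_class, coarse_class, int(index))


# Illustrative file name, not copied from an actual run.
example = parse_name("19_11_&cattle&_large_omnivores_and_herbivores_0.jpg")
print(example.fine_class, example.coarse_class, example.index)
```

Pass only the base name (for instance via os.path.basename), because the pattern does not account for directory components.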

Note 2:

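CIFAR-100 is similar to CIFAR-10, but it has 100 classes, each containing 600 images: 500 training images and 100 test images.

The 100 classes are grouped into 20 superclasses, each containing 5 classes (5 × 20 = 100).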
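The arithmetic can be checked directly from the label lists. The sketch below uses synthetic labels with the same shape as the real hierarchy; the regular `f // 5` grouping is an assumption made purely for illustration, since the real dataset does not assign classes to superclasses that way. `group_fine_by_coarse` is a hypothetical helper.

```python
from collections import defaultdict


def group_fine_by_coarse(fine_labels, coarse_labels):
    """Map each coarse label to the sorted list of fine labels seen with it."""
    groups = defaultdict(set)
    for fine, coarse in zip(fine_labels, coarse_labels):
        groups[coarse].add(fine)
    return {coarse: sorted(fines) for coarse, fines in sorted(groups.items())}


# Synthetic labels: 100 classes with 500 training images each.
fine_labels = [f for f in range(100) for _ in range(500)]
# Hypothetical grouping; the real coarse labels come from the dataset files.
coarse_labels = [f // 5 for f in fine_labels]

groups = group_fine_by_coarse(fine_labels, coarse_labels)
print(len(groups))                           # 20 superclasses
print(sum(len(v) for v in groups.values()))  # 100 classes in total
print(len(fine_labels))                      # 50000 training images
```

With the real training file, the two arguments would be the 'fine_labels' and 'coarse_labels' lists from the unpickled dictionary, and every superclass should again map to exactly 5 classes.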

# -*- coding:utf-8 -*-
import os
import pickle as p

import numpy as np
import matplotlib.image as plimg
from PIL import Image


def load_CIFAR_batch(filename):
    """Load one CIFAR-100 file ('train' or 'test')."""
    with open(filename, 'rb') as f:
        datadict = p.load(f, encoding='bytes')
    # For CIFAR-10 the labels are stored under b'labels' and each batch holds 10000 images.
    X = datadict[b'data'].reshape(-1, 3, 32, 32)  # -1 covers both train (50000) and test (10000)
    # One row per image: column 0 is the coarse label, column 1 the fine label.
    # (Joining the two Python lists with + would concatenate them into 100000 entries.)
    Y = np.column_stack((datadict[b'coarse_labels'], datadict[b'fine_labels']))
    return X, Y


if __name__ == "__main__":
    # imgX, imgY = load_CIFAR_batch("./cifar-10-batches-py/data_batch_1")
    imgX, imgY = load_CIFAR_batch("dataset/cifar-100-python/train")
    print(imgX.shape)
    merged_dir = "dataset/cifar-100-python/extract-pic1/"  # merged RGB images
    split_dir = "dataset/cifar-100-python/extract-pic2/"  # one image per colour channel
    os.makedirs(merged_dir, exist_ok=True)
    os.makedirs(split_dir, exist_ok=True)
    print("Saving images:")
    # Only the first 100 images are saved; loop over range(imgX.shape[0]) to export
    # all of them, which takes a while because there are many images.
    for i in range(min(100, imgX.shape[0])):
        imgs = imgX[i]
        i0 = Image.fromarray(imgs[0])
        i1 = Image.fromarray(imgs[1])
        i2 = Image.fromarray(imgs[2])
        img = Image.merge("RGB", (i0, i1, i2))
        name = "img" + str(i) + ".png"
        img.save(merged_dir + name, "png")
        for j in range(imgs.shape[0]):
            img = imgs[j]
            name = "img" + str(i) + str(j) + ".jpg"
            print("Saving image " + name)
            # cmap="gray" stores the channel as grayscale instead of matplotlib's default colormap
            plimg.imsave(split_dir + name, img, cmap="gray")
    print("Done.")