PyTorch MNIST Dataset

1. MNIST Dataset

https://pytorch.org/vision/main/generated/torchvision.datasets.MNIST.html

torchvision.datasets.MNIST(root: Union[str, Path], train: bool = True, transform: Optional[Callable] = None, target_transform: Optional[Callable] = None, download: bool = False)

Parameters:

  • root (str or pathlib.Path) - Root directory of dataset where MNIST/raw/train-images-idx3-ubyte and MNIST/raw/t10k-images-idx3-ubyte exist.

  • train (bool, optional) - If True, creates dataset from train-images-idx3-ubyte, otherwise from t10k-images-idx3-ubyte.

  • download (bool, optional) - If True, downloads the dataset from the internet and puts it in root directory. If dataset is already downloaded, it is not downloaded again.

  • transform (callable, optional) - A function/transform that takes in a PIL image and returns a transformed version. E.g, transforms.RandomCrop

  • target_transform (callable, optional) - A function/transform that takes in the target and transforms it.

Special-members:

__getitem__(index: int) -> Tuple[Any, Any]
  • Parameters: index (int)

  • Returns: (image, target) where target is index of the target class.

  • Return type: tuple

2. Source code for torchvision.datasets.mnist

https://pytorch.org/vision/main/_modules/torchvision/datasets/mnist.html

    mirrors = [
        "http://yann.lecun.com/exdb/mnist/",
        "https://ossci-datasets.s3.amazonaws.com/mnist/",
    ]

    resources = [
        ("train-images-idx3-ubyte.gz", "f68b3c2dcbeaaa9fbdd348bbdeb94873"),
        ("train-labels-idx1-ubyte.gz", "d53e105ee54ea40749a09fcbcd1e9432"),
        ("t10k-images-idx3-ubyte.gz", "9fb629c4189551a2d022fa330f9573f3"),
        ("t10k-labels-idx1-ubyte.gz", "ec29112dd5afa0611ce80d1b7f02629c"),
    ]

    training_file = "training.pt"
    test_file = "test.pt"
    classes = [
        "0 - zero",
        "1 - one",
        "2 - two",
        "3 - three",
        "4 - four",
        "5 - five",
        "6 - six",
        "7 - seven",
        "8 - eight",
        "9 - nine",
    ]

可以通过下面的链接在浏览器中下载,复制到 data/MNIST/raw/ 目录下。

https://ossci-datasets.s3.amazonaws.com/mnist/train-images-idx3-ubyte.gz
https://ossci-datasets.s3.amazonaws.com/mnist/train-labels-idx1-ubyte.gz
https://ossci-datasets.s3.amazonaws.com/mnist/t10k-images-idx3-ubyte.gz
https://ossci-datasets.s3.amazonaws.com/mnist/t10k-labels-idx1-ubyte.gz

2.1. mnist-dataset.py

/home/yongqiang/llm_work/ggml_25_02_15/ggml/examples/mnist/mnist-dataset.py

import torch
import torchvision
import torchvision.datasets
import torchvision.transforms
import numpy as np
import matplotlib.pyplot as plt

print(torch.__version__)

train_data = torchvision.datasets.MNIST(root='./data', train=True, transform=torchvision.transforms.ToTensor(), download=True)
test_data  = torchvision.datasets.MNIST(root='./data', train=False, transform=torchvision.transforms.ToTensor(), download=True)

assert len(train_data) == 60000
assert len(test_data)  == 10000
print("len(train_data) =", len(train_data))
print("len(test_data) =", len(test_data))

print("type(train_data[0]):", type(train_data[0]))

print("train_data[0].shape:", train_data[0][0].shape)

classes = train_data.classes
print("train_data.classes:", classes)
print("train_data.class_to_idx: ", train_data.class_to_idx)

def ImShow(sample_element, shape = (28, 28)):
    plt.imshow(sample_element[0].numpy().reshape(shape), cmap='gray')
    plt.title('Label = ' + str(sample_element[1]))
    plt.show()

ImShow(train_data[0])

(base) yongqiang@yongqiang:~/llm_work/ggml_25_02_15/ggml/examples/mnist$ python mnist-dataset.py
2.5.1
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
Failed to download (trying next):
HTTP Error 404: Not Found

Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-images-idx3-ubyte.gz
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-images-idx3-ubyte.gz to ./data/MNIST/raw/train-images-idx3-ubyte.gz
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 9.91M/9.91M [00:03<00:00, 3.20MB/s]
Extracting ./data/MNIST/raw/train-images-idx3-ubyte.gz to ./data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
Failed to download (trying next):
HTTP Error 404: Not Found

Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-labels-idx1-ubyte.gz
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-labels-idx1-ubyte.gz to ./data/MNIST/raw/train-labels-idx1-ubyte.gz
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████| 28.9k/28.9k [00:00<00:00, 117kB/s]
Extracting ./data/MNIST/raw/train-labels-idx1-ubyte.gz to ./data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
Failed to download (trying next):
HTTP Error 404: Not Found

Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-images-idx3-ubyte.gz
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-images-idx3-ubyte.gz to ./data/MNIST/raw/t10k-images-idx3-ubyte.gz
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████| 1.65M/1.65M [00:02<00:00, 656kB/s]
Extracting ./data/MNIST/raw/t10k-images-idx3-ubyte.gz to ./data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
Failed to download (trying next):
HTTP Error 404: Not Found

Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-labels-idx1-ubyte.gz
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-labels-idx1-ubyte.gz to ./data/MNIST/raw/t10k-labels-idx1-ubyte.gz
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 4.54k/4.54k [00:00<00:00, 3.05MB/s]
Extracting ./data/MNIST/raw/t10k-labels-idx1-ubyte.gz to ./data/MNIST/raw

len(train_data) = 60000
len(test_data) = 10000
type(train_data[0]): <class 'tuple'>
train_data[0].shape: torch.Size([1, 28, 28])
train_data.classes: ['0 - zero', '1 - one', '2 - two', '3 - three', '4 - four', '5 - five', '6 - six', '7 - seven', '8 - eight', '9 - nine']
train_data.class_to_idx:  {'0 - zero': 0, '1 - one': 1, '2 - two': 2, '3 - three': 3, '4 - four': 4, '5 - five': 5, '6 - six': 6, '7 - seven': 7, '8 - eight': 8, '9 - nine': 9}
(base) yongqiang@yongqiang:~/llm_work/ggml_25_02_15/ggml/examples/mnist$

在这里插入图片描述

3. SSL: CERTIFICATE_VERIFY_FAILED

https://github.com/pytorch/pytorch/issues/33288

<urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1007)>

<urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1129)>

<urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1123)>

<urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:852)>

<urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:833)>

<urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:581)>

取消证书验证:

import ssl
ssl._create_default_https_context = ssl._create_unverified_context

References

[1] Yongqiang Cheng, https://yongqiang.blog.csdn.net/

为了在Windows安装ADB工具,你可以按照以下步骤进行操作: 1. 首先,下载ADB工具包并解压缩到你自定义的安装目录。你可以选择将其解压缩到任何你喜欢的位置。 2. 打开运行窗口,可以通过按下Win+R键来快速打开。在运行窗口中输入"sysdm.cpl"并按下回车键。 3. 在系统属性窗口中,选择"高级"选项卡,然后点击"环境变量"按钮。 4. 在环境变量窗口中,选择"系统变量"部分,并找到名为"Path"的变量。点击"编辑"按钮。 5. 在编辑环境变量窗口中,点击"新建"按钮,并将ADB工具的安装路径添加到新建的路径中。确保路径正确无误后,点击"确定"按钮。 6. 返回到桌面,打开命令提示符窗口。你可以通过按下Win+R键,然后输入"cmd"并按下回车键来快速打开命令提示符窗口。 7. 在命令提示符窗口中,输入"adb version"命令来验证ADB工具是否成功安装。如果显示版本信息,则表示安装成功。 这样,你就成功在Windows安装ADB工具。你可以使用ADB工具来执行各种操作,如枚举设备、进入/退出ADB终端、文件传输、运行命令、查看系统日志等。具体的操作方法可以参考ADB工具的官方文档或其他相关教程。\[1\]\[2\]\[3\] #### 引用[.reference_title] - *1* [windows环境安装adb驱动](https://blog.csdn.net/zx54633089/article/details/128533343)[target="_blank" data-report-click={"spm":"1018.2226.3001.9630","extra":{"utm_source":"vip_chatgpt_common_search_pc_result","utm_medium":"distribute.pc_search_result.none-task-cask-2~all~insert_cask~default-1-null.142^v91^insertT0,239^v3^insert_chatgpt"}} ] [.reference_item] - *2* *3* [Windows安装使用ADB简单易懂教程](https://blog.csdn.net/m0_37777700/article/details/129836351)[target="_blank" data-report-click={"spm":"1018.2226.3001.9630","extra":{"utm_source":"vip_chatgpt_common_search_pc_result","utm_medium":"distribute.pc_search_result.none-task-cask-2~all~insert_cask~default-1-null.142^v91^insertT0,239^v3^insert_chatgpt"}} ] [.reference_item] [ .reference_list ]
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

Yongqiang Cheng

梦想不是浮躁,而是沉淀和积累。

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值