Andrew Ng DeepLearning, Course 1 Week 2: Tutorial on Building the Database

Gingkens

Category: AI. Tags: Andrew Ng DeepLearning, database and creation tutorial, neural networks and deep learning

Article link: https://blog.csdn.net/qq_34859482/article/details/79223311

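Why I am writing this post

I took the DeepLearning course on NetEase Cloud, where it comes without the assignments, the helper module files, or the training datasets, so I built the dataset myself. I had never used Python before, so this was quite a headache at the time, and documentation for the modules I needed was hard to find. That is why I decided to write this up for anyone who has run into the same trouble and has few C coins or points to spare. Without further ado, let's begin!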

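1 - Packages

These are the packages we will need; import them first!

I won't explain each package here. If you are interested, search for them yourself and you will find more detail.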

import numpy as np
import matplotlib.pyplot as plt
import h5py
import scipy
from PIL import Image
from scipy import ndimage
from os import walk

# IPython/Jupyter magic: valid only inside a notebook; delete this line in a plain .py script.
%matplotlib inline

2 - Reading images from disk

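def readImg(roots="", label=1):
    """
    roots is the directory path. Put the images in a folder A that sits in the same
    directory as this module, and the folder can then be accessed directly by name.
    label is the class given to every image in that folder (1 = cat, 0 = non-cat).
    """
    images = []  # created once, so the function also works when roots has subfolders or is empty
    for (root, dirs, files) in walk(roots):
        for image in files:
            fname = root + "/" + image
            # ndimage.imread and scipy.misc.imresize were removed from recent SciPy
            # releases, so PIL (imported above) loads and resizes the image instead.
            # convert("RGB") forces 3 channels so that every array has the same shape.
            image = Image.open(fname).convert("RGB")
            # The size can be changed, but it must be identical for all data;
            # (64, 64) is the size used in the course.
            image = image.resize((64, 64), Image.BILINEAR)
            images.append(np.array(image))

    images = np.array(images)
    labels = np.zeros((1, images.shape[0])) + label  # one label per image, shape (1, m)
    labels = labels.astype(int)

    return images, labels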
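The load-and-resize step can be tried on a synthetic image, with no files on disk. This is only a minimal sketch assuming Pillow is installed; `to_fixed_size_array` is a hypothetical helper name, not something from the course.

```python
import numpy as np
from PIL import Image


def to_fixed_size_array(img, size=(64, 64)):
    """Hypothetical helper: force RGB, resize, and return a uint8 array with 3 channels."""
    img = img.convert("RGB")                 # grayscale or RGBA input becomes 3-channel
    img = img.resize(size, Image.BILINEAR)   # PIL takes the size as (width, height)
    return np.array(img)


# A synthetic 200x120 grayscale image stands in for a photo read from disk.
sample = Image.new("L", (200, 120), color=128)
arr = to_fixed_size_array(sample)
print(arr.shape, arr.dtype)
```

Because every image ends up with the same shape, a list of such arrays can be turned into a single array with `np.array(...)`.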

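3 - Merging the data

Use the function written above to read the images on disk that belong to the same DB and merge them.

# The first argument is the name of my folder
images_test_cat, labels_test_cat = readImg("test_set_cat", label=1)
images_test_nocat, labels_test_nocat = readImg("test_set_nocat", label=0)

images = np.vstack((images_test_cat, images_test_nocat))  # stack vertically: images have shape (m, 64, 64, 3)
labels = np.hstack((labels_test_cat, labels_test_nocat))  # stack horizontally: labels have shape (1, m)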
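To see why the images are merged with `vstack` and the labels with `hstack`, the shapes can be checked on synthetic arrays. The sketch below assumes 3 "cat" and 2 "non-cat" samples; the variable names are made up for the illustration.

```python
import numpy as np

# Synthetic stand-ins: 3 "cat" images and 2 "non-cat" images, each 64x64 RGB.
cat_x = np.zeros((3, 64, 64, 3), dtype=np.uint8)
cat_y = np.ones((1, 3), dtype=int)
nocat_x = np.zeros((2, 64, 64, 3), dtype=np.uint8)
nocat_y = np.zeros((1, 2), dtype=int)

x = np.vstack((cat_x, nocat_x))  # joins along axis 0, the sample axis of the images
y = np.hstack((cat_y, nocat_y))  # joins along axis 1, the sample axis of a (1, m) row

print(x.shape)  # (5, 64, 64, 3)
print(y)        # [[1 1 1 0 0]]
```

In both cases the samples are concatenated along the axis that counts them, so image i still lines up with label i.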

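4 - Writing the database

def write_dataset(dbName, images, labels):
    # The with block closes the file even if a write fails, and it avoids the
    # NameError that a finally: f.close() raises when the file cannot be opened.
    with h5py.File(dbName, "w") as f:
        f.create_dataset("test_set_x", data=images)  # the first argument is the name of the dataset
        f.create_dataset("test_set_y", data=labels)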

5 - Reading the database

def read_dataset(dbName, xName, yName):
    # [:] copies the data into memory, so X and Y stay valid after the file is closed.
    with h5py.File(dbName, "r") as f:
        X = f[xName][:]
        Y = f[yName][:]
    return X, Y
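A quick round trip is a simple way to check that what was written is what comes back. The following is a minimal sketch assuming h5py is installed; it writes random data to a temporary file, and the file name `demo_catvnoncat.h5` is made up for the example.

```python
import os
import tempfile

import h5py
import numpy as np

# Random pixels stand in for real images; four samples with labels 1, 1, 0, 0.
images = np.random.randint(0, 256, size=(4, 64, 64, 3), dtype=np.uint8)
labels = np.array([[1, 1, 0, 0]])

path = os.path.join(tempfile.mkdtemp(), "demo_catvnoncat.h5")  # hypothetical file name

with h5py.File(path, "w") as f:
    f.create_dataset("test_set_x", data=images)
    f.create_dataset("test_set_y", data=labels)

with h5py.File(path, "r") as f:
    names = sorted(f.keys())   # dataset names stored in the file
    x_back = f["test_set_x"][:]
    y_back = f["test_set_y"][:]

print(names)
print(np.array_equal(images, x_back), np.array_equal(labels, y_back))
```

Listing `f.keys()` is also handy for inspecting an existing `.h5` file when the dataset names are unknown.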

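6 - The lr_utils.py file

The code of this module file is pasted below; you will need it when doing the assignment. I recommend naming your database files and datasets after this file, so that your variable names match the course's when you complete the assignment. Of course, with a dataset you built yourself, the results you get will differ.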

import numpy as np
import h5py


def load_dataset():
    train_dataset = h5py.File('train_catvnoncat.h5', "r")
    train_set_x_orig = np.array(train_dataset["train_set_x"][:])  # your train set features
    print(train_set_x_orig.shape)
    train_set_y_orig = np.array(train_dataset["train_set_y"][:])  # your train set labels
    print(train_set_y_orig.shape)

    test_dataset = h5py.File('test_catvnoncat.h5', "r")
    test_set_x_orig = np.array(test_dataset["test_set_x"][:])  # your test set features
    test_set_y_orig = np.array(test_dataset["test_set_y"][:])  # your test set labels

    classes = np.array(test_dataset["list_classes"][:])  # the list of classes

    # -1 lets NumPy infer the sample count, so labels stored as (m,) or as (1, m) both become (1, m).
    train_set_y_orig = train_set_y_orig.reshape((1, -1))
    print(train_set_y_orig.shape)
    test_set_y_orig = test_set_y_orig.reshape((1, -1))

    return train_set_x_orig, train_set_y_orig, test_set_x_orig, test_set_y_orig, classes
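Note that `load_dataset` also reads a `list_classes` dataset, which `write_dataset` in section 4 does not create, and that the labels built in section 3 have shape (1, m). The sketch below shows one way to write a file with all three names; `write_split` is a hypothetical helper, and the class order (index 0 = non-cat, index 1 = cat) is an assumption chosen so that the index matches the labels 0 and 1 used above.

```python
import os
import tempfile

import h5py
import numpy as np


def write_split(path, prefix, images, labels):
    """Hypothetical helper: write one split with the dataset names load_dataset() reads."""
    with h5py.File(path, "w") as f:
        f.create_dataset(prefix + "_set_x", data=images)
        # Stored as a 1-D vector of shape (m,); load_dataset reshapes it to (1, m).
        f.create_dataset(prefix + "_set_y", data=np.ravel(labels))
        # Assumption: index 0 = non-cat, index 1 = cat, matching labels 0 and 1.
        f.create_dataset("list_classes", data=np.array([b"non-cat", b"cat"]))


images = np.zeros((5, 64, 64, 3), dtype=np.uint8)  # placeholder pixels
labels = np.array([[1, 1, 1, 0, 0]])               # shape (1, m), as built in section 3

path = os.path.join(tempfile.mkdtemp(), "test_catvnoncat.h5")
write_split(path, "test", images, labels)

with h5py.File(path, "r") as f:
    y = np.array(f["test_set_y"][:])
    classes = np.array(f["list_classes"][:])

y = y.reshape((1, y.shape[0]))  # turn the 1-D label vector into a (1, m) row
print(y.shape, classes)
```

Calling the same helper with the prefix `"train"` and the path `train_catvnoncat.h5` would produce the training file under the same assumptions.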