CNN for Beginners: Start Here
Image classification is not a hard topic anymore. TensorFlow has built-in functionality that takes care of the complex mathematics for us, so we can use a neural network without knowing all of its internal details. In today’s project, I used a Convolutional Neural Network (CNN), which is an advanced version of the plain neural network. It condenses a picture down to a few important features. If you have worked with the FashionMNIST dataset, which contains shirts, shoes, handbags, etc., a CNN will figure out the important portions of the images. For example, if it sees a shoelace, the image might be a shoe; if there are a collar and buttons, it might be a shirt; and if there is a handle, it might be a handbag.
Overview
The simple CNN we will build today to classify a set of images consists of convolutions and pooling. Inputs get modified in the convolution layers. You can stack one or more convolutions depending on your requirements. Inputs go through several filters, and those filters slide across the inputs to learn portions of an input such as the buttons of a shirt, the handle of a handbag, or the lace of a shoe. I am not going much deeper into the theory today, because this article is for beginners.
Pooling is another very important part of a CNN. Pooling works on local regions like convolutions do, but it has no learnable filters; it is a vector-to-scalar transformation. Average pooling simply computes the average of each region, while max pooling keeps the pixel with the highest intensity and discards the rest. A 2 x 2 pooling layer will reduce the size of the feature maps by a factor of 2. Even if you don’t know the mathematical details, you can still solve a deep learning problem, and I will explain every line of code. Nowadays we have such rich libraries that we can do all this amazing work without knowing much math or writing much code. Let’s dive in.
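To make this concrete, here is a tiny, self-contained sketch (the layer sizes are only illustrative, not the exact ones we use later) that passes a fake 150 x 150 RGB image through one convolution and one 2 x 2 max-pooling layer and prints the output shapes:

import numpy as np
import tensorflow as tf

# A single fake RGB image, shaped (batch, height, width, channels)
fake_image = np.random.rand(1, 150, 150, 3).astype('float32')

conv = tf.keras.layers.Conv2D(16, (3, 3), activation='relu')
pool = tf.keras.layers.MaxPooling2D(2, 2)

features = conv(fake_image)   # 16 filters of size 3 x 3 slide over the image
pooled = pool(features)       # 2 x 2 max pooling keeps the strongest pixel in each region

print(features.shape)   # (1, 148, 148, 16): 16 feature maps
print(pooled.shape)      # (1, 74, 74, 16): each feature map shrunk by a factor of 2

Notice how each pooled feature map has half the width and height of the convolution output.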
CNN Development
I used a Google Colab notebook. If you don’t have Anaconda and Jupyter Notebook installed, you can still follow along: Google’s Colaboratory notebooks are available to everyone, and there are plenty of YouTube videos that show how to use Google Colab, so feel free to check those out if Colab is new to you. We will use a dataset that contains images of cats and dogs. Our goal is to develop a convolutional neural network that successfully classifies cats and dogs from a picture. We are using the dataset from Kaggle.
First import all the required packages and libraries.
import os
import zipfile
import random
import shutil
import tensorflow as tf
from tensorflow.keras.optimizers import RMSprop
from tensorflow.keras.preprocessing.image import ImageDataGenerator
It’s time to get our dataset. We will use the command-line utility ‘wget’ to bring the dataset into the notebook. Just a reminder: once your Google Colab notebook’s session is over, you have to download the dataset again. Let’s download the full Cats-v-Dogs dataset, store it as cats-and-dogs.zip, and save it in a directory named ‘tmp’.
!wget --no-check-certificate \
    "https://download.microsoft.com/download/3/E/1/3E1C3F21-ECDB-4869-8368-6DEBA77B919F/kagglecatsanddogs_3367a.zip" \
    -O "/tmp/cats-and-dogs.zip"
Now extract the data from the zip file, which will generate a directory named ‘/tmp/PetImages’ with two subdirectories called Cat and Dog. That’s how the data was originally structured.
local_zip = '/tmp/cats-and-dogs.zip'
zip_ref = zipfile.ZipFile(local_zip, 'r')
zip_ref.extractall('/tmp')
zip_ref.close()
Let’s check the Cat and Dog folders.
print(len(os.listdir('/tmp/PetImages/Cat/')))
print(len(os.listdir('/tmp/PetImages/Dog/')))
As the data is now available, we need to create a directory named cats-v-dogs with training and testing subdirectories, each containing cats and dogs subdirectories for the two classes.
try:
    os.mkdir('/tmp/cats-v-dogs/')
    os.mkdir('/tmp/cats-v-dogs/training/')
    os.mkdir('/tmp/cats-v-dogs/testing/')
    # Class subdirectories that the copy step below will write into
    os.mkdir('/tmp/cats-v-dogs/training/cats/')
    os.mkdir('/tmp/cats-v-dogs/training/dogs/')
    os.mkdir('/tmp/cats-v-dogs/testing/cats/')
    os.mkdir('/tmp/cats-v-dogs/testing/dogs/')
except OSError:
    pass
Now, split the data into training and testing sets and place it in the correct directories with a function called split_data. split_data takes a SOURCE directory containing the files, a TRAINING directory where a slice of the data will be copied to, a TESTING directory where the remaining data will be copied to, and a SPLIT_SIZE that determines where to slice the data.
def split_data(SOURCE, TRAINING, TESTING, SPLIT_SIZE):
    cont = os.listdir(SOURCE)
    lenList = len(cont)
    shuffleList = random.sample(cont, lenList)  # shuffle the file names
    slicePoint = round(len(shuffleList) * SPLIT_SIZE)
    for i in range(0, len(shuffleList[:slicePoint])):
        if os.path.getsize(SOURCE + shuffleList[i]) != 0:  # skip zero-length (corrupt) files
            shutil.copy(os.path.join(SOURCE, shuffleList[i]), TRAINING)
The code block below (still inside split_data) checks that each of the remaining files has a non-zero length and copies it into the TESTING directory.
    for j in range(len(shuffleList[slicePoint:])):
        if os.path.getsize(SOURCE + shuffleList[slicePoint + j]) != 0:
            shutil.copy(os.path.join(SOURCE, shuffleList[slicePoint + j]), TESTING)
The function is ready. Use split_data to split the data in the source directories and copy it over to the training and testing directories.
CAT_SOURCE_DIR = "/tmp/PetImages/Cat/"
TRAINING_CATS_DIR = "/tmp/cats-v-dogs/training/cats/"
TESTING_CATS_DIR = "/tmp/cats-v-dogs/testing/cats/"
DOG_SOURCE_DIR = "/tmp/PetImages/Dog/"
TRAINING_DOGS_DIR = "/tmp/cats-v-dogs/training/dogs/"
TESTING_DOGS_DIR = "/tmp/cats-v-dogs/testing/dogs/"

split_size = .9
split_data(CAT_SOURCE_DIR, TRAINING_CATS_DIR, TESTING_CATS_DIR, split_size)
split_data(DOG_SOURCE_DIR, TRAINING_DOGS_DIR, TESTING_DOGS_DIR, split_size)
Check the number of files in the training and testing directories.
print(len(os.listdir('/tmp/cats-v-dogs/training/cats/')))
print(len(os.listdir('/tmp/cats-v-dogs/training/dogs/')))
print(len(os.listdir('/tmp/cats-v-dogs/testing/cats/')))
print(len(os.listdir('/tmp/cats-v-dogs/testing/dogs/')))
Data preprocessing is done. Here comes the fun part. We will develop a Keras model to classify the cats and dogs. In this model, we will use three convolutional layers, each followed by a max-pooling layer; you can try it with fewer or more convolution layers. We will use the ReLU activation function in these layers and an input_shape of 150 x 150, which reshapes all the images into the same square shape; otherwise, real-world images would come in different sizes and shapes. In the first layer, the filter size is 3 x 3 and the number of filters is 16. Max pooling with a 2 x 2 window condenses the feature maps by a factor of 2. We have two more convolutional layers with different numbers of filters. You can add extra Conv2D and MaxPooling2D layers to observe the results.
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(16, (3, 3), activation='relu', input_shape=(150, 150, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

model.compile(optimizer=RMSprop(lr=0.001),
              loss='binary_crossentropy',
              metrics=['acc'])
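If you want to see this halving in action for the model we just built, you can print a layer-by-layer summary (model.summary() is a standard Keras call):

# Prints each layer's output shape; notice the spatial size shrinking after every MaxPooling2D
model.summary()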
In the compile function, we should pass at least the optimizer and loss parameters. Here the learning rate is 0.001. It is important to choose a reasonable learning rate: one that is too small or too big can make the network inefficient. The next step is to normalize the data.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

base_dir = '/tmp/cats-v-dogs'
TRAINING_DIR = os.path.join(base_dir, 'training')
train_datagen = ImageDataGenerator(rescale=1.0/255)
train_generator = train_datagen.flow_from_directory(TRAINING_DIR,
                                                    batch_size=20,
                                                    class_mode='binary',
                                                    target_size=(150, 150))
ImageDataGenerator helps normalize the pixel values and scales them to between 0 and 1; originally, as you may already know, the values range from 0 to 255. Then we pass our data in batches for training; here we are using a batch_size of 20. We need to normalize the testing data in the same way:
VALIDATION_DIR = os.path.join(base_dir, 'testing')
validation_datagen = ImageDataGenerator(rescale=1.0/255)
validation_generator = validation_datagen.flow_from_directory(VALIDATION_DIR,
                                                              batch_size=20,
                                                              class_mode='binary',
                                                              target_size=(150, 150))
Now train the model. Let’s train it for 15 epochs; please feel free to experiment with more or fewer epochs. You should keep track of four metrics: loss, accuracy, validation loss, and validation accuracy. The loss should go down and the accuracy should go up with every epoch.
history = model.fit_generator(train_generator, epochs=15, verbose=1, validation_data=validation_generator)
I got 89.51% accuracy on the training set and 91.76% accuracy on the validation data. I have to mention one thing here: if the accuracy on the training set is very high and the accuracy on the test or validation set is not that good, that is an overfitting problem. It means the model learned the training dataset so well that it only performs well on that training data and not on other, unseen data. But that’s not our goal; our goal is to develop a model that generalizes to most of the data out there. When you see overfitting, you need to modify the training parameters, for example by using fewer epochs or a different learning rate. We will talk about how to deal with overfitting in a later article.
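One convenient way to keep an eye on those four numbers, and to spot overfitting early, is to plot the training history returned by the fit call. Here is a minimal sketch, assuming matplotlib is available in your Colab runtime (the 'acc' and 'val_acc' keys match the metric name we passed to model.compile):

import matplotlib.pyplot as plt

acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs = range(1, len(acc) + 1)

# Accuracy curves: a large gap between the two lines is a sign of overfitting
plt.plot(epochs, acc, label='Training accuracy')
plt.plot(epochs, val_acc, label='Validation accuracy')
plt.title('Training vs. validation accuracy')
plt.legend()
plt.show()

# Loss curves: validation loss rising while training loss keeps falling is another sign
plt.plot(epochs, loss, label='Training loss')
plt.plot(epochs, val_loss, label='Validation loss')
plt.title('Training vs. validation loss')
plt.legend()
plt.show()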