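Convolutional neural network gesture recognition: How to build a convolutional neural network that recognizes sign language gestures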

by Vagdevi Kommineni

How to build a convolutional neural network that recognizes sign language gestures

Sign language is a major boon for people who are hearing- and speech-impaired, but it serves its purpose only when the other person also understands it. A system that converts an image of a hand gesture into the corresponding English letter would therefore be useful. The aim of this post is to build such an American Sign Language recognition system.

Wikipedia defines ASL as follows:

American Sign Language (ASL) is a natural language that serves as the predominant sign language of Deaf communities in the United States and most of Anglophone Canada.

First, the data. The images in each class should vary in influential factors such as lighting and zoom. The Kaggle data on ASL includes such variants, and training on it exposes the model to the range of appearances each class can take. So, let's work with the Kaggle data.

The dataset consists of images of hand gestures for each letter of the English alphabet. The images within a single class vary: zoomed versions, dim and bright lighting, and so on. Each class has as many as 3000 images. For simplicity, we classify only the "A", "B", and "C" images in this post. Here are links for the full code for training and testing.

We are going to build an AlexNet for this classification task. Since we are training a CNN, make sure computational resources such as a GPU are available.
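Before writing any layers, it can help to check how the feature maps shrink as an image passes through the network. The sketch below traces the spatial sizes for the classic AlexNet layer configuration, assuming a 227 x 227 x 3 input. Both the input size and the exact layer settings are assumptions here; the model in the full code may use different values. The names `conv_output_size`, `ALEXNET_LAYERS`, and `trace_shapes` are made up for this illustration.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1

# (name, kernel, stride, padding, output channels) for the classic AlexNet
# feature extractor (assumed configuration). Pooling layers keep the
# channel count, marked here with None.
ALEXNET_LAYERS = [
    ("conv1", 11, 4, 0, 96),
    ("pool1", 3, 2, 0, None),
    ("conv2", 5, 1, 2, 256),
    ("pool2", 3, 2, 0, None),
    ("conv3", 3, 1, 1, 384),
    ("conv4", 3, 1, 1, 384),
    ("conv5", 3, 1, 1, 256),
    ("pool3", 3, 2, 0, None),
]

def trace_shapes(input_size=227, input_channels=3):
    """Return (name, height, width, channels) after each layer."""
    size, channels = input_size, input_channels
    shapes = []
    for name, kernel, stride, padding, out_channels in ALEXNET_LAYERS:
        size = conv_output_size(size, kernel, stride, padding)
        if out_channels is not None:
            channels = out_channels
        shapes.append((name, size, size, channels))
    return shapes

shapes = trace_shapes()
for name, h, w, c in shapes:
    print(f"{name}: {h} x {w} x {c}")

# Number of features fed to the first fully connected layer.
flattened = shapes[-1][1] * shapes[-1][2] * shapes[-1][3]
print("flattened features:", flattened)
```

Under these assumptions the last pooling layer produces a 6 x 6 x 256 volume, so 9216 features reach the dense layers. For the three-letter task in this post, the final dense layer would have 3 output units instead of the 1000 used in the original AlexNet.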

We start by importing the necessary modules.

import warnings
warnings.filterwarnings("ignore", category=DeprecationWarning)
import os
import cv2
import random
import numpy as np
import keras
from random import shuffle
from keras.utils import np_utils
from shutil import unpack_archive
print("Imported Modules...")

Download the data zip file from Kaggle. Next, select the gesture images for A, B, and C, and split them into training, validation, and test data.
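The split can be sketched without touching any images by working on file names alone. The example below is a minimal illustration, not the code from the full notebook: the function `split_by_class`, the per-class counts, and the convention that a file's class is its first character (for example "A123.jpg") are all assumptions made for the demonstration.

```python
import random

def split_by_class(files, classes, n_train, n_val, n_test, seed=0):
    """Assign up to n_train/n_val/n_test files per class to each split.

    Assumes (hypothetically) that a file's class is its first character,
    e.g. "A123.jpg" belongs to class "A".
    """
    rng = random.Random(seed)
    files = list(files)
    rng.shuffle(files)  # shuffle once so each split gets a random sample
    splits = {"train": [], "val": [], "test": []}
    counts = {c: 0 for c in classes}
    for name in files:
        label = name[0]
        if label not in counts:
            continue  # skip letters we are not classifying
        seen = counts[label]
        if seen < n_train:
            splits["train"].append((name, label))
        elif seen < n_train + n_val:
            splits["val"].append((name, label))
        elif seen < n_train + n_val + n_test:
            splits["test"].append((name, label))
        else:
            continue  # this class already has enough files
        counts[label] += 1
    return splits

# Synthetic file names standing in for the real image folder.
demo_files = [f"{letter}{i}.jpg" for letter in "ABCD" for i in range(10)]
splits = split_by_class(demo_files, classes="ABC", n_train=6, n_val=2, n_test=2)
print({k: len(v) for k, v in splits.items()})
```

Counting per class, rather than slicing the shuffled list once, keeps the three letters balanced in every split even if the folder holds different numbers of images per letter.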

# data folder path
data_folder_path = "asl_data/new"
files = os.listdir(data_folder_path)
# shuffling the images in the folder
for i in range(10):
    shuffle(files)
print("Shuffled Data Files")
# dictionary to maintain numerical labels
class_dic = {"A":0,"B":1,"C":2}
# dictionary to maintain counts
class_count = {'A':0,'B':0,'C':0}
# training lists
X = []
Y = []
# validatio
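
The `class_dic` dictionary above maps each letter to an integer label. A softmax classifier trained with categorical cross-entropy usually expects one-hot targets instead, and in Keras that conversion is commonly done with `to_categorical`. Whether the rest of the original code does exactly this is not visible here, so treat the following as an assumption. The helper `one_hot` is a hypothetical stand-in written in plain Python to show what the encoding looks like.

```python
class_dic = {"A": 0, "B": 1, "C": 2}

def one_hot(label_index, num_classes):
    """Return a list with 1.0 at label_index and 0.0 elsewhere."""
    vec = [0.0] * num_classes
    vec[label_index] = 1.0
    return vec

labels = ["B", "A", "C"]
encoded = [one_hot(class_dic[letter], len(class_dic)) for letter in labels]
print(encoded)
```

Each row has a single 1.0 in the position given by `class_dic`, so "B" becomes `[0.0, 1.0, 0.0]`.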