HDU--数据挖掘

Online Temporal Training Methods Based on Spiking Neural Networks Applications in Gesture Recognition

1. Introduction

Impulsive neural network (SNN) is a promising energy-saving model inspired by the brain. The latest development of training methods makes it possible to successfully achieve deep SNN with low delay on large-scale tasks. In particular, Back Propagation with Proxy Gradient (SG) (BPTT) is widely used to achieve high performance in very few time steps. However, its cost is that the memory consumption of training is large, the optimization lacks theoretical clarity, and it is inconsistent with the online properties of biological learning and the hardware rules of neuromorphology. Other work connects the pulse representation of SNN with the equivalent artificial neural network formula, and trains SNN through the gradient of equivalent mapping to ensure the descending direction. But they can't achieve low latency and can't be online.

In this work, the online time training (OTTT) of SNN is proposed, which originated from BPTT, and realizes timely forward learning by tracking presynaptic activity and using instantaneous loss and gradient. At the same time, it is theoretically analyzed and proved that the OTTT gradient can provide a similar descending direction to the gradient based on pulse representation under feedforward and cyclic conditions. OTTT only needs a constant training memory cost independent of the time step, thus avoiding the large memory cost of BPTT for GPU training.

With the rapid development of computer vision and artificial intelligence technology, gesture recognition, as an important application field in human-computer interaction, has been widely concerned. The traditional Convolutional Neural Network (CNN) performs well in processing static images, but it faces the problems of high energy consumption and delay in processing dynamic gesture recognition tasks. Impulsive neural network (SNN), as a neural network inspired by biological nervous system, has the advantages of processing time series information and low power consumption, so it shows great potential in the field of gesture recognition.

In this experiment, an on-line time training method (OTTT) based on impulsive neural network is studied, aiming at solving the problems that the existing training methods have large memory overhead and do not meet the online learning characteristics. Through theoretical analysis and experimental verification, the effectiveness and superiority of OTTT method in gesture recognition task are proved.

2. Experimental information

2.1 Pulse Neural Network

Impulsive neural network (SNN) is a bionic neural network model, which transmits information through spike between neurons. Unlike traditional artificial neural network (ANN), SNN has the characteristics of event-driven and sparse activation, which makes it have significant advantages in energy efficiency. In recent years, with the progress of training methods, SNN has been widely used in large-scale tasks.

2.2 gesture recognition

Gesture recognition technology can realize human-computer interaction by analyzing hand movements and gestures. Common methods include vision-based gesture recognition and sensor-based gesture recognition. The former uses a camera to capture hand movements, while the latter relies on sensor data on wearable devices. Impulsive neural network has potential application value in gesture recognition because of its advantages in processing time series data.

2.3 Training methods

Traditional SNN training methods mainly include training based on pulse representation and back propagation transit time (BPTT) method. The former is trained by transforming the pulse representation of SNN into an equivalent artificial neural network, but there is a high delay problem; The latter realizes high-performance training through time reverse propagation gradient, but it consumes huge memory.

Spiking Neural Networks (SNNs), as a brain-like model with high energy efficiency, has made great progress in training methods in recent years, which makes it perform well in large-scale tasks and has the advantage of low delay. However, the traditional back propagation transit time method has limited its universality in practical application because of its large memory consumption and inconsistency with biological learning rules.

2.4 data set information

This data set is used to build a real-time gesture recognition system entitled "A low-power, full-event-driven gesture recognition system" presented at the 2017 Computer Vision and Pattern Recognition Conference (CVPR). Data is recorded by DVS128. The data set contains 11 hand gestures of 29 subjects under 3 lighting conditions, and it is published under the license of Creative Commons Attribution 4.0. The required disk space is 3 GB of tar file and 5 GB of decompressed data.

1: Clap your hands 2: right hand swing 3: left hand swing 4: right arm clockwise 5: right arm counterclockwise 6: left arm clockwise 7: left arm counterclockwise 8: arm rolling 9: air drum 10: air guitar 11: other gestures.

3. Experimental results and analysis

3.1 Data Set and Pretreatment

DVS Gesture data set is a standard data set for dynamic gesture recognition, which contains 11 different gesture types. Gesture data in data set is captured by dynamic vision sensor (DVS) to generate event stream data. I standardized the data and used data enhancement technology to improve the robustness of the model.

DVS Gesture data set is a gesture data set captured by dynamic vision sensor (DVS), which contains 11 different types of dynamic gestures. These gesture data are recorded in the form of event stream, which reflects the dynamic changes of gestures. The data preprocessing steps include:

Event normalization: event timestamp and pixel position are normalized, so that they can be processed on a unified scale. Data enhancement: data enhancement techniques such as rotation, translation and noise addition are adopted to increase the robustness and generalization ability of the model. Training set and test set are divided: the data set is divided into training set and test set according to the ratio of 7:3 to ensure the fairness of model training and evaluation.

3.2 Model training

The impulsive neural network based on OTTT method is implemented and trained in a single GPU environment. During the training process, the training loss and accuracy of each epoch, as well as the loss and accuracy on the test set, are recorded. Training parameters include learning rate, batch size and training rounds, etc. The specific training parameters are as follows:

3.3 Analysis of results

3.3.1 Loss and accuracy of training and testing

:

Training once

Training twice

Training three times

OTTT(Online Training Through Time)

Training times

one

two

three

Training loss

1.3127

1.0802

0.9585

Training accuracy

52.12%

60.96%

65.56%

Test loss

1.5077

0.9501

0.8794

Test accuracy

43.40%

70.13%

75.97%

Maximum test accuracy

53.12%

70.13%

70.14%

Total time consumption

1063

1059

1077

3.3.2 Analysis of experimental data

In the first training, the training loss of the model is loss: 1.5077; top1: 43.4028 |top5: 92.3611; The training loss is 1.3170308, and the training accuracy is 0.55703.00000000505 The test loss is 1.5077284174368, and the test accuracy rate is 0.507778886 The highest test accuracy is 0.53125, and the total training time is 1063 seconds.

In the second training, the training loss of the model is loss: 0.9501; top1: 70.1389|top5:98.9583; Training loss is 1.0887546, and the training accuracy rate is 0.6804; The test loss is 0.9501072549157672, and the test accuracy rate is 0.50072.5008989895 The highest test accuracy is 0.701288888888, and the total training time is 1059 seconds.

In the third training, the training loss of the model is LOSS: 0.8794; top1:65.9722  |top5:98.9583; The training loss is 0.958548518, and the training accuracy is 0.65592. The test loss is 0.879426047205925, and the test accuracy rate is 0.65942.000000606 The highest test accuracy is 0.701388888888, and the total training time is 1077 seconds.

Training loss and accuracy: the training loss is obviously reduced and the training accuracy is obviously improved, which shows that the model has a good fitting effect on training data.

Test loss and accuracy: the test loss is obviously reduced, and the test accuracy is significantly improved, indicating that the generalization ability of the model to unknown data has been improved.

With the increase of training times, both training loss and test loss decreased significantly. It can be seen that the model is constantly optimized and more effective feature representation is gradually learned. During the fourth training, the training loss decreased to 0.9585 and the test loss decreased to 0.8794, which was the best performance of the three experiments.

Training accuracy and test accuracy also increase with the increase of training times. In the fourth training, the training accuracy reached 65.56% and the testing accuracy reached 75.97%. This shows that the model not only performs well on the training set, but also maintains a high generalization ability on the test set.

With the training, the loss of the model is gradually reduced and the accuracy is gradually improved, which shows that the performance of the model is significantly improved in the process of continuous learning and optimization. Especially in the first few epoch, the test accuracy has been greatly improved, which shows that OTTT method has strong learning ability in the initial stage.

3.5 Memory consumption and training time

An obvious advantage of OTTT method is that its memory consumption is constant and does not increase with the increase of time steps. Compared with the traditional BPTT method, OTTT method is superior in memory usage. In the single GPU environment, the memory usage of OTTT method is kept within a reasonable range, which significantly reduces the hardware requirements in the training process.

In terms of training time, the OTTT method avoids the gradient accumulation of long time series in the process of back propagation, and the training speed is faster. The experimental results show that under the same training conditions, the training time of OTTT method is reduced by about 30% compared with that of BPTT method.

3.6 Analysis and discussion of results

(1) Advantages

High accuracy: the accuracy of the model proposed in this paper on the test set reached 70.14%.

Lower loss: the loss of training and testing decreased significantly, indicating that the optimization effect of the model is good.

Good generalization ability: the test accuracy is high, which shows that the model not only performs well on the training set, but also on the test set.

(2) Disadvantages

Long training time: the long training time of the model will affect the efficiency in practical application.

Large consumption of computing resources: more computing resources need to be consumed in the training process, especially when the number of training times increases.

(3) Several key factors affecting the performance of the model:

① Training times: Appropriate training times can improve the model performance, and too much training may lead to over-fitting. Because of the time problem, only four experiments have been done in this experiment, and its correct rate still needs to be improved.

② Scale and quality of data set: The scale and quality of data set directly affect the training effect and generalization ability of the model.

③ Adjustment of model structure and parameters: The structural complexity and parameter setting of the model, as well as the adjustment of learning rate, batch size and other parameters also have a significant impact on performance.

4. Conclusion

The experimental results show that OTTT method performs well in gesture recognition tasks on DVS Gesture data sets. Compared with the traditional BPTT method, OTTT method not only has obvious advantages in memory usage and training time, but also performs well in classification accuracy and robustness. These advantages make OTTT method have a wide range of potential in practical applications, especially in scenes that require real-time processing and low power consumption.

Through this study, I verified the effectiveness of OTTT method in online time training, and showed its application prospect in the field of gesture recognition. Future research can further optimize the parameter setting of OTTT method, explore its applicability in other time series tasks, and combine with hardware implementation to enhance the practical application value of the model.

  • 22
    点赞
  • 25
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值