human action recognition

卑微小鹿

Last modified: 2024-09-05 11:45:30

Tags: tensorflow

First published: 2024-09-05 11:27:49

Article link: https://blog.csdn.net/qq_56618414/article/details/141817309

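Human activity recognition (HAR) uses artificial intelligence (AI) to identify human activities from the raw data produced by activity-recording devices such as smartwatches. When a person performs an action, the sensors they wear (smartwatches, wristbands, dedicated devices, and so on) produce signals. The sensors that collect this information include accelerometers, gyroscopes, and magnetometers. HAR has a wide range of applications, from assisting patients and people with disabilities to fields such as gaming that rely heavily on analyzing motor skills. HAR techniques can be roughly divided into two categories: fixed sensors and mobile sensors. In this article, we use raw data produced by mobile sensors to recognize human activities.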

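In this article, I will use LSTM (Long Short-Term Memory) and CNN (Convolutional Neural Network) layers to recognize the following human activities:

Downstairs
Upstairs
Jogging
Sitting
Standing
Walking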

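Overview

Classical machine learning approaches to HAR rely heavily on heuristic, hand-crafted feature extraction. Here we learn end to end instead, which removes the manual feature-extraction step.

The model I use is a deep neural network that combines LSTM and CNN layers. It extracts activity features and classifies them using only its learned parameters.

We use the WISDM dataset, which contains 1,098,209 samples in total. After training, the model reaches an F1 score of 0.96 on the training set and 0.89 on the test set.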

Importing libraries

First, we import all the libraries we will need.

from pandas import read_csv, unique

import numpy as np

from scipy.interpolate import interp1d
from scipy.stats import mode

from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import classification_report, confusion_matrix, ConfusionMatrixDisplay

from tensorflow import stack
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, GlobalAveragePooling1D, BatchNormalization, MaxPool1D, Reshape, Activation
from tensorflow.keras.layers import Conv1D, LSTM
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping
import matplotlib.pyplot as plt
%matplotlib inline

import warnings
warnings.filterwarnings("ignore")

We use scikit-learn, TensorFlow, Keras, SciPy, and NumPy to build the model and preprocess the data, pandas to load the data, and matplotlib to visualize it.

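Loading and visualizing the dataset

The WISDM data was recorded by the accelerometer of a mobile device carried at each participant's waist. Data collection was supervised by a person, which helps ensure the quality of the data. The file we use is WISDM_ar_v1.1_raw.txt. With pandas, the dataset can be loaded into a DataFrame as follows: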

def convert_to_float(x):
    ## convert a value to float64; return NaN if it cannot be parsed
    try:
        return np.float64(x)
    except (ValueError, TypeError):
        return np.nan

def read_data(filepath):
    df = read_csv(filepath, header=None, names=['user-id',
                                               'activity',
                                               'timestamp',
                                               'X',
                                               'Y',
                                               'Z'])
    ## remove the trailing ';' from the last column (Z)
    df['Z'] = df['Z'].replace(regex=True, to_replace=r';', value=r'')
    ## convert the column to float; values that cannot be parsed become NaN
    df['Z'] = df['Z'].apply(convert_to_float)
#     df.dropna(axis=0, how='any', inplace=True)
    return df

df = read_data('Dataset/WISDM_ar_v1.1/WISDM_ar_v1.1_raw.txt')
df
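
To make the parsing step concrete, here is a minimal sketch on two made-up lines in the same six-column layout, each ending in ';'. The sample values and the names `sample`, `cols`, and `demo` are illustrative assumptions, and `pd.to_numeric` is used here as an alternative to the `convert_to_float` helper, not as the article's method.

```python
import io
import pandas as pd

# Two made-up lines in the raw file's layout (illustrative values only).
sample = (
    "33,Jogging,49105962326000,-0.69,12.68,0.50;\n"
    "33,Jogging,49106062271000,5.01,11.26,;\n"
)
cols = ['user-id', 'activity', 'timestamp', 'X', 'Y', 'Z']
demo = pd.read_csv(io.StringIO(sample), header=None, names=cols)

# Strip the trailing ';' and parse; entries that are not numbers become NaN.
demo['Z'] = pd.to_numeric(demo['Z'].astype(str).str.rstrip(';'), errors='coerce')
print(demo['Z'].tolist())
```

The second line has nothing before the ';', so its Z value ends up as NaN, which is the kind of gap the interpolation step later deals with.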

plt.figure(figsize=(15, 5))
df['activity'].value_counts().plot(kind='bar',
                                   title='Training examples by Activity Types')
plt.xlabel('Activity Type')
plt.ylabel('Training examples')
plt.show()## number of training examples per activity type

plt.figure(figsize=(15, 5))
df['user-id'].value_counts().plot(kind='bar',
                                  title='Training examples by user')
plt.xlabel('User')
plt.ylabel('Training examples')
plt.show()## number of training examples per user

Next, I visualize the accelerometer data collected on the three axes.

def axis_plot(ax, x, y, title):
    ax.plot(x, y, 'r')## plot the x and y data points as a red ('r') line
    ax.set_title(title)
    ax.xaxis.set_visible(False)## hide the x-axis
    ## set the y-axis and x-axis ranges
    ax.set_ylim([min(y) - np.std(y), max(y) + np.std(y)])
    ax.set_xlim([min(x), max(x)])
    ax.grid(True)

for activity in df['activity'].unique():## loop over each activity type
    limit = df[df['activity'] == activity][:180]## take the first 180 rows
    fig, (ax0, ax1, ax2) = plt.subplots(nrows=3, sharex=True, figsize=(15, 10))
    axis_plot(ax0, limit['timestamp'], limit['X'], 'x-axis')
    axis_plot(ax1, limit['timestamp'], limit['Y'], 'y-axis')
    axis_plot(ax2, limit['timestamp'], limit['Z'], 'z-axis')
    plt.subplots_adjust(hspace=0.2)
    fig.suptitle(activity)
    plt.subplots_adjust(top=0.9)
    plt.show()

(To save space, I show only one of the plots here.)

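Data preprocessing

Data preprocessing is an important step that lets the model make better use of the raw data. The preprocessing steps used here are:

Label encoding
Linear interpolation
Data splitting
Normalization
Time-series segmentation
One-hot encoding

Label encoding

Because the model cannot take non-numeric labels as input, we add a new column named 'activityEncode' that holds the encoded labels of the 'activity' column. The labels are converted to the numeric labels shown below (these are the labels we want to predict):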

Downstairs [0]
Jogging [1]
Sitting [2]
Standing [3]
Upstairs [4]
Walking [5]

label_encode = LabelEncoder()
df['activityEncode'] = label_encode.fit_transform(df['activity'].values.ravel())
df
## encode the activity types in the 'activity' column as integers and store them in a new column, 'activityEncode'
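
As a small illustration of how the codes are assigned, the sketch below runs `LabelEncoder` on a toy list of activity names (not the dataset itself). The classes are sorted alphabetically, and each code is the index of the class in that sorted list.

```python
from sklearn.preprocessing import LabelEncoder

# A few activity names in arbitrary order (a toy sample, not the dataset).
activities = ['Walking', 'Jogging', 'Sitting', 'Walking', 'Downstairs']

encoder = LabelEncoder()
codes = encoder.fit_transform(activities)

# Classes are sorted alphabetically; each code indexes into that list.
print(encoder.classes_.tolist())                    # ['Downstairs', 'Jogging', 'Sitting', 'Walking']
print(codes.tolist())                               # [3, 1, 2, 3, 0]
print(encoder.inverse_transform([0, 3]).tolist())   # ['Downstairs', 'Walking']
```

`inverse_transform` is handy later for turning predicted class indices back into activity names.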

df.isna().sum()
## count the missing values: the Z column has one missing value (NaN)
## the next step fills in this NaN

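Linear interpolation

Linear interpolation lets us fill in missing values (NaN) that arise during data collection instead of losing those rows. Although this dataset contains only one NaN value, we implement the step here for the sake of demonstration.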

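interpolation_fn = interp1d(df['activityEncode'] ,df['Z'], kind='linear')
interpolation_fn
## linear interpolation; note that this interpolates Z as a function of the encoded activity label, not of time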
<scipy.interpolate._interpolate.interp1d at 0x2aeb70ef9a0>

null_list = df[df['Z'].isnull()].index.tolist()
null_list
## show the positions of the NaN values
[343416]

for i in null_list:
    y = df['activityEncode'][i]
    value = interpolation_fn(y)
    df['Z']=df['Z'].fillna(value)
    print(value)
    ## fill the NaN with the interpolated value
4.75

df.isna().sum()
## check again for missing values
user-id           0
activity          0
timestamp         0
X                 0
Y                 0
Z                 0
activityEncode    0
dtype: int64
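
For comparison, a common alternative is to interpolate along the row order of the recording, so that a gap is filled from its neighbouring readings. The sketch below uses pandas `Series.interpolate` on made-up values. It is an illustration of that alternative, not the method used above, and it would not necessarily reproduce the 4.75 printed earlier.

```python
import numpy as np
import pandas as pd

# Illustrative values only; the gap sits between two valid readings.
z = pd.Series([0.5, np.nan, 1.5, 2.0])

# Linear interpolation over the row order: the gap becomes the midpoint
# of its two neighbours.
z_filled = z.interpolate(method='linear')
print(z_filled.tolist())  # [0.5, 1.0, 1.5, 2.0]
```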

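Data splitting

We split the data by user id so that no user's data ends up in both sets. Users with an id less than or equal to 27 go into the training set, and the rest go into the test set.

## training split: users with id <= 27; test split: users with id > 27
## Training set: used to train the model.
## Validation set: used to tune hyperparameters and evaluate the model during development.
## Test set: used to evaluate the final performance, usually after training and validation.
## .copy() gives each split its own data, so the column assignments made later do not act on a view of df
df_test = df[df['user-id'] > 27].copy()
df_train = df[df['user-id'] <= 27].copy()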
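
A quick way to convince yourself that a user-based split is clean is to check that the two sets share no user ids. Here is a minimal sketch on a toy frame; the ids and values are made up, and `toy`, `toy_train`, and `toy_test` are hypothetical names.

```python
import pandas as pd

# Toy frame with made-up user ids; only the 'user-id' column matters here.
toy = pd.DataFrame({'user-id': [1, 1, 27, 28, 28, 36],
                    'X': [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]})

toy_train = toy[toy['user-id'] <= 27].copy()
toy_test = toy[toy['user-id'] > 27].copy()

# No user may contribute rows to both splits.
shared_users = set(toy_train['user-id']) & set(toy_test['user-id'])
print(len(toy_train), len(toy_test), shared_users)  # 3 3 set()
```

Splitting by user rather than by row means the test score reflects how the model does on people it has never seen.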

Normalization

Before training, the features need to be normalized to the range 0 to 1. We use min-max scaling, (x - min) / (max - min):

df_train['X'] = (df_train['X']-df_train['X'].min())/(df_train['X'].max()-df_train['X'].min())
df_train['Y'] = (df_train['Y']-df_train['Y'].min())/(df_train['Y'].max()-df_train['Y'].min())
df_train['Z'] = (df_train['Z']-df_train['Z'].min())/(df_train['Z'].max()-df_train['Z'].min())
df_train
## apply min-max normalization

df_train['activityEncode'].value_counts()
## count the samples of each activity type


5    314341
1    262651
4     90906
0     71436
2     41932
3     32157
Name: activityEncode, dtype: int64
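
This article scales each split with its own minimum and maximum. A frequently used variant fits the minimum and maximum on the training data and reuses them for the test data, so both splits share one scale. The sketch below shows that variant on made-up numbers; `fit_min_max` and `apply_min_max` are hypothetical helpers, not part of the code above.

```python
import numpy as np

def fit_min_max(values):
    """Return the (min, max) pair of a 1-D array (hypothetical helper)."""
    values = np.asarray(values, dtype=np.float64)
    return values.min(), values.max()

def apply_min_max(values, lo, hi):
    """Scale values with a previously fitted (lo, hi) pair."""
    values = np.asarray(values, dtype=np.float64)
    return (values - lo) / (hi - lo)

train_x = np.array([-2.0, 0.0, 6.0])   # made-up training readings
test_x = np.array([2.0, 8.0])          # made-up test readings

lo, hi = fit_min_max(train_x)
train_scaled = apply_min_max(train_x, lo, hi)
test_scaled = apply_min_max(test_x, lo, hi)
print(train_scaled.tolist())  # [0.0, 0.25, 1.0]
print(test_scaled.tolist())   # [0.5, 1.25]
```

With this variant, test values outside the training range fall outside [0, 1], as the 1.25 shows.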

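Time-series segmentation

Because we are working with time-series data, we need a function that cuts the recording into fixed-length segments and assigns a label to each one. The function below produces the features (x_train) and the labels (y_train), grouping every 80 time steps into one segment.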

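def segments(df, time_steps, step, label_name):
    N_FEATURES = 3
    segments = []
    labels = []
    for i in range(0, len(df) - time_steps, step):
        ## slice one window from each axis
        xs = df['X'].values[i:i+time_steps]
        ys = df['Y'].values[i:i+time_steps]
        zs = df['Z'].values[i:i+time_steps]

        ## label the window with its most frequent activity (the mode);
        ## np.atleast_1d copes with both the array and the scalar result returned by different SciPy versions
        label = np.atleast_1d(mode(df[label_name].values[i:i+time_steps])[0])[0]
        ## append the window and its label to the lists
        segments.append([xs, ys, zs])
        labels.append(label)
    ## convert the list to a NumPy array of shape (number of segments, time_steps, N_FEATURES);
    ## reshape keeps the memory order, so each segment still holds all X values, then all Y, then all Z
    reshaped_segments = np.asarray(segments, dtype=np.float32).reshape(-1, time_steps, N_FEATURES)
    labels = np.asarray(labels)

    return reshaped_segments, labels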


TIME_PERIOD = 80
STEP_DISTANCE = 40
LABEL = 'activityEncode'
x_train, y_train = segments(df_train, TIME_PERIOD, STEP_DISTANCE, LABEL)
print(x_train.shape,y_train.shape)
## 50% overlap between consecutive windows


(20334, 80, 3) (20334,)
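
The number of segments follows directly from the loop `range(0, len(df) - time_steps, step)`. As a sanity check, the sketch below recomputes it from the per-activity counts of the training set printed earlier; `count_windows` is a hypothetical helper introduced only for this check.

```python
def count_windows(n_rows, time_steps, step):
    """Number of start indices produced by range(0, n_rows - time_steps, step)."""
    return len(range(0, n_rows - time_steps, step))

# Rows in the training split: the sum of the per-activity counts printed earlier.
n_train_rows = 314341 + 262651 + 90906 + 71436 + 41932 + 32157
n_windows = count_windows(n_train_rows, time_steps=80, step=40)
print(n_train_rows, n_windows)  # 813423 20334
```

A step of 40 with a window of 80 means each window shares half of its samples with the next one.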

With this, x_train and y_train have the following shapes:

print('x_train shape:', x_train.shape) 
print('Training samples:', x_train.shape[0]) 
print('y_train shape:', y_train.shape) 
 
x_train shape: (20334, 80, 3) 
Training samples: 20334 
y_train shape: (20334,)

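We also store a few values that are used later: the length of a time period (time_period), the number of sensor channels (sensors), and the number of classes (num_classes).

## Input and Output Dimensions
# length of each segment (number of time steps) and number of sensor channels
time_period, sensors = x_train.shape[1], x_train.shape[2]

# number of classes
num_classes = label_encode.classes_.size

# print all class labels
print(list(label_encode.classes_))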

['Downstairs', 'Jogging', 'Sitting', 'Standing', 'Upstairs', 'Walking']

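Finally, each segment is flattened into a vector of length 240 (80 time steps × 3 axes) before being passed to Keras.

## reshaping data
input_shape = time_period * sensors
x_train = x_train.reshape(x_train.shape[0], input_shape)
print("Input Shape: ", input_shape)
print("Input Data Shape: ", x_train.shape)
# x_train is now 2-D, but the LSTM layers later expect 3-D input (samples, time steps, features)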

Input Shape:  240
Input Data Shape:  (20334, 240)
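
It helps to know what order the 240 values are in. The segmentation code appends `[xs, ys, zs]` and then calls `reshape`, which changes the shape but not the memory order. The sketch below uses one tiny window with made-up values to show the resulting layout, next to `np.stack(..., axis=-1)`, which would instead give one row per time step.

```python
import numpy as np

# One tiny window of 4 time steps with easily recognisable made-up values.
xs = np.array([1, 2, 3, 4])
ys = np.array([10, 20, 30, 40])
zs = np.array([100, 200, 300, 400])

window = np.asarray([xs, ys, zs])            # shape (3, 4): one row per axis
reshaped = window.reshape(4, 3)              # same memory order, new shape
flat = reshaped.reshape(-1)                  # layout of the flattened model input
per_step = np.stack([xs, ys, zs], axis=-1)   # shape (4, 3): one row per time step

print(flat.tolist())         # [1, 2, 3, 4, 10, 20, 30, 40, 100, 200, 300, 400]
print(reshaped[0].tolist())  # [1, 2, 3]
print(per_step[0].tolist())  # [1, 10, 100]
```

So each flattened vector holds the 80 X readings first, then the 80 Y readings, then the 80 Z readings, and the LSTM reads them as one sequence of 240 single-feature steps.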



Finally, all data is converted to float32.
x_train = x_train.astype('float32')
y_train = y_train.astype('float32')

One-hot encoding

This is the last preprocessing step: we one-hot encode the labels and store the result in y_train_hot.

y_train_hot = to_categorical(y_train, num_classes)
# one-hot encode y_train
print(y_train_hot)
print("y_train shape: ", y_train_hot.shape)
# input_shape=(x_train,1)
# print(input_shape.shape)
# input_shape=(input_shape, 1):
# the first value (input_shape) is the number of time steps, i.e. the sequence length.
# 1 is the number of features per time step.



[[0. 0. 0. 0. 0. 1.]
 [0. 0. 0. 0. 0. 1.]
 [0. 0. 0. 0. 0. 1.]
 ...
 [0. 0. 1. 0. 0. 0.]
 [0. 0. 1. 0. 0. 0.]
 [0. 0. 1. 0. 0. 0.]]
y_train shape:  (20334, 6)
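
To see what one-hot encoding does without Keras, here is an equivalent NumPy sketch. The helper `one_hot` is hypothetical and only mirrors the idea behind `to_categorical`: row i is all zeros except for a 1 in the column given by label i.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Row i is all zeros except for a 1 at column labels[i]."""
    labels = np.asarray(labels, dtype=np.int64)
    return np.eye(num_classes, dtype=np.float32)[labels]

encoded = one_hot([5, 2, 0], num_classes=6)
print(encoded)
print(np.argmax(encoded, axis=1).tolist())  # [5, 2, 0]
```

`np.argmax` along axis 1 undoes the encoding, which is how class indices are recovered from one-hot targets and from softmax outputs.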

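Model

The model is a sequential model with 8 layers, not counting the Reshape and Activation layers. The first two layers are LSTM layers with 32 units each, using ReLU as the activation function. They are followed by convolutional layers that extract spatial features.

Where the two parts meet, the LSTM output has to be reshaped: it has 3 dimensions (samples, time steps, features), and a Reshape layer turns it into a 4-dimensional tensor (samples, 1, time steps, features) before the first convolution.

The first convolutional layer has 64 filters and the second has 192. Between them, a max-pooling layer performs downsampling. A global average pooling (GAP) layer then converts the multi-dimensional feature maps into a 1-D feature vector; because this layer has no parameters, it keeps the total parameter count down. It is followed by a batch normalization (BN) layer, which helps the model converge.

The final layer is the output layer: a fully connected layer with 6 neurons and a softmax activation, which gives the probability of each class.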
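
The sequence lengths that appear in the Reshape layers of the model code (120 and 29) can be checked by hand. The sketch below applies the usual output-length formulas, assuming the defaults of the Keras layers (Conv1D with 'valid' padding, and a MaxPool1D stride equal to the pool size). The helper names are hypothetical.

```python
import math

def conv1d_out_len(length, kernel_size, stride):
    """Output length of a 1-D convolution with 'valid' padding."""
    return (length - kernel_size) // stride + 1

def pool1d_same_out_len(length, stride):
    """Output length of 1-D pooling with 'same' padding."""
    return math.ceil(length / stride)

steps = 240                                                   # flattened input length
after_conv1 = conv1d_out_len(steps, kernel_size=2, stride=2)   # first Conv1D
after_pool = pool1d_same_out_len(after_conv1, stride=4)        # MaxPool1D, pool_size=4
after_conv2 = conv1d_out_len(after_pool, kernel_size=2, stride=1)  # second Conv1D
print(after_conv1, after_pool, after_conv2)  # 120 30 29
```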

model = Sequential()
model.add(LSTM(32, return_sequences=True, input_shape=(input_shape,1), activation='relu'))
print("input_shape:", input_shape)
# Why 240? Each segment is 80 time steps x 3 axes, flattened to 240 values
# The 1 is the number of features per time step
model.add(LSTM(32,return_sequences=True, activation='relu'))
model.add(Reshape((1, 240, 32)))
model.add(Conv1D(filters=64,kernel_size=2, activation='relu', strides=2))
model.add(Reshape((120, 64)))
model.add(MaxPool1D(pool_size=4, padding='same'))
model.add(Conv1D(filters=192, kernel_size=2, activation='relu', strides=1))
model.add(Reshape((29, 192)))
model.add(GlobalAveragePooling1D())
model.add(BatchNormalization(epsilon=1e-06))
model.add(Dense(6))
model.add(Activation('softmax'))

print(model.summary())
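# This code defines a Keras Sequential model made of LSTM layers, convolutional layers, a pooling layer, a global average pooling layer, a batch normalization layer, and a fully connected layer. Each layer and the meaning of its numbers are explained below:
# 1. LSTM layer
# model.add(LSTM(32, return_sequences=True, input_shape=(input_shape,1), activation='relu'))
# LSTM(32, return_sequences=True): adds an LSTM layer with 32 units. return_sequences=True makes the layer return the output at every time step instead of only the last one.
# input_shape=(input_shape,1): the input shape is (input_shape, 1), where input_shape is the number of time steps and 1 is the number of features per time step.
# 2. Second LSTM layer
# model.add(LSTM(32, return_sequences=True, activation='relu'))
# LSTM(32, return_sequences=True, activation='relu'): another LSTM layer with 32 units that also returns the output at every time step. activation='relu' sets the activation function to ReLU.
# 3. Reshape layer
# model.add(Reshape((1, 240, 32)))
# Reshape((1, 240, 32)): changes the shape from (batch_size, timesteps, features) to (batch_size, 1, 240, 32) to prepare the data for the convolutional layer.
# 4. Conv1D layer
# model.add(Conv1D(filters=64, kernel_size=2, activation='relu', strides=2))
# Conv1D(filters=64, kernel_size=2, activation='relu', strides=2): a 1-D convolution with 64 filters, kernel size 2, ReLU activation, and stride 2, so the kernel moves 2 time steps at a time.
# 5. Reshape layer
# model.add(Reshape((120, 64)))
# Reshape((120, 64)): changes the shape to (batch_size, 120, 64), dropping the extra axis added before the convolution.
# 6. MaxPool1D layer
# model.add(MaxPool1D(pool_size=4, padding='same'))
# MaxPool1D(pool_size=4, padding='same'): 1-D max pooling with a window of 4. The stride defaults to the pool size, so the length drops from 120 to 30; padding='same' pads the edges when the length is not a multiple of the stride.
# 7. Second Conv1D layer
# model.add(Conv1D(filters=192, kernel_size=2, activation='relu', strides=1))
# Conv1D(filters=192, kernel_size=2, activation='relu', strides=1): another 1-D convolution with 192 filters, kernel size 2, ReLU activation, and stride 1.
# 8. Reshape layer
# model.add(Reshape((29, 192)))
# Reshape((29, 192)): sets the shape to (batch_size, 29, 192).
# 9. GlobalAveragePooling1D layer
# model.add(GlobalAveragePooling1D())
# GlobalAveragePooling1D(): averages each feature map over the time axis. The output shape is (batch_size, features), where features is the number of feature maps (192 here).
# 10. BatchNormalization layer
# model.add(BatchNormalization(epsilon=1e-06))
# BatchNormalization(epsilon=1e-06): normalizes each batch during training. epsilon is a small constant that prevents division by zero.
# 11. Dense layer
# model.add(Dense(6))
# Dense(6): a fully connected layer with 6 output neurons, used for the final classification output.
# 12. Activation layer
# model.add(Activation('softmax'))
# Activation('softmax'): applies softmax to turn the outputs into a probability distribution, suitable for multi-class classification.
# Summary
# LSTM: extracts features from the time-series data.
# Reshape: adjusts the data shape to fit the following layer.
# Conv1D: extracts local features.
# MaxPool1D: shortens the feature maps and reduces the amount of computation.
# GlobalAveragePooling1D: applies global average pooling to each feature map.
# BatchNormalization: normalizes the data and speeds up training.
# Dense and Activation: produce the final classification output and the predicted probabilities.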
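
As a rough cross-check of `model.summary()`, the parameter counts can be estimated by hand. The sketch below uses the standard formulas and assumes default layer settings (biases enabled, a plain LSTM with four gates). I have not compared these numbers against an actual run, so treat them as an estimate; the helper names are hypothetical.

```python
def lstm_params(units, input_dim):
    """4 gates, each with input weights, recurrent weights and a bias."""
    return 4 * (units * (input_dim + units) + units)

def conv1d_params(filters, kernel_size, in_channels):
    """One kernel of size kernel_size x in_channels plus a bias per filter."""
    return filters * (kernel_size * in_channels + 1)

def dense_params(units, input_dim):
    """One weight per input plus a bias for each unit."""
    return units * (input_dim + 1)

counts = {
    'lstm_1': lstm_params(32, 1),
    'lstm_2': lstm_params(32, 32),
    'conv1d_1': conv1d_params(64, 2, 32),
    'conv1d_2': conv1d_params(192, 2, 64),
    'batch_norm': 4 * 192,   # gamma, beta, moving mean, moving variance
    'dense': dense_params(6, 192),
}
total = sum(counts.values())
print(counts)
print(total)  # 43526
```

Most of the parameters sit in the second convolution, while the pooling and reshape layers contribute none.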

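Training and results

After training, the model reaches 98.02% accuracy with a loss of 0.0058 on the training set. The training F1 score is 0.96. (I trained on a 1060 GPU, which is a fairly low-end setup. Your accuracy may differ from mine, and may well be higher; that is normal.)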

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy']) 
x_train_expanded = np.expand_dims(x_train, axis=2)
print(x_train_expanded.shape)
print(x_train.shape)
print(y_train.shape)
print( y_train_hot)
print(y_train_hot.shape)
(20334, 240, 1)
(20334, 240)
(20334,)
[[0. 0. 0. 0. 0. 1.]
 [0. 0. 0. 0. 0. 1.]
 [0. 0. 0. 0. 0. 1.]
 ...
 [0. 0. 1. 0. 0. 0.]
 [0. 0. 1. 0. 0. 0.]
 [0. 0. 1. 0. 0. 0.]]
(20334, 6)



history = model.fit(x_train_expanded,  # 3-D input (samples, 240, 1), matching the model's input_shape
                    y_train_hot,
                    batch_size=192,
                    epochs=100
                   )

Plot how the training accuracy and loss change over the epochs.

plt.figure(figsize=(6, 4)) 
plt.plot(history.history['accuracy'], 'r', label='Accuracy of training data') 
plt.plot(history.history['loss'], 'r--', label='Loss of training data') 
plt.title('Model Accuracy and Loss') 
plt.ylabel('Accuracy and Loss') 
plt.xlabel('Training Epoch') 
plt.ylim(0) 
plt.legend() 
plt.show() 
 
y_pred_train = model.predict(x_train_expanded)
max_y_pred_train = np.argmax(y_pred_train, axis=1) 
print(classification_report(y_train, max_y_pred_train))

Next we evaluate the model on the test set. Before that, the test set has to go through the same preprocessing.

df_test['X'] = (df_test['X']-df_test['X'].min())/(df_test['X'].max()-df_test['X'].min()) 
df_test['Y'] = (df_test['Y']-df_test['Y'].min())/(df_test['Y'].max()-df_test['Y'].min()) 
df_test['Z'] = (df_test['Z']-df_test['Z'].min())/(df_test['Z'].max()-df_test['Z'].min()) 
x_test, y_test = segments(df_test, 
                         TIME_PERIOD, 
                         STEP_DISTANCE, 
                         LABEL) 
 
x_test = x_test.reshape(x_test.shape[0], input_shape, 1)  # 3-D input (samples, 240, 1), matching the model's input_shape
x_test = x_test.astype('float32') 
y_test = y_test.astype('float32') 
y_test = to_categorical(y_test, num_classes)

On the test set, the model reaches 89.14% accuracy with a loss of 0.4647. The test F1 score is 0.89.

score = model.evaluate(x_test, y_test) 
print("Accuracy:", score[1]) 
print("Loss:", score[0])
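
To make the two numbers returned by `model.evaluate` less of a black box, the sketch below computes categorical cross-entropy and accuracy by hand with NumPy. The probabilities and targets are made up for illustration, and Keras additionally clips probabilities for numerical stability, which this sketch leaves out.

```python
import numpy as np

# Made-up softmax outputs for 3 samples and 3 classes, with one-hot targets.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.3, 0.4, 0.3]])
targets = np.array([[1, 0, 0],
                    [0, 1, 0],
                    [0, 0, 1]])

# Categorical cross-entropy: mean negative log-probability of the true class.
loss = -np.mean(np.sum(targets * np.log(probs), axis=1))
# Accuracy: share of samples whose most probable class is the true class.
accuracy = np.mean(np.argmax(probs, axis=1) == np.argmax(targets, axis=1))
print(round(float(loss), 4), round(float(accuracy), 4))
```

The third sample is misclassified and has a low probability on its true class, so it lowers the accuracy and dominates the loss.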

The confusion matrix below gives a better picture of the predictions on the test set.

predictions = model.predict(x_test) 
predictions = np.argmax(predictions, axis=1) 
y_test_true = np.argmax(y_test, axis=1)  # recover the integer class labels from the one-hot targets
cm = confusion_matrix(y_test_true, predictions) 
cm_disp = ConfusionMatrixDisplay(confusion_matrix=cm) 
cm_disp.plot() 
plt.show()

We can also print the classification report of the model on the test set.

print(classification_report(y_test_true, predictions))
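
The F1 scores in a classification report can be derived from a confusion matrix. Here is a minimal NumPy sketch on a made-up 3-class matrix (not the article's results), with rows as true classes and columns as predicted classes, the same orientation that scikit-learn's `confusion_matrix` uses.

```python
import numpy as np

# Made-up 3-class confusion matrix: rows are true classes, columns are predictions.
toy_cm = np.array([[8, 2, 0],
                   [1, 7, 2],
                   [0, 1, 9]])

true_pos = np.diag(toy_cm).astype(np.float64)
precision = true_pos / toy_cm.sum(axis=0)   # column sums: everything predicted as the class
recall = true_pos / toy_cm.sum(axis=1)      # row sums: everything that truly is the class
f1 = 2 * precision * recall / (precision + recall)
macro_f1 = float(f1.mean())

print(np.round(f1, 3).tolist(), round(macro_f1, 3))
```

The macro average weights every class equally, so a weak minority class pulls it down more than it would pull down plain accuracy.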
