This article is adapted from a featured project in the AI Studio community.
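PaddlePaddle Regular Competition: Smartphone Activity Recognition, third-place solution for January (score: 95)
Optimizations: added a ResNet-style residual layer, changed the optimizer, changed the learning rate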
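Competition introduction
Today's smartphones are already quite capable. What if a phone could sense our every movement and infer the intent behind it? Smartphones carry a variety of inertial sensors, which has drawn growing attention to smartphone-based human activity recognition.
The data for this competition was built from basic activities performed by volunteers carrying a smartphone. Participants are expected to build a model that predicts the activity.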
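Competition task
The experiment was carried out with 30 volunteers aged 19 to 48. Each person wore a smartphone (Samsung Galaxy S II) on the waist and performed six activities (walking, walking upstairs, walking downstairs, sitting, standing, laying). 3-axis linear acceleration and 3-axis angular velocity were captured at a constant rate of 50 Hz.
The dataset sizes are as follows:
- Training set: 8,000 records;
- Test set: 2,000 records;
The data totals about 100 MB. All files are in CSV format with comma-separated columns. To read the data with Pandas, the following code can be used as a reference: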
import pandas as pd
import numpy as np
train = pd.read_csv('train.csv.zip')
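Each record in the dataset provides the triaxial acceleration from the accelerometer (total acceleration), the estimated body acceleration, and the triaxial angular velocity from the gyroscope. Together these yield a vector of 561 features built from time-domain and frequency-domain variables.
In the test set the label field Activity is empty and must be predicted by participants.
Evaluation rules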
- Data description: participants submit the predicted Activity for every test record, in the following format:
Activity
STANDING
LAYING
WALKING
SITTING
WALKING
WALKING_DOWNSTAIRS
STANDING
- Evaluation metric: submissions are scored by accuracy, and higher is better. Reference evaluation code:
from sklearn.metrics import accuracy_score
y_pred = [0, 2, 1, 3]
y_true = [0, 1, 2, 3]
accuracy_score(y_true, y_pred)
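Baseline usage guide
1. Click the 'fork' button; the 'fork project' dialog appears
2. Click the 'create' button; the 'run project' dialog appears
3. Click 'run project'; the page redirects automatically
4. Click 'start environment'; the 'select runtime environment' dialog appears
5. Select a runtime environment (startup takes a while, please be patient); when the 'environment started successfully' dialog appears, click OK
6. Click to enter the environment and open the notebook
7. Hover over each code cell below (its left border turns light blue), then click the triangular run button at the top left of each cell in order, waiting for one cell to finish before running the next, until all cells have run
8. Download the submission.zip archive from the left side of the page
9. Submit submission.zip on the competition page and wait for the system to finish scoring; your result then appears on the leaderboard
10. Click 'Versions > Generate new version' on the left side of the page
11. Fill in the 'version name' and click the 'generate version' button; the project then shows up on your profile page (you can choose to make it public)
Data analysis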
import pandas as pd
import paddle
import numpy as np
%pylab inline
import seaborn as sns
train_df = pd.read_csv('data/data137267/train.csv.zip')
test_df = pd.read_csv('data/data137267/test.csv.zip')
Populating the interactive namespace from numpy and matplotlib
train_df.shape
(8000, 562)
train_df.columns
Index(['tBodyAcc-mean()-X', 'tBodyAcc-mean()-Y', 'tBodyAcc-mean()-Z',
'tBodyAcc-std()-X', 'tBodyAcc-std()-Y', 'tBodyAcc-std()-Z',
'tBodyAcc-mad()-X', 'tBodyAcc-mad()-Y', 'tBodyAcc-mad()-Z',
'tBodyAcc-max()-X',
...
'fBodyBodyGyroJerkMag-skewness()', 'fBodyBodyGyroJerkMag-kurtosis()',
'angle(tBodyAccMean,gravity)', 'angle(tBodyAccJerkMean),gravityMean)',
'angle(tBodyGyroMean,gravityMean)',
'angle(tBodyGyroJerkMean,gravityMean)', 'angle(X,gravityMean)',
'angle(Y,gravityMean)', 'angle(Z,gravityMean)', 'Activity'],
dtype='object', length=562)
train_df['Activity'].value_counts().plot(kind='bar')
<matplotlib.axes._subplots.AxesSubplot at 0x7f3ed77fd890>
[Figure: bar chart of record counts per Activity class (main_files/main_6_2.png); image unavailable]
plt.figure(figsize=(10, 5))
sns.boxplot(y='tBodyAcc-mean()-X', x='Activity', data=train_df)
plt.tight_layout()
[Figure: box plot of tBodyAcc-mean()-X grouped by Activity (main_files/main_7_1.png); image unavailable]
train_df['Activity'] = train_df['Activity'].map({
'LAYING': 0,
'STANDING': 1,
'SITTING': 2,
'WALKING': 3,
'WALKING_UPSTAIRS': 4,
'WALKING_DOWNSTAIRS': 5
})
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
scaler.fit(train_df.values[:, :-1])
train_df.iloc[:, :-1] = scaler.transform(train_df.values[:, :-1])
test_df.iloc[:, :] = scaler.transform(test_df.values)
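The scaling step above fits the statistics on the training features only and then reuses them for the test set. As a rough illustration of what that amounts to, here is a minimal NumPy sketch of per-column z-scoring. The helper names `fit_standardizer` and `apply_standardizer` and the toy arrays are made up for this example; the notebook itself relies on `StandardScaler`.

```python
import numpy as np


def fit_standardizer(train):
    """Return per-column mean and population standard deviation (ddof=0)."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    # Constant columns have zero spread; use 1.0 to avoid dividing by zero.
    std[std == 0.0] = 1.0
    return mean, std


def apply_standardizer(x, mean, std):
    """Scale any array with statistics that were fitted on the training set."""
    return (x - mean) / std


# Toy data (hypothetical): 3 training rows, 1 test row, 2 features.
train = np.array([[1.0, 10.0], [3.0, 10.0], [5.0, 10.0]])
test = np.array([[3.0, 12.0]])

mean, std = fit_standardizer(train)
train_z = apply_standardizer(train, mean, std)
test_z = apply_standardizer(test, mean, std)
```

The test rows are never used to compute `mean` or `std`, which is the same pattern as calling `scaler.fit` on the training features and `scaler.transform` on both sets.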
Building the model
class Classifier(paddle.nn.Layer):
    # self refers to the instance itself
    def __init__(self):
        # initialize the parent class
        super(Classifier, self).__init__()
        self.conv1 = paddle.nn.Conv1D(in_channels=1, out_channels=16, kernel_size=3, stride=1, padding=1)
        self.conv2 = paddle.nn.Conv1D(in_channels=16, out_channels=16, kernel_size=3, stride=1, padding=1)
        # conv3 is defined but never used in forward
        self.conv3 = paddle.nn.Conv1D(in_channels=32, out_channels=64, kernel_size=3)
        self.flatten = paddle.nn.Flatten()
        # dropout is defined but never used in forward
        self.dropout = paddle.nn.Dropout()
        self.bn = paddle.nn.BatchNorm1D(16)
        self.fc = paddle.nn.Linear(in_features=16*93, out_features=60)
        self.fc1 = paddle.nn.Linear(in_features=60, out_features=6)
        # the attribute is named relu, but the activation is Mish
        self.relu = paddle.nn.Mish()
        self.pool = paddle.nn.AvgPool1D(6)
        self.softmax = paddle.nn.Softmax()

    # forward pass of the network
    def forward(self, inputs):
        x = self.conv1(inputs)
        # keep the conv1 output for the residual (skip) connection
        identity = x
        # conv2 and bn are each applied twice, so their weights are shared
        x = self.conv2(x)
        x = self.bn(x)
        x = self.relu(x)
        x = self.conv2(x)
        x = self.bn(x)
        # residual addition
        x = x + identity
        x = self.relu(x)
        x = self.pool(x)
        x = self.flatten(x)
        x = self.fc(x)
        x = self.relu(x)
        x = self.fc1(x)
        x = self.relu(x)
        x = self.softmax(x)
        return x
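The value `in_features=16*93` in the first linear layer follows from the sequence length after the convolutions and the pooling layer. The sketch below re-derives it with the standard output-length formula for 1-D convolution and pooling, assuming the default pooling stride equals the kernel size and no ceil mode. The helper names are made up for this example.

```python
def conv1d_out_len(length, kernel_size, stride=1, padding=0):
    """Output length of a 1-D convolution (dilation 1)."""
    return (length + 2 * padding - kernel_size) // stride + 1


def pool1d_out_len(length, kernel_size, stride=None, padding=0):
    """Output length of 1-D pooling; stride defaults to the kernel size."""
    stride = kernel_size if stride is None else stride
    return (length + 2 * padding - kernel_size) // stride + 1


length = 561                                  # number of input features
length = conv1d_out_len(length, 3, 1, 1)      # conv1 keeps the length
length = conv1d_out_len(length, 3, 1, 1)      # conv2, first application
length = conv1d_out_len(length, 3, 1, 1)      # conv2, second application
pooled = pool1d_out_len(length, 6)            # AvgPool1D(6)
flat_features = 16 * pooled                   # 16 channels after flatten

print(length, pooled, flat_features)
```

With `kernel_size=3` and `padding=1` the convolutions preserve the length of 561, the pooling reduces it to 93, and flattening 16 channels gives 1488 inputs for `fc`.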
model = Classifier()
model.train()
opt = paddle.optimizer.Adam(learning_rate=0.0008, parameters=model.parameters())
loss_fn = paddle.nn.CrossEntropyLoss()
EPOCH_NUM = 1000  # number of epochs (outer loop iterations)
BATCH_SIZE = 512  # mini-batch size
training_data = train_df.iloc[:-1000].values.astype(np.float32)
val_data = train_df.iloc[-1000:].values.astype(np.float32)
training_data = training_data.reshape(-1, 1, 562)
val_data = val_data.reshape(-1, 1, 562)
# outer loop over epochs
for epoch_id in range(EPOCH_NUM):
    # shuffle the training data before each epoch
    np.random.shuffle(training_data)
    # split the training data into mini-batches of BATCH_SIZE records each
    mini_batches = [training_data[k:k+BATCH_SIZE] for k in range(0, len(training_data), BATCH_SIZE)]
    # inner loop over mini-batches
    for iter_id, mini_batch in enumerate(mini_batches):
        model.train()
        x = np.array(mini_batch[:, :, :-1])  # features of the current batch
        y = np.array(mini_batch[:, :, -1:])  # labels of the current batch
        features = paddle.to_tensor(x)
        y = paddle.to_tensor(y)
        predicts = model(features)
        # compute the loss
        loss = loss_fn(predicts, y.flatten().astype(int))
        avg_loss = paddle.mean(loss)
        # backpropagation: compute the gradient of every parameter
        avg_loss.backward()
        opt.step()
        # clear the gradients before the next step
        opt.clear_grad()
        # report training and validation accuracy on the first batch of every 10th epoch
        if iter_id % 2000 == 0 and epoch_id % 10 == 0:
            acc = predicts.argmax(1) == y.flatten().astype(int)
            acc = acc.astype(float).mean()
            model.eval()
            val_predict = model(paddle.to_tensor(val_data[:, :, :-1])).argmax(1)
            val_label = val_data[:, :, -1]
            val_acc = np.mean(val_predict.numpy() == val_label.flatten())
            print("epoch: {}, iter: {}, loss is: {}, acc is {} / {}".format(
                epoch_id, iter_id, avg_loss.numpy(), acc.numpy(), val_acc))
epoch: 0, iter: 0, loss is: [1.7966979], acc is [0.17578125] / 0.306
epoch: 10, iter: 0, loss is: [1.1040581], acc is [0.96289062] / 0.914
epoch: 20, iter: 0, loss is: [1.079589], acc is [0.97265625] / 0.918
epoch: 30, iter: 0, loss is: [1.066896], acc is [0.984375] / 0.919
epoch: 40, iter: 0, loss is: [1.0661179], acc is [0.984375] / 0.936
epoch: 50, iter: 0, loss is: [1.0553534], acc is [0.9921875] / 0.918
epoch: 60, iter: 0, loss is: [1.0540726], acc is [0.99414062] / 0.924
epoch: 70, iter: 0, loss is: [1.0510386], acc is [0.99414062] / 0.93
epoch: 80, iter: 0, loss is: [1.0543432], acc is [0.99023438] / 0.931
epoch: 90, iter: 0, loss is: [1.0503807], acc is [0.99414062] / 0.939
epoch: 100, iter: 0, loss is: [1.0481585], acc is [0.99609375] / 0.927
epoch: 110, iter: 0, loss is: [1.0504905], acc is [0.99414062] / 0.932
epoch: 120, iter: 0, loss is: [1.0479023], acc is [0.99609375] / 0.938
epoch: 130, iter: 0, loss is: [1.0442163], acc is [1.] / 0.929
epoch: 140, iter: 0, loss is: [1.0496444], acc is [0.99414062] / 0.939
epoch: 150, iter: 0, loss is: [1.0476862], acc is [0.99609375] / 0.94
epoch: 160, iter: 0, loss is: [1.0495868], acc is [0.99414062] / 0.94
epoch: 170, iter: 0, loss is: [1.0456569], acc is [0.99804688] / 0.94
epoch: 180, iter: 0, loss is: [1.0534363], acc is [0.99023438] / 0.941
epoch: 190, iter: 0, loss is: [1.0437151], acc is [1.] / 0.937
epoch: 200, iter: 0, loss is: [1.0495721], acc is [0.99414062] / 0.942
epoch: 210, iter: 0, loss is: [1.0456309], acc is [0.99804688] / 0.944
epoch: 220, iter: 0, loss is: [1.0514649], acc is [0.9921875] / 0.941
epoch: 230, iter: 0, loss is: [1.0475488], acc is [0.99609375] / 0.943
epoch: 240, iter: 0, loss is: [1.0514573], acc is [0.9921875] / 0.945
epoch: 250, iter: 0, loss is: [1.0534008], acc is [0.99023438] / 0.944
epoch: 260, iter: 0, loss is: [1.0475426], acc is [0.99609375] / 0.943
epoch: 270, iter: 0, loss is: [1.0494891], acc is [0.99414062] / 0.944
epoch: 280, iter: 0, loss is: [1.0475368], acc is [0.99609375] / 0.94
epoch: 290, iter: 0, loss is: [1.0436193], acc is [1.] / 0.946
epoch: 300, iter: 0, loss is: [1.0475165], acc is [0.99609375] / 0.942
epoch: 310, iter: 0, loss is: [1.045569], acc is [0.99804688] / 0.943
epoch: 320, iter: 0, loss is: [1.0436138], acc is [1.] / 0.943
epoch: 330, iter: 0, loss is: [1.0514243], acc is [0.9921875] / 0.947
epoch: 340, iter: 0, loss is: [1.0455647], acc is [0.99804688] / 0.945
epoch: 350, iter: 0, loss is: [1.0455636], acc is [0.99804688] / 0.946
epoch: 360, iter: 0, loss is: [1.0561364], acc is [0.99023438] / 0.873
epoch: 370, iter: 0, loss is: [1.0466547], acc is [0.99804688] / 0.93
epoch: 380, iter: 0, loss is: [1.049477], acc is [0.99414062] / 0.942
epoch: 390, iter: 0, loss is: [1.0436027], acc is [1.] / 0.943
epoch: 400, iter: 0, loss is: [1.0514171], acc is [0.9921875] / 0.941
epoch: 410, iter: 0, loss is: [1.0475162], acc is [0.99609375] / 0.942
epoch: 420, iter: 0, loss is: [1.0494684], acc is [0.99414062] / 0.941
epoch: 430, iter: 0, loss is: [1.0475113], acc is [0.99609375] / 0.939
epoch: 440, iter: 0, loss is: [1.0475147], acc is [0.99609375] / 0.939
epoch: 450, iter: 0, loss is: [1.0455577], acc is [0.99804688] / 0.941
epoch: 460, iter: 0, loss is: [1.0455561], acc is [0.99804688] / 0.941
epoch: 470, iter: 0, loss is: [1.0475075], acc is [0.99609375] / 0.94
epoch: 480, iter: 0, loss is: [1.0494622], acc is [0.99414062] / 0.941
epoch: 490, iter: 0, loss is: [1.0474067], acc is [0.99609375] / 0.939
epoch: 500, iter: 0, loss is: [1.0514139], acc is [0.9921875] / 0.942
epoch: 510, iter: 0, loss is: [1.0436], acc is [1.] / 0.939
epoch: 520, iter: 0, loss is: [1.0494587], acc is [0.99414062] / 0.941
epoch: 530, iter: 0, loss is: [1.045552], acc is [0.99804688] / 0.94
epoch: 540, iter: 0, loss is: [1.043597], acc is [1.] / 0.94
epoch: 550, iter: 0, loss is: [1.051412], acc is [0.9921875] / 0.942
epoch: 560, iter: 0, loss is: [1.049456], acc is [0.99414062] / 0.941
epoch: 570, iter: 0, loss is: [1.0435953], acc is [1.] / 0.943
epoch: 580, iter: 0, loss is: [1.0494558], acc is [0.99414062] / 0.94
epoch: 590, iter: 0, loss is: [1.0494573], acc is [0.99414062] / 0.942
epoch: 600, iter: 0, loss is: [1.0455493], acc is [0.99804688] / 0.942
epoch: 610, iter: 0, loss is: [1.0533609], acc is [0.99023438] / 0.943
epoch: 620, iter: 0, loss is: [1.0514091], acc is [0.9921875] / 0.942
epoch: 630, iter: 0, loss is: [1.0514086], acc is [0.9921875] / 0.941
epoch: 640, iter: 0, loss is: [1.0455501], acc is [0.99804688] / 0.941
epoch: 650, iter: 0, loss is: [1.0455483], acc is [0.99804688] / 0.942
epoch: 660, iter: 0, loss is: [1.0435947], acc is [1.] / 0.942
epoch: 670, iter: 0, loss is: [1.0475001], acc is [0.99609375] / 0.943
epoch: 680, iter: 0, loss is: [1.0514076], acc is [0.9921875] / 0.943
epoch: 690, iter: 0, loss is: [1.0455478], acc is [0.99804688] / 0.942
epoch: 700, iter: 0, loss is: [1.0514073], acc is [0.9921875] / 0.941
epoch: 710, iter: 0, loss is: [1.0455493], acc is [0.99804688] / 0.938
epoch: 720, iter: 0, loss is: [1.0533599], acc is [0.99023438] / 0.943
epoch: 730, iter: 0, loss is: [1.0455471], acc is [0.99804688] / 0.943
epoch: 740, iter: 0, loss is: [1.0455468], acc is [0.99804688] / 0.945
epoch: 750, iter: 0, loss is: [1.0474999], acc is [0.99609375] / 0.945
epoch: 760, iter: 0, loss is: [1.0474997], acc is [0.99609375] / 0.945
epoch: 770, iter: 0, loss is: [1.0474995], acc is [0.99609375] / 0.945
epoch: 780, iter: 0, loss is: [1.0474992], acc is [0.99609375] / 0.945
epoch: 790, iter: 0, loss is: [1.0435936], acc is [1.] / 0.945
epoch: 800, iter: 0, loss is: [1.0435929], acc is [1.] / 0.945
epoch: 810, iter: 0, loss is: [1.0435933], acc is [1.] / 0.945
epoch: 820, iter: 0, loss is: [1.045546], acc is [0.99804688] / 0.945
epoch: 830, iter: 0, loss is: [1.0494521], acc is [0.99414062] / 0.945
epoch: 840, iter: 0, loss is: [1.0494528], acc is [0.99414062] / 0.945
epoch: 850, iter: 0, loss is: [1.0435925], acc is [1.] / 0.945
epoch: 860, iter: 0, loss is: [1.047499], acc is [0.99609375] / 0.945
epoch: 870, iter: 0, loss is: [1.0455458], acc is [0.99804688] / 0.945
epoch: 880, iter: 0, loss is: [1.0435927], acc is [1.] / 0.945
epoch: 890, iter: 0, loss is: [1.0474988], acc is [0.99609375] / 0.945
epoch: 900, iter: 0, loss is: [1.0455456], acc is [0.99804688] / 0.945
epoch: 910, iter: 0, loss is: [1.0455455], acc is [0.99804688] / 0.945
epoch: 920, iter: 0, loss is: [1.0435923], acc is [1.] / 0.945
epoch: 930, iter: 0, loss is: [1.0455456], acc is [0.99804688] / 0.945
epoch: 940, iter: 0, loss is: [1.0474986], acc is [0.99609375] / 0.945
epoch: 950, iter: 0, loss is: [1.0494516], acc is [0.99414062] / 0.945
epoch: 960, iter: 0, loss is: [1.0474982], acc is [0.99609375] / 0.945
epoch: 970, iter: 0, loss is: [1.0455455], acc is [0.99804688] / 0.946
epoch: 980, iter: 0, loss is: [1.0455455], acc is [0.99804688] / 0.945
epoch: 990, iter: 0, loss is: [1.0537219], acc is [0.99023438] / 0.927
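The loss in the log above stalls near 1.0436 even when batch accuracy reaches 1.0. A plausible explanation, offered here as a reading of the code rather than something the original author states, is that `forward` already ends with a softmax, while `paddle.nn.CrossEntropyLoss` applies softmax again by default (`use_softmax=True`). The loss then sees values in [0, 1] instead of logits, so it cannot approach zero. The pure-Python sketch below computes the smallest reachable value for 6 classes under that assumption; the function name is made up for this example.

```python
import math


def cross_entropy_from_logits(logits, label):
    """Softmax cross-entropy for one sample, using a stable log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum_exp - logits[label]


num_classes = 6
# Best case for the model: its own softmax output is a perfect one-hot vector.
perfect_probs = [1.0] + [0.0] * (num_classes - 1)
# The loss treats those probabilities as logits and applies softmax again.
loss_floor = cross_entropy_from_logits(perfect_probs, 0)

print(round(loss_floor, 4))  # 1.0436
```

The floor equals log(e + 5) - 1, which matches the plateau in the log. Returning raw logits from `forward` (without the final activation and softmax) would remove this floor. That is a suggestion only and was not tested in the run shown above.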
model.eval()
test_data = paddle.to_tensor(test_df.values.reshape(-1, 1, 561).astype(np.float32))
test_predict = model(test_data)
test_predict = test_predict.argmax(1).numpy()
test_predict = pd.DataFrame({'Activity': test_predict})
test_predict['Activity'] = test_predict['Activity'].map({
0:'LAYING',
1:'STANDING',
2:'SITTING',
3:'WALKING',
4:'WALKING_UPSTAIRS',
5:'WALKING_DOWNSTAIRS'
})
test_predict.to_csv('submission.csv', index=None)
!zip submission.zip submission.csv
updating: submission.csv (deflated 93%)
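The label dictionary used for training and the one used to build the submission must be exact inverses of each other, otherwise predictions are silently written under the wrong activity names. One way to rule that out, shown here as a small sketch rather than as part of the original solution, is to derive the inverse mapping from the forward one. The names `label_to_id`, `id_to_label` and `predicted_ids` are made up for this example.

```python
# Forward mapping, as used when encoding the training labels.
label_to_id = {
    'LAYING': 0,
    'STANDING': 1,
    'SITTING': 2,
    'WALKING': 3,
    'WALKING_UPSTAIRS': 4,
    'WALKING_DOWNSTAIRS': 5,
}

# Inverse mapping derived from the forward one, so the two cannot drift apart.
id_to_label = {index: name for name, index in label_to_id.items()}

# Hypothetical class indices, standing in for the argmax of the model output.
predicted_ids = [1, 0, 3, 5]
predicted_labels = [id_to_label[i] for i in predicted_ids]

print(predicted_labels)
```

A pandas column of predicted indices could be decoded the same way by passing `id_to_label` to `.map`.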
## Ideas for improving the score
1. Add residual structure to the model, following ResNet.
2. Apply data augmentation, for example by adding noise.
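The noise-based augmentation in item 2 could look roughly like the NumPy sketch below. It appends copies of the standardized feature matrix with Gaussian noise added and repeats the labels. The function name, the `noise_std` value of 0.05 and the number of copies are assumptions chosen for illustration; they were not tuned on this competition.

```python
import numpy as np


def augment_with_noise(features, labels, noise_std=0.05, copies=1, seed=0):
    """Return the original rows plus `copies` noisy duplicates of them."""
    rng = np.random.default_rng(seed)
    all_features = [features]
    all_labels = [labels]
    for _ in range(copies):
        # Zero-mean Gaussian noise, independent for every feature value.
        noise = rng.normal(0.0, noise_std, size=features.shape)
        all_features.append(features + noise)
        all_labels.append(labels)  # noise does not change the label
    return np.concatenate(all_features, axis=0), np.concatenate(all_labels, axis=0)


# Toy stand-in for the standardized training matrix (4 rows, 561 features).
x = np.zeros((4, 561), dtype=np.float32)
y = np.array([0, 1, 2, 3])

x_aug, y_aug = augment_with_noise(x, y, noise_std=0.05, copies=2)
print(x_aug.shape, y_aug.shape)
```

Augmentation of this kind should be applied to the training split only, after the validation rows have been set aside, so that validation accuracy still reflects unmodified data.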