Problem Overview
This is a classification task: speech is converted into MFCC frames, and each frame must be recognized as a phoneme.
2023 ML course site: ML 2023 Spring (ntu.edu.tw)
Baseline code provided with the assignment: ML2023Spring - HW2 - Colaboratory (google.com)
Kaggle submission: ML2023Spring-hw2 | Kaggle
The links above may require a VPN to access.
Solution
A fairly tedious assignment: brute force gets it done. One complaint: LSTM training is slow, with GPU utilization stuck around 10%, and trying to run things in parallel blows up memory (concat_nframes stacks too many frames together).
simple baseline (0.49798)
Just run the provided code as-is.
medium baseline (0.66440)
After some experiments, increasing concat_nframes alone stopped bringing gains. At the time I assumed 21 frames was the limit of useful context, but the real reason was that the model was too simple to make use of information from more distant frames.
To feed the model more context, you have to increase model capacity at the same time.
The lesson: extra input only confuses a simple model, while a more complex model can turn the extra information into higher accuracy.
concat_nframes | hidden_layers | activation | batch size (default 512) | extra layers | linear dim (default 64) | RNN layer | epochs (default 10) | private score
---|---|---|---|---|---|---|---|---
3 | 2 | | | | | | | 0.49706
19 | 2 | | | | | | | 0.60361
21 | 4 | | | | | | | 0.59865
31 | 4 | LeakyReLU(0.1) | | batchnorm | 256 | | | 0.69499
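The observation above, that more context frames only help a capable model, rests on how the features are built: each frame is concatenated with its neighbors. A hypothetical NumPy sketch of the concat_nframes idea (the starter code does this in PyTorch; the function name and the edge-padding scheme here are my assumptions):

```python
import numpy as np

def concat_frames(feats: np.ndarray, concat_nframes: int) -> np.ndarray:
    """Stack each frame with its neighbors: (T, D) -> (T, concat_nframes * D).

    concat_nframes must be odd; edges are padded by repeating the
    boundary frames (same spirit as the HW2 starter code).
    """
    assert concat_nframes % 2 == 1
    half = concat_nframes // 2
    T, D = feats.shape
    # pad by repeating the first/last frame `half` times
    padded = np.concatenate([np.repeat(feats[:1], half, axis=0),
                             feats,
                             np.repeat(feats[-1:], half, axis=0)], axis=0)
    # window k of the stack holds padded[k : k + T], i.e. offset k - half
    return np.stack([padded[t:t + T] for t in range(concat_nframes)],
                    axis=1).reshape(T, concat_nframes * D)

x = np.arange(12, dtype=np.float32).reshape(4, 3)  # 4 frames, 3-dim toy MFCC
y = concat_frames(x, 3)
print(y.shape)  # (4, 9)
```

With concat_nframes=31 and 39-dim MFCCs, each training example becomes a 31 × 39 = 1209-dim vector, which is why memory grows quickly.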
strong baseline (0.74944)
Reaching the strong baseline only takes more epochs and a more complex model. After hitting it I kept pushing the same idea, adding still more model complexity, but the gains dried up. Adding a single BiLSTM layer did not help much either.
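As a concrete reference for the table rows, a minimal PyTorch sketch of the kind of model these experiments use, with Linear → BatchNorm → LeakyReLU(0.1) → Dropout hidden blocks. The 39-dim MFCC features and 41 phoneme classes come from the HW2 setup; the class names and exact wiring are my reconstruction, not the homework code:

```python
import torch
import torch.nn as nn

class BasicBlock(nn.Module):
    # one hidden block: Linear -> BatchNorm -> LeakyReLU(0.1) -> Dropout
    def __init__(self, in_dim: int, out_dim: int, p_drop: float = 0.25):
        super().__init__()
        self.block = nn.Sequential(
            nn.Linear(in_dim, out_dim),
            nn.BatchNorm1d(out_dim),
            nn.LeakyReLU(0.1),
            nn.Dropout(p_drop),
        )

    def forward(self, x):
        return self.block(x)

class Classifier(nn.Module):
    def __init__(self, input_dim, hidden_layers=7, hidden_dim=1024, n_classes=41):
        super().__init__()
        layers = [BasicBlock(input_dim, hidden_dim)]
        layers += [BasicBlock(hidden_dim, hidden_dim) for _ in range(hidden_layers - 1)]
        layers.append(nn.Linear(hidden_dim, n_classes))  # output logits
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)

model = Classifier(input_dim=31 * 39)  # 31 concatenated 39-dim MFCC frames
out = model(torch.randn(8, 31 * 39))
print(out.shape)  # torch.Size([8, 41])
```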
concat_nframes | hidden_layers | activation | batch size (default 512) | extra layers | linear dim (default 64) | RNN layer | epochs (default 10) | private score
---|---|---|---|---|---|---|---|---
31 | 5 | LeakyReLU(0.1) | | dropout(0.3)+batchnorm | 1024 | | 30 | 0.7535
31 | 7 | LeakyReLU(0.1) | | dropout(0.3)+batchnorm | 1024 | | 30 | 0.75241
31 | 5 | LeakyReLU(0.1) | | dropout(0.3)+batchnorm | pyramid 1024→128 | | 30 | ~0.73
31 | 7 | LeakyReLU(0.1) | | dropout(0.25)+batchnorm | 1024 | | 30 | 0.75562
31 | 7 | LeakyReLU(0.1) | | dropout(0.25)+batchnorm | 1024 | BiLSTM 2×128 | 10 | 0.74375
31 | 7 | LeakyReLU(0.1) | | dropout(0.25)+batchnorm | 1024 | BiLSTM 2×512 | 10 | 0.74317
31 | 7 | LeakyReLU(0.1) | | dropout(0.45)+batchnorm | 1024 | BiLSTM 2×512 | 10 |
31 | 5 | LeakyReLU(0.1) | | dropout(0.5)+batchnorm | 1024 | BiLSTM 2×512 | 10 | 0.74539
21 | 5 | LeakyReLU(0.1) | | dropout(0.5)+batchnorm | 1024 | BiLSTM 2×512 | 10 | 0.73641
31 | 5 | LeakyReLU(0.1) | | dropout(0.5)+batchnorm | 1024 | BiLSTM 2×512 + dropout | 10 | 0.74426
31 | 3 | | | dropout(0.5)+batchnorm | 1024 | BiLSTM 2×512 + dropout, mean over frames | 10 | 0.735
31 | 3 | | | dropout(0.5)+batchnorm | 1024 | BiLSTM 2×512 + dropout, center frame | 10 | 0.74763
31 | 3 | | | dropout(0.5)+batchnorm | 1024 | BiLSTM 2×512 | 10 | 0.749
31 | 10 | | | dropout(0.5)+batchnorm | 1024 | BiLSTM 2×512 | 10 | 0.73325
boss baseline (0.83017)
It turned out the problem was the mirror image of the medium baseline: a complex model with overly simple input also hits a ceiling. The fix was simply more training (the table below doesn't record epochs, but the bottom row with nframes=81 was trained for 20 epochs) plus an even more complex model.
One pitfall: early on, TrainAcc and ValidAcc can look like classic overfitting, with a stretch where TrainAcc keeps rising while ValidAcc drops. Don't stop; keep training and ValidAcc will climb back up, with some fluctuation along the way.
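A simple way to act on this: keep training to the end and just checkpoint the best-ValidAcc epoch, rather than early-stopping at the first dip. A schematic sketch (the validation curve is made up purely to illustrate the dip-then-recover pattern):

```python
import copy  # for copy.deepcopy in the commented checkpoint line

# made-up validation-accuracy curve: dips mid-training, then recovers
# past the old peak -- stopping at the dip would lose the best epoch
valid_curve = [0.70, 0.72, 0.73, 0.71, 0.70, 0.74, 0.76, 0.75, 0.77]

best_acc, best_epoch = 0.0, -1
for epoch, valid_acc in enumerate(valid_curve):
    # real loop: train_one_epoch(model); valid_acc = validate(model)
    if valid_acc > best_acc:
        best_acc, best_epoch = valid_acc, epoch
        # best_state = copy.deepcopy(model.state_dict())  # checkpoint here

print(best_epoch, best_acc)  # 8 0.77
```

Early stopping with a small patience would have quit around epoch 4 here; tracking the running best keeps the later, higher peak.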
nframes | score | rnn_layers | hidden_layers | hidden_dim / rnn_dim |
---|---|---|---|---|
51 | 0.75909 | 1 | 4 | 1024/512 |
51 | 0.77988 | 1 | 4 | 2048/1024 |
51 | 0.78554 | 2 | 4 | 2048/1024 |
81 | 0.82639 | 5 | 4 | 1024/512 |
Just keep training; about 50 epochs.
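For reference, a hypothetical sketch of what the "BiLSTM 2×512" rows describe: the concatenated input is reshaped back into a sequence of 39-dim frames, run through a 2-layer bidirectional LSTM, and the center frame's output is classified (the "center frame" variant in the tables). The structure and defaults are my reconstruction under the HW2 setup, not the exact homework code:

```python
import torch
import torch.nn as nn

class BiLSTMClassifier(nn.Module):
    def __init__(self, concat_nframes=31, feat_dim=39, rnn_dim=512,
                 rnn_layers=2, n_classes=41, p_drop=0.5):
        super().__init__()
        self.concat_nframes = concat_nframes
        self.feat_dim = feat_dim
        # dropout here is applied between stacked LSTM layers
        self.lstm = nn.LSTM(feat_dim, rnn_dim, num_layers=rnn_layers,
                            batch_first=True, bidirectional=True,
                            dropout=p_drop)
        self.head = nn.Linear(2 * rnn_dim, n_classes)  # 2x for both directions

    def forward(self, x):                              # x: (B, concat_nframes * feat_dim)
        x = x.view(-1, self.concat_nframes, self.feat_dim)  # back to a frame sequence
        out, _ = self.lstm(x)                          # (B, T, 2 * rnn_dim)
        center = out[:, self.concat_nframes // 2]      # classify the center frame
        return self.head(center)

model = BiLSTMClassifier()
logits = model(torch.randn(4, 31 * 39))
print(logits.shape)  # torch.Size([4, 41])
```

The "mean over frames" rows would replace the center-frame selection with `out.mean(dim=1)`; per the table, that scored slightly worse here.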
Process Log
Experiment records from before the boss baseline, for anyone interested; the boss-baseline runs are in the table above.
concat_nframes | hidden_layers | activation | batch size (default 512) | extra layers | linear dim (default 64) | RNN layer | epochs (default 10) | private score
---|---|---|---|---|---|---|---|---
3 | 2 | | | | | | | 0.49706
5 | 2 | | | | | | | 0.53473
7 | 2 | | | | | | | 0.55846
9 | 2 | | | | | | | 0.57239
13 | 2 | | | | | | | 0.59363
17 | 2 | | | | | | | 0.60141
19 | 2 | | | | | | | 0.60361
23 | 2 | | | | | | | 0.6071
31 | 2 | | | | | | | 0.6081
31 | 4 | | | | | | | 0.59882
21 | 4 | | | | | | | 0.59865
31 | 4 | LeakyReLU(0.01) | | | | | | 0.60042
31 | 3 | LeakyReLU(0.01) | | | | | | 0.60277
31 | 3 | LeakyReLU(0.1) | | | | | | 0.6061
31 | 3 | LeakyReLU(0.01) | | | 1024 | | | 0.5933
31 | 3 | LeakyReLU(0.1) | | | 128 | | |
31 | 3 | LeakyReLU(0.1) | | batchnorm | | | | 0.63354
31 | 3 | LeakyReLU(0.1) | | dropout(0.2)+batchnorm | | | | 0.57017
31 | 3 | LeakyReLU(0.1) | | batchnorm | 128 | | | 0.67084
31 | 3 | LeakyReLU(0.1) | | batchnorm | 256 | | | 0.69201
31 | 4 | LeakyReLU(0.1) | | batchnorm | 256 | | | 0.69499
31 | 4 | LeakyReLU(0.1) | | batchnorm | 512 | | | 0.69832
31 | 4 | LeakyReLU(0.1) | | dropout(0.2)+batchnorm | 512 | | | 0.71315
31 | 4 | LeakyReLU(0.1) | | dropout(0.4)+batchnorm | 512 | | | 0.67169
31 | 4 | LeakyReLU(0.1) | | dropout(0.4)+batchnorm | 1024 | | | 0.70874
31 | 4 | LeakyReLU(0.1) | | dropout(0.3)+batchnorm | 1024 | | | 0.72534
31 | 5 | LeakyReLU(0.1) | | dropout(0.3)+batchnorm | 1024 | | | 0.72484
31 | 5 | LeakyReLU(0.1) | | dropout(0.3)+batchnorm | 1024 | | 15 | 0.73771
31 | 5 | LeakyReLU(0.1) | | dropout(0.3)+batchnorm | 1024 | | 20 | 0.74509
31 | 5 | LeakyReLU(0.1) | | dropout(0.3)+batchnorm | 1024 | | 30 | 0.7535
31 | 7 | LeakyReLU(0.1) | | dropout(0.3)+batchnorm | 1024 | | 30 | 0.75241
31 | 5 | LeakyReLU(0.1) | | dropout(0.3)+batchnorm | pyramid 1024→128 | | 30 | ~0.73
31 | 7 | LeakyReLU(0.1) | | dropout(0.25)+batchnorm | 1024 | | 30 | 0.75562
31 | 7 | LeakyReLU(0.1) | | dropout(0.25)+batchnorm | 1024 | BiLSTM 2×128 | 10 | 0.74375
31 | 7 | LeakyReLU(0.1) | | dropout(0.25)+batchnorm | 1024 | BiLSTM 2×512 | 10 | 0.74317
31 | 7 | LeakyReLU(0.1) | | dropout(0.45)+batchnorm | 1024 | BiLSTM 2×512 | 10 |
31 | 5 | LeakyReLU(0.1) | | dropout(0.5)+batchnorm | 1024 | BiLSTM 2×512 | 10 | 0.74539
21 | 5 | LeakyReLU(0.1) | | dropout(0.5)+batchnorm | 1024 | BiLSTM 2×512 | 10 | 0.73641
31 | 5 | LeakyReLU(0.1) | | dropout(0.5)+batchnorm | 1024 | BiLSTM 2×512 + dropout | 10 | 0.74426
31 | 3 | | | dropout(0.5)+batchnorm | 1024 | BiLSTM 2×512 + dropout, mean over frames | 10 | 0.735
31 | 3 | | | dropout(0.5)+batchnorm | 1024 | BiLSTM 2×512 + dropout, center frame | 10 | 0.74763
31 | 3 | | | dropout(0.5)+batchnorm | 1024 | BiLSTM 2×512 | 10 | 0.749
31 | 10 | | | dropout(0.5)+batchnorm | 1024 | BiLSTM 2×512 | 10 | 0.73325
Summary
More model complexity and more epochs: that's all it takes.
Open-Source Code
Baidu Netdisk:
Link: https://pan.baidu.com/s/1kDSS9ptY8Vojb_y5zOQDrA?pwd=ncm3
Extraction code: ncm3