Project referenced: https://github.com/SMART-TTS/SMART-Single_Emotional_TTS
Audio sample data: LJSpeech-1.1
Sample metadata format:
LJ_NOR_10001.wav|the chronicles of newgate, volume two. by arthur griffiths. section eight: the beginnings of prison reform.
LJ_NOR_10002.wav|newgate prisoners were the victims to another most objectionable practice which obtained all over london.
LJ_NOR_10003.wav|persons committed to a metropolitan jail at that time were taken in gangs, men and women handcuffed together, or linked on to a long chain,
LJ_NOR_10004.wav|unless they could afford to pay for a vehicle out of their own funds.
Error: Assertion srcIndex < srcSelectDimSize failed.
(emo_tts3) D:\workspace_tts\emotion-fs-3>python train_transformer.py
Trainable Parameters: 15.927M
C:\Users\fangg\Anaconda3\envs\emo_tts3\lib\site-packages\torch\nn\modules\loss.py:94: UserWarning: Using a target size (torch.Size([8])) that is different to the input size (torch.Size([8, 1])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.
return F.l1_loss(input, target, reduction=self.reduction)
| Epoch: 0, 0/330th loss : 0.9549 + 1.1547 + 0.0280 + 4.0505 = 1.2376
Validation| loss : 1.0039 + 1.2023 + 0.0212 + 3.8226 = 6.0500
| Epoch: 0, 1/330th loss : 0.9634 + 1.1604 + 0.0290 + 3.7260 = 1.1758
| Epoch: 0, 2/330th loss : 0.9589 + 1.1530 + 0.0286 + 3.8567 = 1.1994
| Epoch: 0, 3/330th loss : 0.9564 + 1.1508 + 0.0285 + 3.7279 = 1.1727
.
.
.
C:/w/b/windows/pytorch/aten/src/ATen/native/cuda/Indexing.cu:658: block: [38,0,0], thread: [32,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
C:/w/b/windows/pytorch/aten/src/ATen/native/cuda/Indexing.cu:658: block: [38,0,0], thread: [33,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
.
.
.
Traceback (most recent call last):
File "train_transformer.py", line 261, in <module>
main()
File "train_transformer.py", line 223, in main
mel_pred, postnet_pred, attn_probs, decoder_outputs, attns_enc, attns_dec, attns_style, post_linear, duration_predictor_output, duration, weights = m.forward(character, mel_input, pos_text, pos_mel, mel, pos_mel, mel_max_length_array=mel_max_length_array)
File "C:\Users\fangg\Anaconda3\envs\emo_tts3\lib\site-packages\torch\nn\parallel\data_parallel.py", line 159, in forward
return self.module(*inputs[0], **kwargs[0])
File "C:\Users\fangg\Anaconda3\envs\emo_tts3\lib\site-packages\torch\nn\modules\module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "D:\workspace_tts\emotion-fs-3\network.py", line 288, in forward
memory, c_mask, attns_enc, duration_mask = self.encoder(characters, pos=pos_text)
File "C:\Users\fangg\Anaconda3\envs\emo_tts3\lib\site-packages\torch\nn\modules\module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "D:\workspace_tts\emotion-fs-3\network.py", line 106, in forward
x, attn = layer(x, x, mask=mask, query_mask=c_mask)
File "C:\Users\fangg\Anaconda3\envs\emo_tts3\lib\site-packages\torch\nn\modules\module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "D:\workspace_tts\emotion-fs-3\module.py", line 289, in forward
result, attns = self.multihead(key, value, query, mask=mask, query_mask=query_mask, kv_mask=kv_mask)
File "C:\Users\fangg\Anaconda3\envs\emo_tts3\lib\site-packages\torch\nn\modules\module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "D:\workspace_tts\emotion-fs-3\module.py", line 212, in forward
attn = t.bmm(query, key.transpose(1, 2)) #batch matrix-matrix product
RuntimeError: CUDA error: device-side assert triggered
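A note on reading this log: the assertion comes from a GPU indexing kernel (Indexing.cu) and fires when an index is not smaller than the size of the dimension being indexed, for example a symbol ID that is too large for an embedding table. CUDA reports such errors asynchronously, so the Python line shown in the traceback (t.bmm) is not necessarily where the bad index was used; setting the environment variable CUDA_LAUNCH_BLOCKING=1 or running the same batch on the CPU usually gives a more precise location. A minimal sketch of the range check itself follows; find_out_of_range and the toy data are hypothetical and not part of the project.

```python
def find_out_of_range(batches, num_embeddings):
    """Return (batch_idx, position, value) for every ID outside [0, num_embeddings)."""
    bad = []
    for b, seq in enumerate(batches):
        for i, idx in enumerate(seq):
            if not 0 <= idx < num_embeddings:
                bad.append((b, i, idx))
    return bad


# Toy data (hypothetical): a table with 5 rows accepts IDs 0..4 only.
example = [[1, 2, 4], [0, 5, 3]]
print(find_out_of_range(example, num_embeddings=5))  # -> [(1, 1, 5)]
```

Running a check like this on the integer sequences produced by data preparation, before they reach the GPU, points at the offending sample instead of an opaque device-side assert.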
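Problem description: After running python train_transformer.py in this project, sometimes the error above is raised, and sometimes the process simply hangs and then exits without printing anything (the underlying problem is the same: Assertion srcIndex < srcSelectDimSize failed.). I first tried changing some of the main parameters in hyperparams.py, after reading many articles about this error online, but that had no effect. What finally gave me the idea was another author's article on "no cuda capable device", which said that an index into his vocabulary was wrong. That would fit this assertion, since it fires when an index used for a lookup is out of range. I had no way of knowing what his vocabulary looked like, but it occurred to me that my metadata_train.csv contained a lot of punctuation marks. I assumed punctuation was of little use for training, so the problem was probably there. In the end I removed all the punctuation and started over: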
python prepare_data.py
python train_transformer.py
……
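To my surprise, that worked. I had spent a long time on this problem, and if it had still failed I was going to ask in the original project directly. Worth writing down.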
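The punctuation removal described above can be sketched as follows. This is only an illustration under assumptions: strip_punctuation is a hypothetical helper, it assumes every line has the "file.wav|text" form shown in the sample at the top, and it removes ASCII punctuation only (including apostrophes, so "don't" would become "dont").

```python
import string

# Translation table that deletes every ASCII punctuation character.
_DELETE_PUNCT = str.maketrans("", "", string.punctuation)


def strip_punctuation(line):
    """Remove ASCII punctuation from the text field of a 'file.wav|text' line."""
    # Split on the first '|' only, so the separator itself is preserved.
    name, sep, text = line.rstrip("\n").partition("|")
    cleaned = text.translate(_DELETE_PUNCT)
    cleaned = " ".join(cleaned.split())  # collapse spaces left behind
    return name + sep + cleaned


sample = "LJ_NOR_10004.wav|unless they could afford to pay for a vehicle out of their own funds."
print(strip_punctuation(sample))
```

Applying this to each line of metadata_train.csv and then rerunning python prepare_data.py reproduces the steps above. An alternative worth considering, if the punctuation should be kept, is to make sure every character in the transcripts has an entry in the symbol table the model's embedding is built from; whether that applies to this project was not checked here.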