CommonGen: A Constrained Text Generation Challenge for Generative Commonsense Reasoning: Troubleshooting Notes from Running the Code
Table of Contents
- CommonGen: A Constrained Text Generation Challenge for Generative Commonsense Reasoning: Troubleshooting Notes from Running the Code
- gnutls_handshake() failed: The TLS connection was non-properly terminated.
- ImportError: numpy.core.multiarray failed to import
- APEX installation problems
- csrc/layer_norm_cuda_kernel.cu:4:10: fatal error: ATen/cuda/DeviceUtils.cuh: No such file or directory
- AttributeError: can't set attribute
- Mismatched pytorch-lightning version causes many fields to go missing from checkpoints and related objects
- CUDA unavailable: torch.cuda.is_available() returns False
- AttributeError: module 'torch.cuda' has no attribute 'amp'
- Fixing Python UnicodeDecodeError: 'ascii' codec can't decode byte ...
- On running finetune.py for seq2seq, the following error comes up: optimizer_step() got an unexpected keyword argument 'using_native_amp'
- RuntimeError: CUDA out of memory. Tried to allocate 16.00 MiB (GPU 0; 2.00 GiB total capacity; 1.34 GiB already allocated; 14.76 MiB free; 1.38 GiB reserved in total by PyTorch)
- Running BERT: No module named 'transformers.modeling_bert'
gnutls_handshake() failed: The TLS connection was non-properly terminated.
ImportError: numpy.core.multiarray failed to import
https://blog.csdn.net/weixin_41765699/article/details/81780722
APEX installation problems
https://zhuanlan.zhihu.com/p/80386137
csrc/layer_norm_cuda_kernel.cu:4:10: fatal error: ATen/cuda/DeviceUtils.cuh: No such file or directory
Solution (a Dockerfile step that pins APEX to a specific commit):
RUN cd /home/ && git clone https://github.com/NVIDIA/apex.git apex && cd apex && git reset --hard 3fe10b5597ba14a748ebb271a6ab97c09c5701ac && pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
References:
https://github.com/NVIDIA/semantic-segmentation/issues/168
https://github.com/NVIDIA/apex/issues/1043
AttributeError: can't set attribute
Bug details:
Solution:
https://github.com/PyTorchLightning/pytorch-lightning/discussions/7525
Following that discussion resolved the problem.
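Mismatched pytorch-lightning version causes many fields to go missing from checkpoints and related objects
After rolling back through a series of versions and tracing the bugs each one produced, I concluded that the project was probably written against version 0.9.0. Installing that version fixed the problem: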
pip install pytorch_lightning==0.9.0
https://github.com/huggingface/transformers/issues/7782
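Before retraining, it can help to confirm which pytorch-lightning version is actually installed in the active environment. The following is a minimal sketch using only the standard library; the helper names `installed_version` and `matches_expected` are made up for illustration, and the distribution name `pytorch-lightning` is assumed to be how the package is registered in your environment.

```python
from importlib import metadata

EXPECTED = "0.9.0"  # the version these notes settled on


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def matches_expected(dist_name="pytorch-lightning", expected=EXPECTED):
    """True only when the distribution is installed at exactly the expected version."""
    return installed_version(dist_name) == expected


print("pytorch-lightning installed:", installed_version("pytorch-lightning"))
print("matches", EXPECTED, ":", matches_expected())
```

If the second line prints False, the environment running the script is not the one where the pinned install was done, or a later install overwrote it.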
CUDA unavailable: torch.cuda.is_available() returns False
Run nvcc -V to check the CUDA version.
The machine has CUDA 10.1, so look up the matching install command on the PyTorch website:
conda install pytorch==1.4.0 torchvision==0.5.0 cudatoolkit=10.1 -c pytorch
Tutorial: https://blog.csdn.net/qq_41997920/article/details/105090212
This project, however, needs the following versions to run without errors:
conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=10.1 -c pytorch
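After installing, a quick check from Python shows whether the PyTorch build and the GPU line up. This is a small sketch rather than part of the project; `cuda_report` is a hypothetical helper name, and the import is guarded so the snippet also runs where PyTorch is not installed.

```python
def cuda_report():
    """Collect the facts needed to diagnose torch.cuda.is_available() returning False."""
    try:
        import torch
    except ImportError:
        return {"torch_installed": False}
    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # CUDA version this PyTorch build was compiled against; None for CPU-only builds
        "built_with_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }


print(cuda_report())
```

A `built_with_cuda` value of None points to a CPU-only build, while a version that does not match the machine's CUDA installation suggests reinstalling with the matching cudatoolkit.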
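AttributeError: module 'torch.cuda' has no attribute 'amp'
torch.cuda.amp was introduced in PyTorch 1.6, so older installations do not have it. Installing PyTorch 1.7.1 fixes the error:
conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=10.1 -c pytorch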
https://blog.csdn.net/qq_34211771/article/details/120625282
Fixing Python UnicodeDecodeError: 'ascii' codec can't decode byte ...
Pass the encoding explicitly when opening the file:
f2 = open(path, 'r', encoding='utf-8')
https://blog.csdn.net/weixin_40005542/article/details/110078373
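To see why the explicit encoding matters, the sketch below reproduces the error on a throwaway file and then reads the same file correctly. The sample text and the temporary file are stand-ins for the real data file, not anything from the project.

```python
import os
import tempfile

text = "常识推理 commonsense"

# Write a small UTF-8 file to stand in for the dataset file.
with tempfile.NamedTemporaryFile("w", encoding="utf-8", suffix=".txt", delete=False) as tmp:
    tmp.write(text)
    path = tmp.name

try:
    # Decoding as ASCII reproduces the error from this section.
    try:
        with open(path, "r", encoding="ascii") as f:
            f.read()
        ascii_failed = False
    except UnicodeDecodeError:
        ascii_failed = True

    # Passing encoding="utf-8" explicitly reads the same file correctly.
    with open(path, "r", encoding="utf-8") as f:
        recovered = f.read()
finally:
    os.remove(path)

print("ASCII decode failed:", ascii_failed)
print("UTF-8 read matches:", recovered == text)
```

When no encoding is given, open() uses the platform default, which is ASCII under some locale settings; that explains why the same script can work on one machine and fail on another.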
On running finetune.py for seq2seq, the following error comes up: optimizer_step() got an unexpected keyword argument 'using_native_amp'
def optimizer_step(self, epoch, batch_idx, optimizer, optimizer_idx, second_order_closure=None, using_native_amp=None):
Removing the last parameter (using_native_amp) fixed it.
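The message itself is Python's generic TypeError for a call that passes a keyword the receiving function does not declare, so it indicates that the hook signature in the project and the one expected by the installed pytorch-lightning disagree. The sketch below shows one way to check whether a function accepts a given keyword. `accepts_keyword` and the two `step_*` functions are hypothetical stand-ins written for this example, not library code.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if func can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    if name in params and params[name].kind in keyword_kinds:
        return True
    # A **kwargs catch-all accepts any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Hypothetical stand-ins for two variants of the hook.
def step_without_flag(epoch, batch_idx, optimizer, optimizer_idx, second_order_closure=None):
    return "ok"


def step_with_flag(epoch, batch_idx, optimizer, optimizer_idx,
                   second_order_closure=None, using_native_amp=None):
    return "ok"


print(accepts_keyword(step_without_flag, "using_native_amp"))
print(accepts_keyword(step_with_flag, "using_native_amp"))
```

Applied to the real class, the same check could be run as `accepts_keyword(YourModule.optimizer_step, "using_native_amp")`, where `YourModule` is whatever LightningModule subclass the project defines.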
RuntimeError: CUDA out of memory. Tried to allocate 16.00 MiB (GPU 0; 2.00 GiB total capacity; 1.34 GiB already allocated; 14.76 MiB free; 1.38 GiB reserved in total by PyTorch)
Reducing batch_size did not help (the GPU has only 2 GiB of memory), so I switched to running on the CPU.
https://www.cnblogs.com/yifanrensheng/p/13381931.html
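One way to force a CPU run without editing the model code is to hide the GPUs from CUDA before PyTorch is imported. The sketch below assumes the training script chooses its device from torch.cuda.is_available(); a script that requests GPUs explicitly (for example through a Trainer `gpus` setting) would need that setting changed as well. The torch import is guarded so the snippet runs even where PyTorch is absent.

```python
import os

# Must be set before the first `import torch`; an empty value hides every GPU from CUDA.
os.environ["CUDA_VISIBLE_DEVICES"] = ""

try:
    import torch
    cuda_visible = torch.cuda.is_available()
except ImportError:
    # Without PyTorch installed there is no GPU backend to consider.
    cuda_visible = False

device_name = "cuda" if cuda_visible else "cpu"
print("running on:", device_name)
```

The same effect is available from the shell by setting the CUDA_VISIBLE_DEVICES environment variable to an empty value for the process that launches the training script.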