Traffic Graph Convolutional RNN: A Deep Learning Framework for Network-Scale (code debugging notes)

Yukee_

Tags: deep learning, rnn, artificial intelligence

Article link: https://blog.csdn.net/Yukee_/article/details/135602547

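This article describes the problems I ran into while setting up a Python 3.7.0 and PyTorch 1.7.1 environment, including version incompatibilities, runtime errors, and running out of GPU memory, and records in detail the steps used to resolve them.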

Code: https://github.com/zhiyongc/Graph_Convolutional_LSTM
Dataset: https://drive.google.com/drive/folders/1E-rRwIPFDZcTWc7zZDcyd4XbIgecW97q

Environment setup

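The repository author's stated environment is Python 3.6.1 and PyTorch 0.4.1, but PyTorch 0.4.1 is too old. After looking up which Python, PyTorch, and CUDA versions are compatible and verifying the install, I chose Python 3.7.0 and PyTorch 1.7.1 (the CUDA 11.0 build). Many version-related problems remained, and they had to be fixed one by one.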

Configuring the `Python3.7.0` and `Pytorch1.7.1` environment

conda create -n Python3.7.0andPyTorch1.7.1 python=3.7.0

activate Python3.7.0andPyTorch1.7.1

pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html
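After the install it can help to confirm which versions the active environment actually picked up. The following is a minimal sketch, not part of the original repository; the helper name `describe_environment` is made up for illustration, and it only relies on `torch.__version__`, `torch.version.cuda`, and `torch.cuda.is_available()`.

```python
import importlib
import sys


def describe_environment():
    """Collect the interpreter version and, if installed, the PyTorch build info."""
    info = {"python": ".".join(str(part) for part in sys.version_info[:3])}
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        # PyTorch is not installed in the active environment.
        info["torch"] = None
        info["cuda_available"] = False
        return info
    info["torch"] = torch.__version__        # version string of the installed wheel
    info["cuda_build"] = torch.version.cuda  # CUDA version the wheel was built against
    info["cuda_available"] = torch.cuda.is_available()
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If `cuda_available` is reported as False even though a GPU is present, the installed wheel and the GPU driver are the first things to check.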

Debugging steps:

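Create a folder named Data at the same level as Code_V1 and put the downloaded data in it.
In Train_Validate.py, below line 69, comment out data = 'inrix' and uncomment data = 'loop'.
Run Main.py; it fails with the errors listed below.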

Errors:

Error (1)
- Problem: NameError: name 'train_dataloader' is not defined
- Fix: uncomment line 93, train_dataloader, valid_dataloader, test_dataloader, max_speed = PrepareDataset(speed_matrix).
Error (2)
- Problem: IndexError: too many indices for array: array is 0-dimensional, but 1 were indexed. This occurs in three places.
- Fix: in Train_Validate.py, go to lines 108 and 110, 210 and 212, and 301 and 312, and remove the [0] from .cpu().numpy()[0].
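The reason the `[0]` fails is that a scalar loss converted to NumPy is a 0-dimensional array, which has no axis to index. A minimal sketch in plain NumPy (the variable names are illustrative and do not come from Train_Validate.py):

```python
import numpy as np

# A scalar converted to NumPy is a 0-dimensional array with shape ().
loss_value = np.array(0.25)

try:
    loss_value[0]              # what `.cpu().numpy()[0]` amounts to on a scalar
except IndexError as err:
    message = str(err)         # the "too many indices for array" error

# Two ways to read the scalar without indexing:
as_float = float(loss_value)   # plain Python float
as_item = loss_value.item()    # same value via ndarray.item()

print(loss_value.shape, as_float, as_item)
```

Dropping the `[0]`, as done above, keeps the 0-dimensional array; `float(...)` or `.item()` is only needed where a plain Python number is required.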
Error (3)
- Problem: TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
- Fix: following the reference (link), edit the library source, File "D:\Anaconda3\Lib\site-packages\torch\_tensor.py", changing

    def __array__(self, dtype=None):
        if has_torch_function_unary(self):
            return handle_torch_function(Tensor.__array__, (self,), self, dtype=dtype)
        if dtype is None:
            return self.numpy()
        else:
            return self.numpy().astype(dtype, copy=False)

to:

    def __array__(self, dtype=None):
        if has_torch_function_unary(self):
            return handle_torch_function(Tensor.__array__, (self,), self, dtype=dtype)
        if dtype is None:
            return self.cpu().numpy()
        else:
            return self.cpu().numpy().astype(dtype, copy=False)
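The same effect can also be obtained without touching the installed library, by doing the copy to host memory in the calling code, as the error message itself suggests. A minimal sketch, assuming a hypothetical helper named `to_numpy` that is not part of the repository:

```python
def to_numpy(tensor):
    """Return a NumPy array for a PyTorch tensor on any device.

    detach() drops the autograd graph, cpu() copies the data to host memory
    (and returns the tensor unchanged if it is already on the CPU), and
    numpy() performs the conversion.
    """
    return tensor.detach().cpu().numpy()
```

With this approach, every place that hands a CUDA tensor to NumPy would call `to_numpy(x)` instead of relying on the implicit `__array__` conversion. Which lines those are in Train_Validate.py is not recorded above, so they would have to be found from the traceback.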

Error (4)
- Problem: IndexError: invalid index of a 0-dim tensor. Use `tensor.item()` in Python or `tensor.item<T>()` in C++ to convert a 0-dim tensor to a number. This occurs in three places.
- Fix: following the reference (link), in Train_Validate.py go to lines 531 and 533, 595 and 597, and 654 and 656, and change np.around([loss_l1.data[0]] and np.around([loss_mse.data[0]] to np.around([loss_l1.item()] and np.around([loss_mse.item()].
Error (5)
- Problem: torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 6.00 GiB total capacity; 12.58 GiB already allocated; 0 bytes free;
- Fix: the GPU ran out of memory. Reduce batch_size; I set it to 10.
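Reducing the batch size helps because the tensors held on the GPU for each training step grow linearly with it. The following back-of-the-envelope sketch only estimates the size of one float32 input batch; the shape `(batch, seq_len, num_nodes)` and the numbers passed in are placeholder assumptions, not values read from the repository, and the real footprint also includes weights, gradients, and stored activations.

```python
def input_batch_mib(batch_size, seq_len, num_nodes, bytes_per_value=4):
    """Size in MiB of one input batch of shape (batch_size, seq_len, num_nodes).

    bytes_per_value=4 corresponds to float32.
    """
    return batch_size * seq_len * num_nodes * bytes_per_value / 2**20


if __name__ == "__main__":
    # Placeholder shapes: halving the batch size halves the estimate.
    for batch_size in (40, 20, 10):
        size = input_batch_mib(batch_size, seq_len=10, num_nodes=300)
        print(f"batch_size={batch_size}: {size:.3f} MiB")
```

The absolute numbers are small for the inputs alone; the point is the linear scaling, which applies in the same way to the much larger intermediate activations kept for backpropagation.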
Problems encountered while setting up the environment
The first attempt used Python 3.6.1:

conda create -n Python3.6.1andPyTorch0.4.1 python=3.6.1

Error:
Solving environment: failed PackagesNotFoundError: The following packages are not available from current channels: - python=3.6.1

Cause: the default conda channels in use do not provide Python 3.6.1. Add the mirror channels below, then create the environment again.

conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main/
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/pytorch/
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge/

Activate the Python3.6.1andPyTorch0.4.1 environment:

activate Python3.6.1andPyTorch0.4.1

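Install PyTorch. The 0.4.1 release used by the repository author is far too old; according to the version lookup (link), a suitable PyTorch version is 1.7.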

pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html

Error:
ERROR: Package 'torch' requires a different Python: 3.6.1 not in '>=3.6.2'
Python 3.6.2 still had version problems, so I switched to Python 3.7.0.
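The `'>=3.6.2'` in the pip error is a lower bound on the interpreter version. A minimal sketch of the same comparison in Python, using a hypothetical helper `satisfies_minimum` (pip itself evaluates the package's `Requires-Python` metadata; this only mirrors the idea):

```python
import sys


def satisfies_minimum(version, minimum=(3, 6, 2)):
    """True if a (major, minor, micro) version meets the lower bound."""
    return tuple(version[:3]) >= minimum


print(satisfies_minimum((3, 6, 1)))        # False: the first environment
print(satisfies_minimum((3, 7, 0)))        # True: the environment finally used
print(satisfies_minimum(sys.version_info))  # the interpreter running this script
```

Tuples compare element by element, so (3, 6, 1) fails the bound on the third component while (3, 7, 0) passes on the second.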
