Setting up the Scene-Graph-Benchmark environment (Ubuntu 18.04, CUDA 11.1, cuDNN 8.0.5, torch 1.8.1)

Original post, last modified 2025-11-23 10:31:22

License: CC 4.0 BY-SA

Tags: #python #object-detection #artificial-intelligence #deep-learning #pytorch

First published 2025-06-28 12:25:04

Installing CUDA 11.1 and cuDNN 8.0.5
Installing torch 1.8.1 and torchvision 0.9.1
Configuring Scene-Graph-Benchmark

Installing CUDA 11.1 and cuDNN 8.0.5

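It is already 2025: CUDA versions that are too old do not support newer GPUs, and versions that are too new do not work with the older environment this project needs. After hitting a lot of pitfalls I settled on the combination in the title, set up on Ubuntu 18.04. I will not cover installing CUDA 11.1 and cuDNN 8.0.5 here; see https://blog.csdn.net/m0_71087087/article/details/135828903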

Installing torch 1.8.1 and torchvision 0.9.1

Create a virtual environment with Anaconda (optional, but recommended)

conda create -n sgbm python=3.7

Then add the following line to the end of your .bashrc file, so that every new terminal automatically activates the environment you just created

conda activate sgbm

You can download the following two wheels with a download manager such as Xunlei and install them offline, which is faster

https://download.pytorch.org/whl/cu111/torch-1.8.1%2Bcu111-cp37-cp37m-linux_x86_64.whl
https://download.pytorch.org/whl/cu111/torchvision-0.9.1%2Bcu111-cp37-cp37m-linux_x86_64.whl

Then run the following command in the directory that contains the downloaded wheels.

pip install torch-1.8.1+cu111-cp37-cp37m-linux_x86_64.whl torchvision-0.9.1+cu111-cp37-cp37m-linux_x86_64.whl torchaudio==0.8.1 torchtext==0.9.1  -f https://download.pytorch.org/whl/torch_stable.html -i https://pypi.tuna.tsinghua.edu.cn/simple

The option -i https://pypi.tuna.tsinghua.edu.cn/simple uses the Tsinghua mirror to speed up downloads.
If you do not need torchaudio==0.8.1 and torchtext==0.9.1, simply delete them from the command.
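
The cp37 part of the wheel file names above is the Python tag, which is why the conda environment is created with python=3.7. A minimal sketch for checking that a downloaded wheel matches the interpreter that will install it; the helper names wheel_python_tag and matches_interpreter are made up for this illustration.

```python
import sys

def wheel_python_tag(wheel_name):
    """Return the Python tag (e.g. 'cp37') from a wheel file name."""
    # Wheel names end with: ...-<python tag>-<abi tag>-<platform tag>.whl
    stem = wheel_name[:-len(".whl")]
    return stem.split("-")[-3]

def matches_interpreter(wheel_name, version_info=sys.version_info):
    """True if the wheel's CPython tag matches the given interpreter version."""
    expected = "cp{}{}".format(version_info[0], version_info[1])
    return wheel_python_tag(wheel_name) == expected

wheels = [
    "torch-1.8.1+cu111-cp37-cp37m-linux_x86_64.whl",
    "torchvision-0.9.1+cu111-cp37-cp37m-linux_x86_64.whl",
]
for name in wheels:
    print(name, "->", wheel_python_tag(name),
          "| matches this interpreter:", matches_interpreter(name))
```

Run it with the python from the sgbm environment; if the last column is False, pip will refuse the wheel as unsupported on this platform.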

At this point the torch 1.8.1 and torchvision 0.9.1 environment is essentially ready.
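
A quick way to confirm the result is to print the installed versions and check whether torch can see the GPU. The following is only a sketch: the function name describe_torch_install is made up for this example, and the expected version strings in the comments are what the wheels above should report.

```python
def describe_torch_install():
    """Collect version and CUDA info from torch, or return None if it cannot be imported."""
    try:
        import torch
        import torchvision
    except (ImportError, OSError):
        return None
    return {
        "torch": torch.__version__,              # expected here: 1.8.1+cu111
        "torchvision": torchvision.__version__,  # expected here: 0.9.1+cu111
        "built_for_cuda": torch.version.cuda,    # CUDA version torch was built against
        "cuda_available": torch.cuda.is_available(),
    }

info = describe_torch_install()
if info is None:
    print("torch or torchvision is not importable in this environment")
else:
    for key, value in info.items():
        print("{}: {}".format(key, value))
```

If cuda_available is False even though the versions are right, the driver or the CUDA installation is the more likely culprit than the wheels.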

Configuring Scene-Graph-Benchmark

The CUDA 11.1 installation guide referenced above does not seem to set CUDA_HOME. The CUDA-related settings in my .bashrc look like this

export PATH=/usr/local/cuda-11.1/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-11.1/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
export CUDA_HOME=/usr/local/cuda-11.1 #/usr/local/cuda
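
CUDA_HOME matters because the extension builds in the following steps (apex and the project itself) need to locate nvcc. A small sketch, using only the standard library, for checking that a new terminal really picked up these exports; the function name check_cuda_env is made up for this example.

```python
import os
import shutil

def check_cuda_env(environ=os.environ):
    """Report whether CUDA_HOME is set and where nvcc can be found."""
    cuda_home = environ.get("CUDA_HOME")
    nvcc_under_home = None
    if cuda_home:
        # nvcc is expected at $CUDA_HOME/bin/nvcc
        nvcc_under_home = os.path.isfile(os.path.join(cuda_home, "bin", "nvcc"))
    return {
        "CUDA_HOME": cuda_home,
        "nvcc_under_CUDA_HOME": nvcc_under_home,
        "nvcc_on_PATH": shutil.which("nvcc", path=environ.get("PATH")),
    }

for key, value in check_cuda_env().items():
    print("{}: {}".format(key, value))
```

With the .bashrc lines above, CUDA_HOME should print /usr/local/cuda-11.1 and both nvcc entries should point inside that directory.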

Install the dependencies

pip install ipython scipy h5py ninja yacs cython matplotlib tqdm opencv-python -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install overrides -i https://pypi.tuna.tsinghua.edu.cn/simple

I do not know why overrides has to be installed separately; installing it together with the other packages simply raised an error.

Install cocoapi. There are many detailed tutorials online; I simply used the command below.

pip install pycocotools -i https://pypi.tuna.tsinghua.edu.cn/simple

For a more detailed walkthrough, see https://blog.csdn.net/gaoqing_dream163/article/details/112554621

Install apex

cd to the directory where you want to put apex

git clone https://github.com/NVIDIA/apex.git
cd apex
git reset --hard 3fe10b5597ba14a748ebb271a6ab97c09c5701ac
python setup.py install --cuda_ext --cpp_ext

At this point you may get an error

  File "/home/ps/anaconda3/envs/sgbm/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1683, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error compiling objects for extension

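Run the command below in a terminal to open the file where the error is raised, and change command = ['ninja', '-v'] (around line 1631) to command = ['ninja', '--version']. Be aware that ninja --version only prints ninja's version and compiles nothing, so this change silences the ninja error rather than fixing whatever caused it.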

gedit /home/ps/anaconda3/envs/sgbm/lib/python3.7/site-packages/torch/utils/cpp_extension.py

Run python setup.py install --cuda_ext --cpp_ext again and the installation succeeds.

This completes the environment setup

Building and installing Scene-Graph-Benchmark

cd to the directory where you want to place the project, then run the following.

git clone https://github.com/KaihuaTang/Scene-Graph-Benchmark.pytorch.git
cd Scene-Graph-Benchmark.pytorch
python setup.py build develop

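Some people get an error at this step (shown below; g++ reports "No such file or directory" for every object file). I solved it on one machine but forgot to write down how, and I could not solve it on a second machine. As a workaround, I copied everything under /home/ps/MyProject/Scene-Graph-Benchmark.pytorch-master/build/temp.linux-x86_64-cpython-37/home/ps/MyProject/Scene-Graph-Benchmark.pytorch-master/maskrcnn_benchmark/csrc from the machine where the build worked into the corresponding directory on the other machine, after which python setup.py build develop succeeded. If you know the proper fix, please leave a comment; I would be very grateful.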

g++: error: /home/ps/MyProject/Scene-Graph-Benchmark.pytorch-master/build/temp.linux-x86_64-cpython-37/home/ps/MyProject/Scene-Graph-Benchmark.pytorch-master/maskrcnn_benchmark/csrc/cpu/ROIAlign_cpu.o: 没有那个文件或目录
g++: error: /home/ps/MyProject/Scene-Graph-Benchmark.pytorch-master/build/temp.linux-x86_64-cpython-37/home/ps/MyProject/Scene-Graph-Benchmark.pytorch-master/maskrcnn_benchmark/csrc/cpu/nms_cpu.o: 没有那个文件或目录
g++: error: /home/ps/MyProject/Scene-Graph-Benchmark.pytorch-master/build/temp.linux-x86_64-cpython-37/home/ps/MyProject/Scene-Graph-Benchmark.pytorch-master/maskrcnn_benchmark/csrc/cuda/ROIAlign_cuda.o: 没有那个文件或目录
g++: error: /home/ps/MyProject/Scene-Graph-Benchmark.pytorch-master/build/temp.linux-x86_64-cpython-37/home/ps/MyProject/Scene-Graph-Benchmark.pytorch-master/maskrcnn_benchmark/csrc/cuda/ROIPool_cuda.o: 没有那个文件或目录
g++: error: /home/ps/MyProject/Scene-Graph-Benchmark.pytorch-master/build/temp.linux-x86_64-cpython-37/home/ps/MyProject/Scene-Graph-Benchmark.pytorch-master/maskrcnn_benchmark/csrc/cuda/SigmoidFocalLoss_cuda.o: 没有那个文件或目录
g++: error: /home/ps/MyProject/Scene-Graph-Benchmark.pytorch-master/build/temp.linux-x86_64-cpython-37/home/ps/MyProject/Scene-Graph-Benchmark.pytorch-master/maskrcnn_benchmark/csrc/cuda/deform_conv_cuda.o: 没有那个文件或目录
g++: error: /home/ps/MyProject/Scene-Graph-Benchmark.pytorch-master/build/temp.linux-x86_64-cpython-37/home/ps/MyProject/Scene-Graph-Benchmark.pytorch-master/maskrcnn_benchmark/csrc/cuda/deform_conv_kernel_cuda.o: 没有那个文件或目录
g++: error: /home/ps/MyProject/Scene-Graph-Benchmark.pytorch-master/build/temp.linux-x86_64-cpython-37/home/ps/MyProject/Scene-Graph-Benchmark.pytorch-master/maskrcnn_benchmark/csrc/cuda/deform_pool_cuda.o: 没有那个文件或目录
g++: error: /home/ps/MyProject/Scene-Graph-Benchmark.pytorch-master/build/temp.linux-x86_64-cpython-37/home/ps/MyProject/Scene-Graph-Benchmark.pytorch-master/maskrcnn_benchmark/csrc/cuda/deform_pool_kernel_cuda.o: 没有那个文件或目录
g++: error: /home/ps/MyProject/Scene-Graph-Benchmark.pytorch-master/build/temp.linux-x86_64-cpython-37/home/ps/MyProject/Scene-Graph-Benchmark.pytorch-master/maskrcnn_benchmark/csrc/cuda/nms.o: 没有那个文件或目录
g++: error: /home/ps/MyProject/Scene-Graph-Benchmark.pytorch-master/build/temp.linux-x86_64-cpython-37/home/ps/MyProject/Scene-Graph-Benchmark.pytorch-master/maskrcnn_benchmark/csrc/vision.o: 没有那个文件或目录
error: command '/usr/bin/g++' failed with exit code 1
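
A possible explanation, not verified on this setup, is that the object files are missing because the edited cpp_extension.py now runs ninja --version, which prints a version string and compiles nothing, so the link step has nothing to link. One thing to try is restoring command = ['ninja', '-v'] and telling torch to build without ninja instead. The sketch below assumes the project's setup.py passes torch's BuildExtension through cmdclass; the helper name build_ext_without_ninja is made up for this example.

```python
def build_ext_without_ninja():
    """Return a cmdclass dict that makes torch build extensions without ninja.

    Hypothetical helper: torch is imported lazily so the function can be
    defined even where torch is not installed.
    """
    from torch.utils.cpp_extension import BuildExtension
    # use_ninja=False falls back to the slower distutils compile path
    return {"build_ext": BuildExtension.with_options(use_ninja=False)}

# In setup.py (assumed layout) the existing cmdclass argument would become:
#     cmdclass=build_ext_without_ninja()
```

Without ninja the compile is slower, but the real compiler error, if there is one, is printed directly instead of being hidden behind the ninja wrapper.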

Scene-Graph-Benchmark runtime errors

Before running the program you also need to prepare the dataset and related files as the project requires; that is not covered here

Error 1

/home/ps/anaconda3/envs/sgbm/bin/python /home/ps/MyProject/Scene-Graph-Benchmark.pytorch-master/my_relation_train_net.py 
Traceback (most recent call last):
  File "/home/ps/MyProject/Scene-Graph-Benchmark.pytorch-master/my_relation_train_net.py", line 8, in <module>
    from maskrcnn_benchmark.utils.env import setup_environment  # noqa F401 isort:skip
  File "/home/ps/MyProject/Scene-Graph-Benchmark.pytorch-master/maskrcnn_benchmark/utils/env.py", line 4, in <module>
    from maskrcnn_benchmark.utils.imports import import_file
  File "/home/ps/MyProject/Scene-Graph-Benchmark.pytorch-master/maskrcnn_benchmark/utils/imports.py", line 4, in <module>
    if torch._six.PY3:
AttributeError: module 'torch._six' has no attribute 'PY3'

Fix 1
Open "/home/ps/MyProject/Scene-Graph-Benchmark.pytorch-master/maskrcnn_benchmark/utils/imports.py" and change torch._six.PY3 to torch._six.PY37
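
An alternative that does not depend on torch._six at all is to drop the version check and keep only the Python 3 code path. The sketch below assumes imports.py uses the flag only to choose between importlib (Python 3) and the old imp module (Python 2); the function body is an illustration of that Python 3 branch, not a copy of the project file, and the demo file name is made up.

```python
import importlib.util
import os
import sys
import tempfile

def import_file(module_name, file_path, make_importable=False):
    """Load a Python source file as a module (Python 3 only)."""
    spec = importlib.util.spec_from_file_location(module_name, file_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    if make_importable:
        # Register the module so a later "import module_name" finds it
        sys.modules[module_name] = module
    return module

# Demo: write a tiny module to a temporary directory and load it by path
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "paths_catalog_demo.py")
    with open(path, "w") as f:
        f.write("ANSWER = 42\n")
    demo = import_file("paths_catalog_demo", path)
    print(demo.ANSWER)
```

Since the environment here is Python 3.7, the Python 2 branch is never needed.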

Error 2

  File "/home/ps/MyProject/Scene-Graph-Benchmark.pytorch-master/maskrcnn_benchmark/modeling/rpn/rpn.py", line 178, in _forward_train
    anchors, objectness, rpn_box_regression, targets
  File "/home/ps/MyProject/Scene-Graph-Benchmark.pytorch-master/maskrcnn_benchmark/modeling/rpn/loss.py", line 106, in __call__
    sampled_pos_inds, sampled_neg_inds = self.fg_bg_sampler(labels)
  File "/home/ps/MyProject/Scene-Graph-Benchmark.pytorch-master/maskrcnn_benchmark/modeling/balanced_positive_negative_sampler.py", line 38, in __call__
    positive = torch.nonzero(matched_idxs_per_image >= 1).squeeze(1)
RuntimeError: CUDA error: device-side assert triggered

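Fix 2
First add the following at the very beginning of the code, so that it takes effect before any CUDA call is made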

import os
os.environ['CUDA_LAUNCH_BLOCKING'] = '1'

Run the program again and the traceback now points to the place where the error really occurs

  File "/home/ps/MyProject/Scene-Graph-Benchmark.pytorch-master/maskrcnn_benchmark/modeling/rpn/loss.py", line 106, in __call__
    sampled_pos_inds, sampled_neg_inds = self.fg_bg_sampler(labels)
  File "/home/ps/MyProject/Scene-Graph-Benchmark.pytorch-master/maskrcnn_benchmark/modeling/balanced_positive_negative_sampler.py", line 53, in __call__
    neg_idx_per_image = negative[perm2]
RuntimeError: CUDA error: device-side assert triggered

The error is here: generating perm2 with perm2 = torch.randperm(negative.numel(), device=negative.device)[:num_neg] is what goes wrong.

# randomly select positive and negative examples
perm1 = torch.randperm(positive.numel(), device=positive.device)[:num_pos]
perm2 = torch.randperm(negative.numel(), device=negative.device)[:num_neg]

pos_idx_per_image = positive[perm1]
neg_idx_per_image = negative[perm2]

Changing perm2 = torch.randperm(negative.numel(), device=negative.device)[:num_neg] to the following solved the problem.

# perm2 = torch.randperm(min(negative.numel(), 20485), device=negative.device)[:num_neg]
if negative.numel() > 20480:
    perm2 = torch.randperm(negative.numel(), device='cpu').to(negative.device)[:num_neg]
else:
    perm2 = torch.randperm(negative.numel(), device=negative.device)[:num_neg]
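
The same workaround can be wrapped in a small helper so that perm1 and perm2 share one code path. This is only a sketch: the name safe_randperm is made up, the threshold 20480 is the value from the fix above, and the demo runs on a CPU tensor as a stand-in for the real negative indices.

```python
try:
    import torch
except ImportError:  # lets the sketch be read where torch is not installed
    torch = None

CPU_RANDPERM_THRESHOLD = 20480  # value taken from the fix above

def safe_randperm(n, device, threshold=CPU_RANDPERM_THRESHOLD):
    """Return a random permutation of range(n) as a tensor on `device`.

    Above `threshold` the permutation is generated on the CPU and then moved
    to `device`; otherwise it is generated on `device` directly.
    """
    if n > threshold:
        return torch.randperm(n, device="cpu").to(device)
    return torch.randperm(n, device=device)

if torch is not None:
    negative = torch.arange(30000)  # stand-in for the negative sample indices
    num_neg = 256
    perm2 = safe_randperm(negative.numel(), negative.device)[:num_neg]
    neg_idx_per_image = negative[perm2]
    print(neg_idx_per_image.shape)
```

In the sampler, the two torch.randperm calls would then become safe_randperm(positive.numel(), positive.device) and safe_randperm(negative.numel(), negative.device).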

If you run into other problems or have other solutions, feel free to comment below

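For training I used the config file e2e_relation_X_101_32_8_FPN_1x.yaml. The settings below gave fairly good results with "MotifPredictor", "VCTreePredictor", and "TransformerPredictor". TransformerPredictor did not produce good results at first, and I spent quite a while tuning its training parameters.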

INPUT:
  MIN_SIZE_TRAIN: (600,)
  MAX_SIZE_TRAIN: 1000
  MIN_SIZE_TEST: 600
  MAX_SIZE_TEST: 1000
MODEL:
  PRETRAINED_DETECTOR_CKPT: "/home/ps/MyProject/Scene-Graph-Benchmark.pytorch-master/checkpoints/pretrained_faster_rcnn/model_final.pth"
  META_ARCHITECTURE: "GeneralizedRCNN"
  WEIGHT: "catalog://ImageNetPretrained/FAIR/20171220/X-101-32x8d"
  BACKBONE:
    CONV_BODY: "R-101-FPN" # VGG-16
  RESNETS:
    BACKBONE_OUT_CHANNELS: 256
    STRIDE_IN_1X1: False
    NUM_GROUPS: 32
    WIDTH_PER_GROUP: 8
  RELATION_ON: True
  ATTRIBUTE_ON: False
  FLIP_AUG: False            # if there is any left-right relation, FLIP AUG should be false
  RPN:
    USE_FPN: True
    ANCHOR_SIZES: (32, 64, 128, 256, 512)
    ANCHOR_STRIDE: (4, 8, 16, 32, 64)
    ASPECT_RATIOS: (0.23232838, 0.63365731, 1.28478321, 3.15089189)   # from neural-motifs
    PRE_NMS_TOP_N_TRAIN: 6000
    PRE_NMS_TOP_N_TEST: 6000
    POST_NMS_TOP_N_TRAIN: 1000
    POST_NMS_TOP_N_TEST: 1000
    FPN_POST_NMS_TOP_N_TRAIN: 1000
    FPN_POST_NMS_TOP_N_TEST: 1000
    FPN_POST_NMS_PER_BATCH: False
    RPN_MID_CHANNEL: 256
  ROI_HEADS:
    USE_FPN: True
    POSITIVE_FRACTION: 0.5
    BG_IOU_THRESHOLD: 0.3
    BATCH_SIZE_PER_IMAGE: 256
    DETECTIONS_PER_IMG: 80
    NMS_FILTER_DUPLICATES: True
  ROI_BOX_HEAD:
    POOLER_RESOLUTION: 7
    POOLER_SCALES: (0.25, 0.125, 0.0625, 0.03125)
    POOLER_SAMPLING_RATIO: 2
    FEATURE_EXTRACTOR: "FPN2MLPFeatureExtractor"
    PREDICTOR: "FPNPredictor"
    NUM_CLASSES: 151                # 151 for VG, 1201 for GQA
    MLP_HEAD_DIM: 4096
  ROI_ATTRIBUTE_HEAD:
    FEATURE_EXTRACTOR: "FPN2MLPFeatureExtractor"
    PREDICTOR: "FPNPredictor"
    USE_BINARY_LOSS: True           # choose binary, because cross_entropy loss deteriorate the box head, even with 0.1 weight
    POS_WEIGHT: 50.0
    ATTRIBUTE_LOSS_WEIGHT: 1.0
    NUM_ATTRIBUTES: 201             # 201 for VG, 501 for GQA
    MAX_ATTRIBUTES: 10             
    ATTRIBUTE_BGFG_SAMPLE: True    
    ATTRIBUTE_BGFG_RATIO: 3        
  ROI_RELATION_HEAD:
    USE_GT_BOX: False                       # for choose sgdet, sgcls, precls
    USE_GT_OBJECT_LABEL: False              # for choose sgdet, sgcls, precls
    REQUIRE_BOX_OVERLAP: False              # for sgdet, during training, only train pairs with overlap
    ADD_GTBOX_TO_PROPOSAL_IN_TRAIN: True    # for sgdet only, in case some gt boxes are missing
    NUM_CLASSES: 51                 # 51 for VG, 201 for GQA (not contain "to the left of" & "to the right of")
    BATCH_SIZE_PER_IMAGE: 1024      # sample as much as possible
    POSITIVE_FRACTION: 0.25
    CONTEXT_POOLING_DIM: 4096
    CONTEXT_HIDDEN_DIM: 512         #1024 for VCTree 512 for Others
    POOLING_ALL_LEVELS: True
    LABEL_SMOOTHING_LOSS: False
    FEATURE_EXTRACTOR: "RelationFeatureExtractor"
    #################### Select Relationship Model ####################
#    PREDICTOR: "MotifPredictor"
#    PREDICTOR: "VCTreePredictor"
    PREDICTOR: "TransformerPredictor"
#    PREDICTOR: "VtransePredictor"
#    PREDICTOR: "CausalAnalysisPredictor"
    ################# Parameters for Motif Predictor ##################
    CONTEXT_OBJ_LAYER: 1
    CONTEXT_REL_LAYER: 1
    ############# Parameters for Causal Unbias Predictor ##############
    ### Implementation for paper "Unbiased Scene Graph Generation from Biased Training"
    CAUSAL:
      EFFECT_TYPE: 'none'             # candidates: 'TDE', 'NIE', 'TE', 'none'
      FUSION_TYPE: 'sum'              # candidates: 'sum', 'gate'
      SEPARATE_SPATIAL: False         # separate spatial in union feature
      CONTEXT_LAYER: "motifs"         # candidates: motifs, vctree, vtranse
      SPATIAL_FOR_VISION: True
      EFFECT_ANALYSIS: True
    ############### Parameters for Transformer Predictor ##############
    TRANSFORMER:
      DROPOUT_RATE: 0.1
      OBJ_LAYER: 4
      REL_LAYER: 2
      NUM_HEAD: 8
      KEY_DIM: 64
      VAL_DIM: 64
      INNER_DIM: 2048
DATASETS:
  TRAIN: ("VG_stanford_filtered_with_attribute_train",)
  VAL: ("VG_stanford_filtered_with_attribute_val",)
  TEST: ("VG_stanford_filtered_with_attribute_test",) # VG_stanford_filtered_with_attribute_test
DATALOADER:
  SIZE_DIVISIBILITY: 32
SOLVER:
  BIAS_LR_FACTOR: 2     # 2 for Transformer, 1 for Motif, VCTree
  BASE_LR: 0.001        # 0.0001 for Transformer, 0.01 for Motif, VCTree
  WARMUP_FACTOR: 0.01    # 0.01 for Transformer, 0.1 for Motif, VCTree
  WEIGHT_DECAY: 0.0001  # 0.01 for Transformer, 0.0001 for Motif, VCTree
  MOMENTUM: 0.9
  GRAD_NORM_CLIP: 1.0   # 1 for Transformer, 5.0 for Motif, VCTree
  STEPS: (8000, 12000, 16000) # (8000, 12000, 16000) for Transformer, (10000, 16000) for Motif, VCTree
  MAX_ITER: 20000       # 20000 for Transformer, 40000 for Motif, VCTree
  VAL_PERIOD: 2000
  CHECKPOINT_PERIOD: 1000
  PRINT_GRAD_FREQ: 1000 # 1000 for Transformer, 4000 for Motif, VCTree
  IMS_PER_BATCH: 16     # 16 for Transformer, 64 for Motif, VCTree
  PRE_VAL: False
  SCHEDULE:
    # the following parameters are only used for WarmupReduceLROnPlateau
    TYPE: "WarmupMultiStepLR"    # WarmupMultiStepLR for TransformerPredictor, WarmupReduceLROnPlateau for Motif, VCTree
    PATIENCE: 2
    THRESHOLD: 0.001
    COOLDOWN: 0
    FACTOR: 0.1
    MAX_DECAY_STEP: 3
OUTPUT_DIR: '/home/ps/Disk2/sgbm_model'
TEST:
  ALLOW_LOAD_FROM_CACHE: False
  RELATION:
    SYNC_GATHER: True      # turn on will slow down the evaluation to solve the sgdet test out of memory problem
    REQUIRE_OVERLAP: False
    LATER_NMS_PREDICTION_THRES: 0.5
  CUSTUM_EVAL: False       # eval SGDet model on custum images, output a json
  CUSTUM_PATH: '.'         # the folder that contains the custum images, only jpg files are allowed
  IMS_PER_BATCH: 8  # added by me
DTYPE: "float16"  # added by me
GLOVE_DIR: '/home/ps/glove'

Of these, the parameters that deserve the most attention are the following:

MODEL:
  PRETRAINED_DETECTOR_CKPT: "/home/ps/MyProject/Scene-Graph-Benchmark.pytorch-master/checkpoints/pretrained_faster_rcnn/model_final.pth"

ROI_RELATION_HEAD:
    USE_GT_BOX: False                       # for choose sgdet, sgcls, precls
    USE_GT_OBJECT_LABEL: False              # for choose sgdet, sgcls, precls
    REQUIRE_BOX_OVERLAP: False              # for sgdet, during training, only train pairs with overlap
    ADD_GTBOX_TO_PROPOSAL_IN_TRAIN: True    # for sgdet only, in case some gt boxes are missing
    
#    PREDICTOR: "MotifPredictor"
#    PREDICTOR: "VCTreePredictor"
    PREDICTOR: "TransformerPredictor"
#    PREDICTOR: "VtransePredictor"
#    PREDICTOR: "CausalAnalysisPredictor"

SOLVER:
  BIAS_LR_FACTOR: 2     # 2 for Transformer, 1 for Motif, VCTree
  BASE_LR: 0.001        # 0.0001 for Transformer, 0.01 for Motif, VCTree
  WARMUP_FACTOR: 0.01    # 0.01 for Transformer, 0.1 for Motif, VCTree
  WEIGHT_DECAY: 0.0001  # 0.01 for Transformer, 0.0001 for Motif, VCTree
  MOMENTUM: 0.9
  GRAD_NORM_CLIP: 1.0   # 1 for Transformer, 5.0 for Motif, VCTree
  STEPS: (8000, 12000, 16000) # (8000, 12000, 16000) for Transformer, (10000, 16000) for Motif, VCTree
  MAX_ITER: 20000       # 20000 for Transformer, 40000 for Motif, VCTree
  VAL_PERIOD: 2000
  CHECKPOINT_PERIOD: 1000
  PRINT_GRAD_FREQ: 1000 # 1000 for Transformer, 4000 for Motif, VCTree
  IMS_PER_BATCH: 16     # 16 for Transformer, 64 for Motif, VCTree
  SCHEDULE:
    # the following parameters are only used for WarmupReduceLROnPlateau
    TYPE: "WarmupMultiStepLR"    # WarmupMultiStepLR for TransformerPredictor, WarmupReduceLROnPlateau for Motif, VCTree

OUTPUT_DIR: '/home/ps/Disk2/sgbm_model'
TEST:
  ALLOW_LOAD_FROM_CACHE: False
  RELATION:
    SYNC_GATHER: True      # turn on will slow down the evaluation to solve the sgdet test out of memory problem
    REQUIRE_OVERLAP: False
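
To see what the SOLVER values mean in practice, the learning rate over the course of training can be worked out by hand. The sketch below assumes that WarmupMultiStepLR is a linear warmup followed by step decay at each entry of STEPS, as the name suggests; GAMMA and WARMUP_ITERS do not appear in the config above and are assumed values, and the function name lr_at is made up for this example.

```python
from bisect import bisect_right

# Values taken from the SOLVER section above
BASE_LR = 0.001
STEPS = (8000, 12000, 16000)
WARMUP_FACTOR = 0.01

# Assumptions, not set in the config above: decay factor and warmup length
GAMMA = 0.1
WARMUP_ITERS = 500

def lr_at(iteration):
    """Learning rate at a given iteration: linear warmup, then step decay."""
    warmup = 1.0
    if iteration < WARMUP_ITERS:
        alpha = iteration / WARMUP_ITERS
        # Ramp the multiplier linearly from WARMUP_FACTOR up to 1
        warmup = WARMUP_FACTOR * (1 - alpha) + alpha
    # Multiply by GAMMA once for every milestone already reached
    return BASE_LR * warmup * GAMMA ** bisect_right(STEPS, iteration)

for it in (0, 250, 500, 8000, 12000, 16000):
    print(it, lr_at(it))
```

Under these assumptions the rate starts at BASE_LR * WARMUP_FACTOR, reaches BASE_LR once warmup ends, and is divided by 10 at iterations 8000, 12000, and 16000.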

The model I trained can be downloaded here; its metrics do not exactly match those reported in other papers
