When doing distributed training with PyTorch, you may run into this error:
Error: mkl-service + Intel(R) MKL: MKL_THREADING_LAYER=INTEL is incompatible with libgomp.so.1 library.
Try to import numpy first or set the threading layer accordingly. Set MKL_SERVICE_FORCE_INTEL to force it.
Solution 1: add this setting to the environment variables:

export MKL_SERVICE_FORCE_INTEL=1

Solution 2: alternatively, switch MKL to the GNU threading layer:

export MKL_THREADING_LAYER=GNU
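Either variable can also be set from inside the entry script, as long as that happens before numpy or torch is first imported, since MKL consults it when the threading layer is first initialized. A minimal sketch of this approach:

import os

# Must run before the first numpy/torch import; by then MKL may
# already have chosen its threading layer.
# setdefault() respects a value already set in the shell.
os.environ.setdefault('MKL_THREADING_LAYER', 'GNU')

import numpy as np
import torch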
Problem analysis:
Grepping the conda manifests, libgomp is pulled in by libgcc-ng, which is in turn pulled in by pretty much everything. So the culprit is more likely whoever is setting MKL_THREADING_LAYER=INTEL - and as far as that goes, the behavior is weird.
The following script demonstrates it (note that torch is imported before numpy here):

import os

def print_layer(prefix):
    # Report whether MKL_THREADING_LAYER is set in this process.
    print(f'{prefix}: {os.environ.get("MKL_THREADING_LAYER")}')

if __name__ == '__main__':
    print_layer('Pre-import')
    from torch import multiprocessing as mp
    import numpy as np  # imported for its side effects on MKL
    print_layer('Post-import')

    mp.set_start_method('spawn')
    p = mp.Process(target=print_layer, args=('Child',))
    p.start()
    p.join()
If torch is imported before numpy, as in the script above, the child process gets a GNU threading layer (even though the parent doesn't have the variable defined):
Pre-import: None
Post-import: None
Child: GNU
But if the imports are swapped so numpy is imported before torch, the child process gets an INTEL threading layer:
Pre-import: None
Post-import: None
Child: INTEL
So I suspect numpy - or one of its imports - is messing with the env parameter of Popen, but after half an hour of searching I can't figure out how. Whatever the mechanism, the symptom fits the original error: the spawned child inherits MKL_THREADING_LAYER=INTEL, then re-imports torch (which loads libgomp.so.1), and those two are exactly the incompatible pair that solutions 1 and 2 above work around.
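One way to keep digging is to diff the process environment around the suspect import. This is a quick diagnostic sketch (my addition, not from the analysis above) that prints every variable that importing numpy adds, removes, or changes:

import os

before = dict(os.environ)  # snapshot prior to the import
import numpy as np         # the import under suspicion
after = dict(os.environ)

# Show anything the import added, removed, or modified.
for key in sorted(set(before) | set(after)):
    if before.get(key) != after.get(key):
        print(f'{key}: {before.get(key)!r} -> {after.get(key)!r}')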