MKL_THREADING_LAYER=INTEL is incompatible with libgomp.so.1 library


When running distributed training with PyTorch, the following error occurs:

Error: mkl-service + Intel(R) MKL: MKL_THREADING_LAYER=INTEL is incompatible with libgomp.so.1 library.
    Try to import numpy first or set the threading layer accordingly. Set MKL_SERVICE_FORCE_INTEL to force it.

Solution 1: set the following environment variable

export MKL_SERVICE_FORCE_INTEL=1

Solution 2: set the following environment variable instead

export MKL_THREADING_LAYER=GNU

Analysis (from the linked PyTorch issue):

Grepping the conda manifests, libgomp is pulled in by libgcc-ng, which is in turn pulled in by pretty much everything. So the culprit is more likely whatever is setting MKL_THREADING_LAYER=INTEL. And that behavior turns out to be strange:

import os

def print_layer(prefix):
    # Report the value of MKL_THREADING_LAYER as seen by this process.
    print(f'{prefix}: {os.environ.get("MKL_THREADING_LAYER")}')

if __name__ == '__main__':
    print_layer('Pre-import')
    import numpy as np
    from torch import multiprocessing as mp
    print_layer('Post-import')

    # Spawn a fresh child process and check which value it inherits.
    mp.set_start_method('spawn')
    p = mp.Process(target=print_layer, args=('Child',))
    p.start()
    p.join()

If torch is imported before numpy, the child process gets a GNU threading layer, even though the parent never has the variable defined:

Pre-import: None
Post-import: None
Child: GNU

But if the imports are swapped so that numpy is imported before torch, the child process gets an INTEL threading layer:

Pre-import: None
Post-import: None
Child: INTEL

So the suspicion is that numpy, or one of its imports, is modifying the environment passed to the spawned process, but it is not obvious where that happens.

Ref: https://github.com/pytorch/pytorch/issues/37377
