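First, if you want the CUDA build of apex, a plain pip install apex
won't work.
The correct installation steps from the official repo
Go to the official repo at https://github.com/NVIDIA/apex and enter the commands as instructed: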
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
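If that installs cleanly, congratulations, you're done.
In my case, though, the install failed with some strange compilation errors. Also, remember to install the CUDA toolkit (run nvcc -V on the command line to check whether it is installed; having nvidia-smi is not enough, since nvidia-smi comes with the driver rather than the toolkit).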
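The nvcc -V check above can also be done from Python. This is only a rough sketch: the helper names parse_nvcc_release and nvcc_release are made up for this post, and it assumes nvcc prints a line containing "release X.Y", which is what the toolkits I have seen do.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(output):
    """Return (major, minor) parsed from `nvcc -V` text, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def nvcc_release():
    """Return the release reported by nvcc, or None if nvcc is not on PATH."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    # stdout/stderr are merged so the parser sees everything nvcc printed.
    result = subprocess.run(
        [nvcc, "-V"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    release = nvcc_release()
    if release is None:
        print("nvcc not found or output not recognized; is the CUDA toolkit installed?")
    else:
        print("CUDA toolkit release %d.%d" % release)
```

If this reports that nvcc is missing while nvidia-smi works, only the driver is installed and the CUDA extensions cannot be compiled.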
My environment:
Windows 10
cudatoolkit 11.3
PyTorch 1.10
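If you want to collect the same information for your own machine, something like the following might help. It is a sketch, not anything from apex: describe_env is a hypothetical helper name, and it tolerates PyTorch not being installed. torch.version.cuda is the CUDA version PyTorch was built against (None on CPU-only builds), which is worth comparing with the toolkit version reported by nvcc.

```python
import platform


def describe_env():
    """Collect OS, Python, and (if importable) PyTorch version details."""
    info = {
        "os": platform.system() + " " + platform.release(),
        "python": platform.python_version(),
    }
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this interpreter.
        info["torch"] = None
        info["torch_cuda"] = None
    else:
        info["torch"] = torch.__version__
        # CUDA version PyTorch was built with; None for CPU-only builds.
        info["torch_cuda"] = torch.version.cuda
    return info


if __name__ == "__main__":
    for key, value in describe_env().items():
        print("%s: %s" % (key, value))
```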
The error
D:/python-3.6.7/lib/site-packages/torch/include\c10/cuda/CUDAStream.h(171): warning: field of class type without a DLL interface used in a class with a DLL interface
D:/vs2017/VC/Tools/MSVC/14.16.27023/include\type_traits(1271): error: static assertion failed with "You've instantiated std::aligned_storage<Len, Align> with an extended alignment (in other words, Align > alignof(max_align_t)). Before VS 2017 15.8, the member type would non-conformingly have an alignment of only alignof(max_align_t). VS 2017 15.8 was fixed to handle this correctly, but the fix inherently changes layout and breaks binary compatibility (only for uses of aligned_storage with extended alignments). Please define either (1) _ENABLE_EXTENDED_ALIGNED_STORAGE to acknowledge that you understand this message and that you actually want a type with an extended alignment, or (2) _DISABLE_EXTENDED_ALIGNED_STORAGE to silence this message and get the old non-conformant behavior."
detected during:
instantiation of class "std::_Aligned<_Len, _Align, double, false> [with _Len=16ULL, _Align=16ULL]"
(1291): here
instantiation of class "std::_Aligned<_Len, _Align, int, false> [with _Len=16ULL, _Align=16ULL]"
(1298): here
instantiation of class "std::_Aligned<_Len, _Align, short, false> [with _Len=16ULL, _Align=16ULL]"
(1305): here
instantiation of class "std::_Aligned<_Len, _Align, char, false> [with _Len=16ULL, _Align=16ULL]"
(1312): here
instantiation of class "std::aligned_storage<_Len, _Align> [with _Len=16ULL, _Align=16ULL]"
csrc/multi_tensor_scale_kernel.cu(25): here
instantiation of "void load_store(T *, T *, int, int) [with T=float]"
csrc/multi_tensor_scale_kernel.cu(64): here
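According to https://github.com/NVIDIA/apex/issues/835,
a helpful user posted a GitHub link to their own copy of apex with the source patched.
The installation that worked for me
Follow https://github.com/kezewang/apex: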
$ git clone https://github.com/kezewang/apex
$ cd apex
$ pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
It worked!
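To double-check that the compiled extensions really got built, and not just the Python-only package, a quick sketch like this can help. The module names are my assumption: as far as I know, --cpp_ext builds apex_C and --cuda_ext builds amp_C (among others), but treat those names as unverified. The helper name installed is also made up for this post. It only asks Python whether it can locate the modules, without importing them.

```python
import importlib.util


def installed(names):
    """Map each top-level module name to True if Python can locate it."""
    found = {}
    for name in names:
        try:
            found[name] = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Treat lookup failures as "not installed".
            found[name] = False
    return found


if __name__ == "__main__":
    # Assumed names: "apex" is the Python package; "apex_C" and "amp_C"
    # are believed to be the compiled C++ and CUDA extensions.
    for name, ok in installed(["apex", "apex_C", "amp_C"]).items():
        print(name, "found" if ok else "MISSING")
```

If apex is found but amp_C is missing, the CUDA extensions were probably not compiled.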
Afterword
It was pretty late by the time I fixed this bug. Here's hoping all this studying pays off.