Running the QWen2-1.5b model fails with “RuntimeError: cutlassF: no kernel found to launch!”
#Problem: The QWen2-1.5b model loads successfully, but at inference time the call
model.generate(
    model_inputs.input_ids,
    top_p=self.top_p,
    max_new_tokens=512
)
fails with “RuntimeError: cutlassF: no kernel found to launch!”
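Solution:
Add the following code at the very start of the program. It turns off the memory-efficient and flash kernels for scaled dot-product attention, so PyTorch falls back to its plain math implementation.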
import torch
torch.backends.cuda.enable_mem_efficient_sdp(False)
torch.backends.cuda.enable_flash_sdp(False)
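
To confirm that the switches actually took effect, the same two calls can be wrapped in a small helper that reads the flags back afterwards. This is only a minimal sketch: the helper name `disable_fused_sdp_kernels` is hypothetical and not part of the original fix, and it assumes PyTorch 2.0 or later, where the `*_sdp_enabled()` query functions are available.

```python
def disable_fused_sdp_kernels():
    """Disable the flash and memory-efficient SDPA kernels and report the flags.

    Hypothetical helper for illustration; returns None if PyTorch is missing.
    """
    try:
        import torch
    except ImportError:
        return None  # PyTorch is not installed, so there is nothing to configure

    # Same two calls as in the fix above.
    torch.backends.cuda.enable_mem_efficient_sdp(False)
    torch.backends.cuda.enable_flash_sdp(False)

    # Read the global flags back so the caller can check the result.
    return {
        "flash": torch.backends.cuda.flash_sdp_enabled(),
        "mem_efficient": torch.backends.cuda.mem_efficient_sdp_enabled(),
        "math": torch.backends.cuda.math_sdp_enabled(),
    }


flags = disable_fused_sdp_kernels()
print(flags)
```

Call the helper once before `model.generate(...)`. If both `flash` and `mem_efficient` come back as `False`, attention should no longer be dispatched to the cutlassF kernel.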
Reference links:
https://blog.csdn.net/zc1226/article/details/140213258
https://stackoverflow.com/questions/77803696/runtimeerror-cutlassf-no-kernel-found-to-launch-when-running-huggingface-tran