torch.cuda.OutOfMemoryError: CUDA out of memory.

Stable diffusion model failed to load
Loading weights [6ce0161689] from E:\SD\stable-diffusion-webui\models\Stable-diffusion\v1-5-pruned-emaonly.safetensors
Creating model from config: E:\SD\stable-diffusion-webui\configs\v1-inference.yaml
loading stable diffusion model: OutOfMemoryError
Traceback (most recent call last):
  File "E:\program\anaconda3\lib\threading.py", line 973, in _bootstrap
    self._bootstrap_inner()
  File "E:\program\anaconda3\lib\threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "E:\SD\stable-diffusion-webui\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "E:\SD\stable-diffusion-webui\venv\lib\site-packages\gradio\utils.py", line 707, in wrapper
    response = f(*args, **kwargs)
  File "E:\SD\stable-diffusion-webui\modules\ui.py", line 1298, in <lambda>
    update_image_cfg_scale_visibility = lambda: gr.update(visible=shared.sd_model and shared.sd_model.cond_stage_key == "edit")
  File "E:\SD\stable-diffusion-webui\modules\shared_items.py", line 110, in sd_model
    return modules.sd_models.model_data.get_sd_model()
  File "E:\SD\stable-diffusion-webui\modules\sd_models.py", line 499, in get_sd_model
    load_model()
  File "E:\SD\stable-diffusion-webui\modules\sd_models.py", line 626, in load_model
    load_model_weights(sd_model, checkpoint_info, state_dict, timer)
  File "E:\SD\stable-diffusion-webui\modules\sd_models.py", line 353, in load_model_weights
    model.load_state_dict(state_dict, strict=False)
  File "E:\SD\stable-diffusion-webui\modules\sd_disable_initialization.py", line 223, in <lambda>
    module_load_state_dict = self.replace(torch.nn.Module, 'load_state_dict', lambda *args, **kwargs: load_state_dict(module_load_state_dict, *args, **kwargs))
  File "E:\SD\stable-diffusion-webui\modules\sd_disable_initialization.py", line 221, in load_state_dict
    original(module, state_dict, strict=strict)
  File "E:\SD\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 2027, in load_state_dict
    load(self, state_dict)
  File "E:\SD\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 2015, in load
    load(child, child_state_dict, child_prefix)
  File "E:\SD\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 2015, in load
    load(child, child_state_dict, child_prefix)
  File "E:\SD\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 2015, in load
    load(child, child_state_dict, child_prefix)
  [Previous line repeated 4 more times]
  File "E:\SD\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 2009, in load
    module._load_from_state_dict(
  File "E:\SD\stable-diffusion-webui\modules\sd_disable_initialization.py", line 226, in <lambda>
    conv2d_load_from_state_dict = self.replace(torch.nn.Conv2d, '_load_from_state_dict', lambda *args, **kwargs: load_from_state_dict(conv2d_load_from_state_dict, *args, **kwargs))
  File "E:\SD\stable-diffusion-webui\modules\sd_disable_initialization.py", line 191, in load_from_state_dict
    module._parameters[name] = torch.nn.parameter.Parameter(torch.zeros_like(param, device=device, dtype=dtype), requires_grad=param.requires_grad)
  File "E:\SD\stable-diffusion-webui\venv\lib\site-packages\torch\_meta_registrations.py", line 1780, in zeros_like
    return aten.empty_like.default(
  File "E:\SD\stable-diffusion-webui\venv\lib\site-packages\torch\_ops.py", line 287, in __call__
    return self._op(*args, **kwargs or {})
  File "E:\SD\stable-diffusion-webui\venv\lib\site-packages\torch\_refs\__init__.py", line 4254, in empty_like
    return torch.empty_strided(
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 4.00 GiB total capacity; 3.43 GiB already allocated; 0 bytes free; 3.48 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

This error message indicates that there is not enough free memory on the GPU to complete the operation that triggered it. The GPU has a total capacity of 4.00 GiB; PyTorch has already allocated 3.43 GiB for tensors and reserved 3.48 GiB in total, leaving 0 bytes free. The attempted allocation of just 20.00 MiB therefore fails.
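The figures in the message map directly onto PyTorch's memory statistics, so you can inspect them yourself. A minimal diagnostic sketch (assuming torch is installed and a CUDA device is present):

import torch

# Query the driver and the caching allocator for the same numbers
# that appear in the OutOfMemoryError message.
if torch.cuda.is_available():
    free, total = torch.cuda.mem_get_info()    # driver-level free/total bytes
    allocated = torch.cuda.memory_allocated()  # bytes held by live tensors
    reserved = torch.cuda.memory_reserved()    # bytes cached by the allocator
    gib = 1024 ** 3
    print(f"total:     {total / gib:.2f} GiB")
    print(f"free:      {free / gib:.2f} GiB")
    print(f"allocated: {allocated / gib:.2f} GiB")
    print(f"reserved:  {reserved / gib:.2f} GiB")

A large gap between reserved and allocated is the fragmentation case the error message itself points at with its max_split_size_mb hint.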

The solution is to add the following two lines to webui-user.bat (the WebUI's Windows launch script):

set COMMANDLINE_ARGS=--precision full --no-half --lowvram --always-batch-cond-uncond --xformers
set PYTORCH_CUDA_ALLOC_CONF=garbage_collection_threshold:0.9,max_split_size_mb:512

set PYTORCH_CUDA_ALLOC_CONF=garbage_collection_threshold:0.9,max_split_size_mb:512

This command configures PyTorch's CUDA caching allocator. The garbage collection threshold controls when the allocator actively reclaims cached GPU memory blocks, while the max split size limits which cached blocks the allocator is allowed to split, which helps combat memory fragmentation.

The value "garbage_collection_threshold:0.9" sets the garbage collection threshold to 90%, which means that PyTorch will attempt to free up 90% of the memory it no longer needs before moving objects to the CPU. This can help reduce the amount of memory required by PyTorch and improve performance.

The value "max_split_size_mb:512" sets the maximum size of shared memory allocations on the GPU to 512 MB. This can help prevent fragmentation of the GPU memory and improve performance. However, if you find that this value is too small for your application, you may need to increase it.

set COMMANDLINE_ARGS=--precision full --no-half --lowvram --always-batch-cond-uncond --xformers

This command sets the arguments that webui-user.bat passes to the Stable Diffusion WebUI launcher (these are WebUI options, not PyTorch options). Here is what each argument does:

  • --precision full: This argument makes all computations run in full precision (float32), which gives better numerical stability but requires more memory.

  • --no-half: This argument keeps the model weights in float32 instead of converting them to half precision (float16). Like --precision full, this helps on GPUs with unreliable fp16 support, at the cost of more memory; the sketch after this list illustrates the trade-off.

  • --lowvram: This argument enables the WebUI's aggressive low-VRAM optimization: the model is split into modules, and only the module currently executing is kept on the GPU while the rest stay in system RAM. This greatly reduces VRAM usage at the cost of speed.

  • --always-batch-cond-uncond: With --lowvram or --medvram, the conditional and unconditional prompts are normally processed in separate passes to lower peak memory; this flag forces them back into a single batched pass, which is faster but uses more VRAM.

  • --xformers: This argument enables the xformers library's memory-efficient attention in the model's attention layers, which lowers VRAM usage and usually speeds up generation.
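To make the precision/memory trade-off behind --no-half concrete, here is a small illustration (a standalone sketch, not part of the WebUI; the tensor size is arbitrary and CPU tensors are used for simplicity):

import torch

# A tensor roughly the size of one large weight matrix.
# float32 uses 4 bytes per element, float16 uses 2, so halving
# the precision halves the memory the weights occupy.
n = 4096 * 4096
w32 = torch.zeros(n, dtype=torch.float32)
w16 = torch.zeros(n, dtype=torch.float16)

print(w32.element_size() * w32.nelement() / 2**20, "MiB in float32")  # 64.0 MiB
print(w16.element_size() * w16.nelement() / 2**20, "MiB in float16")  # 32.0 MiB

This is why --no-half and --precision full, despite improving stability, roughly double the model's memory footprint, so on a 4 GiB card the --lowvram and --xformers flags are doing the memory saving.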
