12.10.2024 Can't generate 1536x1536 (or any large hires) images on an RX 580. Looking for any workaround. Also: GPU memory is not freed before the second generation

I need to generate a 1536x1536 (or any large hires) image, but it fails. I'm willing to try anything to make this work, even if a single image takes an hour or more to generate. Are there any extensions that make this possible?

GPU: RX 580 8 GB
CPU: Intel Xeon E3-1270 v3
RAM: 16 GB

Please help me.

**And another problem:** 
I have to restart SD after every generation, because the GPU memory is not cleared before the second one. On the second generation the console prints "Low GPU vram warning", the memory fills up even further, and generation gets slower.
Why does this happen, and can it be fixed? Note that clearing `%temp%` does not help.
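
For reference, "clearing the memory" at the PyTorch level would look roughly like the sketch below: collect unreachable Python objects, then ask the caching allocator to hand its unused blocks back to the driver. `free_gpu_memory` is a hypothetical helper name, not something Forge provides, and whether `torch.cuda.empty_cache()` and `torch.cuda.mem_get_info()` behave the same under ZLUDA as on a real CUDA device is an assumption that has not been verified here.

```python
import gc


def free_gpu_memory():
    """Hypothetical helper: release cached GPU memory between generations.

    Returns (free_mib, total_mib) as reported by the driver, or None when
    PyTorch or a CUDA-like device is not available.
    """
    # Drop unreachable Python objects first so their tensors can be freed.
    gc.collect()
    try:
        import torch
    except ImportError:
        return None  # PyTorch is not installed in this environment
    if not torch.cuda.is_available():
        return None  # no CUDA (or ZLUDA-emulated) device visible
    # Return cached-but-unused blocks from PyTorch's allocator to the driver.
    torch.cuda.empty_cache()
    free_bytes, total_bytes = torch.cuda.mem_get_info()
    return free_bytes / 2**20, total_bytes / 2**20


if __name__ == "__main__":
    print(free_gpu_memory())
```

Note that `empty_cache()` only releases memory that no live tensor is using, so it would not help if the models from the first generation are still referenced.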

(The log from the first generation is below; it has no such warnings.)
```
venv "D:\Stable Diffusion\stable-diffusion-webui-amdgpu-forge\venv\Scripts\Python.exe"
WARNING: ZLUDA works best with SD.Next. Please consider migrating to SD.Next.
fatal: No names found, cannot describe anything.
Python 3.10.6 (tags/v3.10.6:9c7b4bd, Aug  1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)]
Version: f2.0.1v1.10.1-1.10.1
Commit hash: 545cb6bf1187a11475ce2b28b3f7f99938cddf3d
Using ZLUDA in D:\Stable Diffusion\stable-diffusion-webui-amdgpu-forge\.zluda
WARNING: no ROCm agent was found!
Launching Web UI with arguments: --use-zluda --no-half --upcast-sampling --precision full --theme dark --skip-version-check --always-batch-cond-uncond --opt-sub-quad-attention --disable-nan-check
You are using PyTorch below version 2.3. Some optimizations will be disabled.
Total VRAM 8192 MB, total RAM 16328 MB
pytorch version: 2.2.1+cu118
Set vram state to: NORMAL_VRAM
Device: cuda:0 AMD Radeon RX 580 2048SP [ZLUDA] : native
VAE dtype preferences: [torch.bfloat16, torch.float32] -> torch.bfloat16
CUDA Using Stream: False
Using pytorch cross attention
Using pytorch attention for VAE
ONNX: version=1.19.2 provider=CPUExecutionProvider, available=['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider']
ControlNet preprocessor location: D:\Stable Diffusion\stable-diffusion-webui-amdgpu-forge\models\ControlNetPreprocessor
*** Error loading script: pa.py
    Traceback (most recent call last):
      File "D:\Stable Diffusion\stable-diffusion-webui-amdgpu-forge\modules\scripts.py", line 525, in load_scripts
        script_module = script_loading.load_module(scriptfile.path)
      File "D:\Stable Diffusion\stable-diffusion-webui-amdgpu-forge\modules\script_loading.py", line 13, in load_module
        module_spec.loader.exec_module(module)
      File "<frozen importlib._bootstrap_external>", line 883, in exec_module
      File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
      File "D:\Stable Diffusion\stable-diffusion-webui-amdgpu-forge\extensions\sd-webui-prevent-artifact\scripts\pa.py", line 55, in <module>
        sd_hijack_clip.FrozenCLIPEmbedderWithCustomWordsBase.process_tokens = process_tokens
    AttributeError: module 'modules.sd_hijack_clip' has no attribute 'FrozenCLIPEmbedderWithCustomWordsBase'

---
2024-10-12 23:07:54,293 - ControlNet - INFO - ControlNet UI callback registered.
Model selected: {'checkpoint_info': {'filename': 'D:\\Stable Diffusion\\stable-diffusion-webui-amdgpu-forge\\models\\Stable-diffusion\\snowpony_v10.safetensors', 'hash': '7a851477'}, 'additional_modules': ['D:\\Stable Diffusion\\stable-diffusion-webui-amdgpu-forge\\models\\VAE\\sdxl.vae.safetensors'], 'unet_storage_dtype': None}
Using online LoRAs in FP16: False
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Startup time: 31.6s (prepare environment: 1.9s, import torch: 19.7s, initialize shared: 2.2s, load scripts: 2.7s, initialize google blockly: 0.2s, create ui: 3.0s, gradio launch: 1.7s).
Environment vars changed: {'stream': False, 'inference_memory': 1024.0, 'pin_shared_memory': False}
[GPU Setting] You will use 87.50% GPU memory (7168.00 MB) to load weights, and use 12.50% GPU memory (1024.00 MB) to do matrix computation.
Environment vars changed: {'stream': False, 'inference_memory': 1024.0, 'pin_shared_memory': False}
[GPU Setting] You will use 87.50% GPU memory (7168.00 MB) to load weights, and use 12.50% GPU memory (1024.00 MB) to do matrix computation.
Loading Model: {'checkpoint_info': {'filename': 'D:\\Stable Diffusion\\stable-diffusion-webui-amdgpu-forge\\models\\Stable-diffusion\\snowpony_v10.safetensors', 'hash': '7a851477'}, 'additional_modules': ['D:\\Stable Diffusion\\stable-diffusion-webui-amdgpu-forge\\models\\VAE\\sdxl.vae.safetensors'], 'unet_storage_dtype': None}
[Unload] Trying to free all memory for cuda:0 with 0 models keep loaded ... Done.
StateDict Keys: {'unet': 1680, 'vae': 250, 'text_encoder': 197, 'text_encoder_2': 518, 'ignore': 0}
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
IntegratedAutoencoderKL Unexpected: ['model_ema.decay', 'model_ema.num_updates']
K-Model Created: {'storage_dtype': torch.float16, 'computation_dtype': torch.float16}
Model loaded in 12.2s (unload existing model: 0.3s, forge model load: 11.9s).
activating extra network lora with arguments [<modules.extra_networks.ExtraNetworkParams object at 0x0000024052A59C30>, <modules.extra_networks.ExtraNetworkParams object at 0x0000024052A5AB00>, <modules.extra_networks.ExtraNetworkParams object at 0x0000024052A59F90>]: AttributeError
Traceback (most recent call last):
  File "D:\Stable Diffusion\stable-diffusion-webui-amdgpu-forge\extensions-builtin\sd_forge_lora\networks.py", line 94, in load_networks
    net = load_network(name, network_on_disk)
  File "D:\Stable Diffusion\stable-diffusion-webui-amdgpu-forge\extensions-builtin\sd_forge_lora\networks.py", line 63, in load_network
    net.mtime = os.path.getmtime(network_on_disk.filename)
AttributeError: 'NoneType' object has no attribute 'filename'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\Stable Diffusion\stable-diffusion-webui-amdgpu-forge\modules\extra_networks.py", line 135, in activate
    extra_network.activate(p, extra_network_args)
  File "D:\Stable Diffusion\stable-diffusion-webui-amdgpu-forge\extensions-builtin\sd_forge_lora\extra_networks_lora.py", line 45, in activate
    networks.load_networks(names, te_multipliers, unet_multipliers, dyn_dims)
  File "D:\Stable Diffusion\stable-diffusion-webui-amdgpu-forge\extensions-builtin\sd_forge_lora\networks.py", line 96, in load_networks
    errors.display(e, f"loading network {network_on_disk.filename}")
AttributeError: 'NoneType' object has no attribute 'filename'

[Unload] Trying to free 3051.58 MB for cuda:0 with 0 models keep loaded ... Done.
[Memory Management] Target: JointTextEncoder, Free GPU: 7347.49 MB, Model Require: 1559.68 MB, Previously Loaded: 0.00 MB, Inference Require: 1024.00 MB, Remaining: 4763.81 MB, All loaded to GPU.
Moving model(s) has taken 5.07 seconds
[Unload] Trying to free 1024.00 MB for cuda:0 with 1 models keep loaded ... Current free memory is 5587.12 MB ... Done.
[Unload] Trying to free 11609.04 MB for cuda:0 with 0 models keep loaded ... Current free memory is 5586.86 MB ... Unload model JointTextEncoder Done.
[Memory Management] Target: KModel, Free GPU: 7333.84 MB, Model Require: 4897.05 MB, Previously Loaded: 0.00 MB, Inference Require: 1024.00 MB, Remaining: 1412.79 MB, All loaded to GPU.
Moving model(s) has taken 17.84 seconds
  0%|                                                                                           | 0/25 [00:14<?, ?it/s]
Traceback (most recent call last):
  File "D:\Stable Diffusion\stable-diffusion-webui-amdgpu-forge\modules_forge\main_thread.py", line 30, in work
    self.result = self.func(*self.args, **self.kwargs)
  File "D:\Stable Diffusion\stable-diffusion-webui-amdgpu-forge\modules\txt2img.py", line 123, in txt2img_function
    processed = processing.process_images(p)
  File "D:\Stable Diffusion\stable-diffusion-webui-amdgpu-forge\modules\processing.py", line 818, in process_images
    res = process_images_inner(p)
  File "D:\Stable Diffusion\stable-diffusion-webui-amdgpu-forge\modules\processing.py", line 1053, in process_images_inner
    samples_ddim = p.sample(conditioning=p.c, unconditional_conditioning=p.uc, seeds=p.seeds, subseeds=p.subseeds, subseed_strength=p.subseed_strength, prompts=p.prompts)
  File "D:\Stable Diffusion\stable-diffusion-webui-amdgpu-forge\modules\processing.py", line 1430, in sample
    samples = self.sampler.sample(self, x, conditioning, unconditional_conditioning, image_conditioning=self.txt2img_image_conditioning(x))
  File "D:\Stable Diffusion\stable-diffusion-webui-amdgpu-forge\modules\sd_samplers_kdiffusion.py", line 240, in sample
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args=self.sampler_extra_args, disable=False, callback=self.callback_state, **extra_params_kwargs))
  File "D:\Stable Diffusion\stable-diffusion-webui-amdgpu-forge\modules\sd_samplers_common.py", line 272, in launch_sampling
    return func()
  File "D:\Stable Diffusion\stable-diffusion-webui-amdgpu-forge\modules\sd_samplers_kdiffusion.py", line 240, in <lambda>
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args=self.sampler_extra_args, disable=False, callback=self.callback_state, **extra_params_kwargs))
  File "D:\Stable Diffusion\stable-diffusion-webui-amdgpu-forge\venv\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "D:\Stable Diffusion\stable-diffusion-webui-amdgpu-forge\k_diffusion\sampling.py", line 595, in sample_dpmpp_2m
    denoised = model(x, sigmas[i] * s_in, **extra_args)
  File "D:\Stable Diffusion\stable-diffusion-webui-amdgpu-forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\Stable Diffusion\stable-diffusion-webui-amdgpu-forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\Stable Diffusion\stable-diffusion-webui-amdgpu-forge\modules\sd_samplers_cfg_denoiser.py", line 199, in forward
    denoised, cond_pred, uncond_pred = sampling_function(self, denoiser_params=denoiser_params, cond_scale=cond_scale, cond_composition=cond_composition)
  File "D:\Stable Diffusion\stable-diffusion-webui-amdgpu-forge\backend\sampling\sampling_function.py", line 362, in sampling_function
    denoised, cond_pred, uncond_pred = sampling_function_inner(model, x, timestep, uncond, cond, cond_scale, model_options, seed, return_full=True)
  File "D:\Stable Diffusion\stable-diffusion-webui-amdgpu-forge\backend\sampling\sampling_function.py", line 303, in sampling_function_inner
    cond_pred, uncond_pred = calc_cond_uncond_batch(model, cond, uncond_, x, timestep, model_options)
  File "D:\Stable Diffusion\stable-diffusion-webui-amdgpu-forge\backend\sampling\sampling_function.py", line 273, in calc_cond_uncond_batch
    output = model.apply_model(input_x, timestep_, **c).chunk(batch_chunks)
  File "D:\Stable Diffusion\stable-diffusion-webui-amdgpu-forge\backend\modules\k_model.py", line 45, in apply_model
    model_output = self.diffusion_model(xc, t, context=context, control=control, transformer_options=transformer_options, **extra_conds).float()
  File "D:\Stable Diffusion\stable-diffusion-webui-amdgpu-forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\Stable Diffusion\stable-diffusion-webui-amdgpu-forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\Stable Diffusion\stable-diffusion-webui-amdgpu-forge\backend\nn\unet.py", line 713, in forward
    h = module(h, emb, context, transformer_options)
  File "D:\Stable Diffusion\stable-diffusion-webui-amdgpu-forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\Stable Diffusion\stable-diffusion-webui-amdgpu-forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\Stable Diffusion\stable-diffusion-webui-amdgpu-forge\backend\nn\unet.py", line 83, in forward
    x = layer(x, context, transformer_options)
  File "D:\Stable Diffusion\stable-diffusion-webui-amdgpu-forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\Stable Diffusion\stable-diffusion-webui-amdgpu-forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\Stable Diffusion\stable-diffusion-webui-amdgpu-forge\backend\nn\unet.py", line 321, in forward
    x = block(x, context=context[i], transformer_options=transformer_options)
  File "D:\Stable Diffusion\stable-diffusion-webui-amdgpu-forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\Stable Diffusion\stable-diffusion-webui-amdgpu-forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\Stable Diffusion\stable-diffusion-webui-amdgpu-forge\backend\nn\unet.py", line 181, in forward
    return checkpoint(self._forward, (x, context, transformer_options), None, self.checkpoint)
  File "D:\Stable Diffusion\stable-diffusion-webui-amdgpu-forge\backend\nn\unet.py", line 12, in checkpoint
    return f(*args)
  File "D:\Stable Diffusion\stable-diffusion-webui-amdgpu-forge\backend\nn\unet.py", line 235, in _forward
    n = self.attn1(n, context=context_attn1, value=value_attn1, transformer_options=extra_options)
  File "D:\Stable Diffusion\stable-diffusion-webui-amdgpu-forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\Stable Diffusion\stable-diffusion-webui-amdgpu-forge\venv\lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\Stable Diffusion\stable-diffusion-webui-amdgpu-forge\backend\nn\unet.py", line 154, in forward
    out = attention_function(q, k, v, self.heads, mask)
  File "D:\Stable Diffusion\stable-diffusion-webui-amdgpu-forge\backend\attention.py", line 335, in attention_pytorch
    out = torch.nn.functional.scaled_dot_product_attention(q, k, v, attn_mask=mask, dropout_p=0.0, is_causal=False)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 5.00 GiB. GPU 0 has a total capacity of 8.00 GiB of which 6.57 GiB is free. Of the allocated memory 10.17 GiB is allocated by PyTorch, and 395.65 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
CUDA out of memory. Tried to allocate 5.00 GiB. GPU 0 has a total capacity of 8.00 GiB of which 6.57 GiB is free. Of the allocated memory 10.17 GiB is allocated by PyTorch, and 395.65 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
```
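
The failing call in the traceback is `scaled_dot_product_attention`, so a rough back-of-the-envelope estimate of the attention score matrix may explain the size of the failed allocation. The sketch below is built on assumptions, not on anything confirmed about Forge or ZLUDA: that attention is computed naively (a full `batch x heads x N x N` matrix is materialized), that the VAE downsamples by 8, that the largest SDXL self-attention layer runs at a further 2x downscale with 10 heads, and that cond and uncond are batched together (batch of 2, as `--always-batch-cond-uncond` suggests). The only numbers taken from the log are the `Free GPU` and `Model Require` values for `KModel`.

```python
def attention_scores_gib(width, height, downscale, heads, batch=2, bytes_per_value=2):
    """Size in GiB of a naive self-attention score matrix (batch x heads x N x N).

    downscale, heads and batch are assumed values, not read from the model.
    """
    latent_w = width // 8    # assumed VAE downsampling factor of 8
    latent_h = height // 8
    tokens = (latent_w // downscale) * (latent_h // downscale)
    return batch * heads * tokens * tokens * bytes_per_value / 2**30


# From the log: free GPU memory minus KModel weights, before sampling starts.
free_mb, model_mb = 7333.84, 4897.05
headroom_gib = (free_mb - model_mb) / 1024

for size in (1024, 1536):
    fp16 = attention_scores_gib(size, size, downscale=2, heads=10)
    fp32 = attention_scores_gib(size, size, downscale=2, heads=10, bytes_per_value=4)
    print(f"{size}x{size}: fp16 {fp16:.2f} GiB, fp32 {fp32:.2f} GiB, "
          f"headroom {headroom_gib:.2f} GiB")
```

Under these assumptions a single attention call at 1536x1536 already needs more than the memory left after the weights are loaded, which would be consistent with a multi-GiB allocation failing inside `attention_pytorch`. An attention implementation that processes the matrix in chunks would avoid materializing it all at once.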
