
[Bug] Segmentation fault due to insufficient memory #1094

@akleine

Description


Git commit

43a70e8

Operating System & Version

Ubuntu 24.04

GGML backends

CUDA

Command-line arguments used

./build/bin/sd-cli -m ~/SD_models/sd3/sd3.5_large-iq4_nl.gguf --t5xxl ~/SD_models/flux/t5xxl_q4_k.gguf -v -p "A cute cat"

Steps to reproduce

Start the above-mentioned command (compiled with CUDA) on a machine with 8 GB of VRAM.

Note:
t5xxl_q4_k.gguf comes from https://huggingface.co/Green-Sky/flux.1-schnell-GGUF/blob/main/t5xxl_q4_k.gguf
and sd3.5_large-iq4_nl.gguf comes from https://huggingface.co/stduhpf/SD3.5-Large-GGUF-mixed-sdcpp/blob/main/legacy/sd3.5_large-iq4_nl.gguf

What you expected to happen

A clean exit with an error message.

(or maybe a complete run, but that is not on topic here)

What actually happened

Crash with the message:
Segmentation fault (core dumped) ./build/bin/sd-cli -m ~/SD_models/sd3/sd3.5_large-iq4_nl.gguf --t5xxl ~/SD_models/flux/t5xxl_q4_k.gguf -v -p "A cute cat"

Logs / error messages / stack trace

[INFO ] stable-diffusion.cpp:228 - loading model from '/home/xxx/SD_models/sd3/sd3.5_large-iq4_nl.gguf'
[INFO ] model.cpp:370 - load /home/xxx/SD_models/sd3/sd3.5_large-iq4_nl.gguf using gguf format
[DEBUG] model.cpp:412 - init from '/home/xxx/SD_models/sd3/sd3.5_large-iq4_nl.gguf'
[INFO ] stable-diffusion.cpp:275 - loading t5xxl from '/home/xxx/SD_models/flux/t5xxl_q4_k.gguf'
[INFO ] model.cpp:370 - load /home/xxx/SD_models/flux/t5xxl_q4_k.gguf using gguf format
[DEBUG] model.cpp:412 - init from '/home/xxx/SD_models/flux/t5xxl_q4_k.gguf'
[INFO ] stable-diffusion.cpp:312 - Version: SD3.x
[INFO ] stable-diffusion.cpp:340 - Weight type stat: f32: 192 | f16: 395 | q4_K: 218 | iq4_nl: 581
[INFO ] stable-diffusion.cpp:341 - Conditioner weight type stat: f16: 1 | q4_K: 218
[INFO ] stable-diffusion.cpp:342 - Diffusion model weight type stat: f16: 394 | iq4_nl: 529
[INFO ] stable-diffusion.cpp:343 - VAE weight type stat: f32: 192 | iq4_nl: 52
[DEBUG] stable-diffusion.cpp:345 - ggml tensor size = 400 bytes
[DEBUG] clip.hpp:160 - vocab size: 49408
[DEBUG] clip.hpp:171 - trigger word img already in vocab
[DEBUG] clip.hpp:160 - vocab size: 49408
[DEBUG] clip.hpp:171 - trigger word img already in vocab
[INFO ] mmdit.hpp:690 - MMDiT layers: 38 (including 0 MMDiT-x layers)
[DEBUG] ggml_extend.hpp:1883 - t5 params backend buffer size = 2986.77 MB(VRAM) (219 tensors)
[ERROR] ggml_extend.hpp:83 - ggml_backend_cuda_buffer_type_alloc_buffer: allocating 4779.80 MiB on device 0: cudaMalloc failed: out of memory
[ERROR] ggml_extend.hpp:83 - alloc_tensor_range: failed to allocate CUDA0 buffer of size 5011982336
[ERROR] ggml_extend.hpp:1877 - mmdit alloc params backend buffer failed, num_tensors = 923
[DEBUG] ggml_extend.hpp:1883 - vae params backend buffer size = 94.57 MB(VRAM) (138 tensors)
[DEBUG] stable-diffusion.cpp:688 - loading weights
[DEBUG] model.cpp:1351 - using 8 threads for model loading
[DEBUG] model.cpp:1373 - loading tensors from /home/xxx/SD_models/sd3/sd3.5_large-iq4_nl.gguf
|> | 7/1386 - 7000.00it/s
Segmentation fault (core dumped) ./build/bin/sd-cli -m ~/SD_models/sd3/sd3.5_large-iq4_nl.gguf --t5xxl ~/SD_models/flux/t5xxl_q4_k.gguf -v -p "A cute cat"

Additional context / environment details

CUDA, 8 GB VRAM.
By the way, with the option --offload-to-cpu it runs to completion and saves an image file.

Metadata

Labels: bug