Description
Git commit
Operating System & Version
Ubuntu 24.04
GGML backends
CUDA
Command-line arguments used
./build/bin/sd-cli -m ~/SD_models/sd3/sd3.5_large-iq4_nl.gguf --t5xxl ~/SD_models/flux/t5xxl_q4_k.gguf -v -p "A cute cat"
Steps to reproduce
Run the above command (compiled with CUDA) on a machine with 8 GB of VRAM.
Note:
t5xxl_q4_k.gguf comes from https://huggingface.co/Green-Sky/flux.1-schnell-GGUF/blob/main/t5xxl_q4_k.gguf
and sd3.5_large-iq4_nl.gguf comes from https://huggingface.co/stduhpf/SD3.5-Large-GGUF-mixed-sdcpp/blob/main/legacy/sd3.5_large-iq4_nl.gguf
What you expected to happen
A clean exit with an error message
(or possibly a complete run, but that is off-topic here)
What actually happened
Crash with the message:
Segmentation fault (core dumped) ./build/bin/sd-cli -m ~/SD_models/sd3/sd3.5_large-iq4_nl.gguf --t5xxl ~/SD_models/flux/t5xxl_q4_k.gguf -v -p "A cute cat"
Logs / error messages / stack trace
[INFO ] stable-diffusion.cpp:228 - loading model from '/home/xxx/SD_models/sd3/sd3.5_large-iq4_nl.gguf'
[INFO ] model.cpp:370 - load /home/xxx/SD_models/sd3/sd3.5_large-iq4_nl.gguf using gguf format
[DEBUG] model.cpp:412 - init from '/home/xxx/SD_models/sd3/sd3.5_large-iq4_nl.gguf'
[INFO ] stable-diffusion.cpp:275 - loading t5xxl from '/home/xxx/SD_models/flux/t5xxl_q4_k.gguf'
[INFO ] model.cpp:370 - load /home/xxx/SD_models/flux/t5xxl_q4_k.gguf using gguf format
[DEBUG] model.cpp:412 - init from '/home/xxx/SD_models/flux/t5xxl_q4_k.gguf'
[INFO ] stable-diffusion.cpp:312 - Version: SD3.x
[INFO ] stable-diffusion.cpp:340 - Weight type stat: f32: 192 | f16: 395 | q4_K: 218 | iq4_nl: 581
[INFO ] stable-diffusion.cpp:341 - Conditioner weight type stat: f16: 1 | q4_K: 218
[INFO ] stable-diffusion.cpp:342 - Diffusion model weight type stat: f16: 394 | iq4_nl: 529
[INFO ] stable-diffusion.cpp:343 - VAE weight type stat: f32: 192 | iq4_nl: 52
[DEBUG] stable-diffusion.cpp:345 - ggml tensor size = 400 bytes
[DEBUG] clip.hpp:160 - vocab size: 49408
[DEBUG] clip.hpp:171 - trigger word img already in vocab
[DEBUG] clip.hpp:160 - vocab size: 49408
[DEBUG] clip.hpp:171 - trigger word img already in vocab
[INFO ] mmdit.hpp:690 - MMDiT layers: 38 (including 0 MMDiT-x layers)
[DEBUG] ggml_extend.hpp:1883 - t5 params backend buffer size = 2986.77 MB(VRAM) (219 tensors)
[ERROR] ggml_extend.hpp:83 - ggml_backend_cuda_buffer_type_alloc_buffer: allocating 4779.80 MiB on device 0: cudaMalloc failed: out of memory
[ERROR] ggml_extend.hpp:83 - alloc_tensor_range: failed to allocate CUDA0 buffer of size 5011982336
[ERROR] ggml_extend.hpp:1877 - mmdit alloc params backend buffer failed, num_tensors = 923
[DEBUG] ggml_extend.hpp:1883 - vae params backend buffer size = 94.57 MB(VRAM) (138 tensors)
[DEBUG] stable-diffusion.cpp:688 - loading weights
[DEBUG] model.cpp:1351 - using 8 threads for model loading
[DEBUG] model.cpp:1373 - loading tensors from /home/xxx/SD_models/sd3/sd3.5_large-iq4_nl.gguf
|> | 7/1386 - 7000.00it/s
Segmentation fault (core dumped) ./build/bin/sd-cli -m ~/SD_models/sd3/sd3.5_large-iq4_nl.gguf --t5xxl ~/SD_models/flux/t5xxl_q4_k.gguf -v -p "A cute cat"
Additional context / environment details
CUDA GPU with 8 GB VRAM
Note: with the --offload-to-cpu option, the run completes and an image file is saved.