350 Commits

Author SHA1 Message Date
leejet
c2d8ffc22c fix: compatibility for models with modified tensor shapes (#951) 2025-11-07 23:04:41 +08:00
stduhpf
fb748bb8a4 fix: TAE encoding (#935) 2025-11-07 22:58:59 +08:00
leejet
8f6c5c217b refactor: simplify the model loading logic (#933)
* remove String2GGMLType

* remove preprocess_tensor

* fix clip init

* simplify the logic for reading weights
2025-11-03 21:21:34 +08:00
leejet
6103d86e2c refactor: introduce GGMLRunnerContext (#928)
* introduce GGMLRunnerContext

* add Flash Attention enable control through GGMLRunnerContext

* add conv2d_direct enable control through GGMLRunnerContext
2025-11-02 02:11:04 +08:00
stduhpf
c42826b77c fix: resolve multiple inpainting issues (#926)
* Fix inpainting masked image being broken by side effect

* Fix unet inpainting concat not being set

* Fix Flex.2 inpaint mode crash (+ use scale factor)
2025-11-02 02:10:32 +08:00
Wagner Bruna
945d9a9ee3 docs: add Koboldcpp as an available UI (#930) 2025-11-02 02:03:01 +08:00
Wagner Bruna
353e708844 docs: update ggml and llama.cpp URLs (#931) 2025-11-02 02:02:44 +08:00
leejet
dd75fc081c refactor: unify the naming style of ggml extension functions (#921) 2025-10-28 23:26:48 +08:00
stduhpf
77eb95f8e4 docs: fix taesd direct download link (#917) 2025-10-28 23:26:23 +08:00
Wagner Bruna
8a45d0ff7f chore: clean up stb includes (#919) 2025-10-28 23:25:45 +08:00
leejet
9e28be6479 feat: add chroma radiance support (#910)
* add chroma radiance support

* fix ci

* simplify generate_init_latent

* workaround: avoid ggml cuda error

* format code

* add chroma radiance doc
2025-10-25 23:56:14 +08:00
akleine
062490aa7c feat: add SSD1B and tiny-sd support (#897)
* feat: add code and doc for running SSD1B models

* Added some more lines to support SD1.x with TINY U-Nets too.

* support SSD-1B.safetensors

* fix sdv1.5 diffusers format loader

---------

Co-authored-by: leejet <leejet714@gmail.com>
2025-10-25 23:35:54 +08:00
stduhpf
faabc5ad3c feat: allow models to run without all text encoder(s) (#645) 2025-10-25 22:00:56 +08:00
leejet
69b9511ce9 sync: update ggml 2025-10-24 00:32:45 +08:00
stduhpf
917f7bfe99 fix: support --flow-shift for flux models with default pred (#913) 2025-10-23 21:35:18 +08:00
leejet
48e0a28ddf feat: add shift factor support (#903) 2025-10-23 01:20:29 +08:00
leejet
d05e46ca5e chore: add .clang-tidy configuration and apply modernize checks (#902) 2025-10-18 23:23:40 +08:00
Wagner Bruna
64a7698347 chore: report number of Qwen layers as info (#901) 2025-10-18 23:22:01 +08:00
leejet
0723ee51c9 refactor: optimize option printing (#900) 2025-10-18 17:50:30 +08:00
leejet
90ef5f8246 feat: add auto-resize support for reference images (was Qwen-Image-Edit only) (#898) 2025-10-18 16:37:09 +08:00
leejet
db6f4791b4 feat: add wtype stat (#899) 2025-10-17 23:40:32 +08:00
leejet
b25785bc10 sync: update ggml 2025-10-17 21:46:39 +08:00
leejet
0585e2609d docs: split README sections (build, performance, etc.) into separate docs 2025-10-16 23:22:06 +08:00
leejet
683d6d08a8 chore: add github issue template 2025-10-16 21:04:41 +08:00
leejet
40a6a8710e fix: resolve precision issues in SDXL VAE under fp16 (#888)
* fix: resolve precision issues in SDXL VAE under fp16

* add --force-sdxl-vae-conv-scale option

* update docs
2025-10-15 23:01:00 +08:00
Daniele
e3702585cb feat: added prediction argument (#334) 2025-10-15 23:00:10 +08:00
cmdr2
a7d6d296c7 chore: allow building ggml as a separate shared lib (#468) 2025-10-15 22:10:26 +08:00
leejet
2e9242e37f feat: add Qwen Image Edit support (#877)
* add ref latent support for qwen image

* optimize clip_preprocess and fix get_first_stage_encoding

* add qwen2vl vit support

* add qwen image edit support

* fix qwen image edit pipeline

* add mmproj file support

* support dynamic number of Qwen image transformer blocks

* set prompt_template_encode_start_idx every time

* to_add_out precision fix

* to_out.0 precision fix

* update docs
2025-10-13 23:17:18 +08:00
Wagner Bruna
c64994dc1d fix: better progress display for second-order samplers (#834) 2025-10-13 22:12:48 +08:00
Wagner Bruna
5436f6b814 fix: correct canny preprocessor (#861) 2025-10-13 22:02:35 +08:00
leejet
1c32fa03bc fix: avoid generating black images when running T5 on the GPU (#882) 2025-10-13 00:01:06 +08:00
Wagner Bruna
9727c6bb98 fix: resolve VAE tiling problem in Qwen Image (#873) 2025-10-12 23:45:53 +08:00
leejet
beb99a2de2 feat: add Qwen Image support (#851)
* add qwen tokenizer

* add qwen2.5 vl support

* mv qwen.hpp -> qwenvl.hpp

* add qwen image model

* add qwen image t2i pipeline

* fix qwen image flash attn

* add qwen image i2i pipeline

* change encoding of vocab_qwen.hpp to utf8

* fix get_first_stage_encoding

* apply jeffbolz f32 patch

https://github.com/leejet/stable-diffusion.cpp/pull/851#issuecomment-3335515302

* fix the issue that occurs when using CUDA with k-quants weights

* optimize the handling of the FeedForward precision fix

* to_add_out precision fix

* update docs
2025-10-12 23:23:19 +08:00
Wagner Bruna
aa68b875b9 refactor: deal with default img-cfg-scale at the library level (#869) 2025-10-12 23:17:52 +08:00
Wagner Bruna
5b261b9cee feat: add a stand-alone upscale mode (#865)
* feat: add a stand-alone upscale mode

* fix prompt option check

* format code

* update README.md

---------

Co-authored-by: leejet <leejet714@gmail.com>
2025-10-12 23:10:02 +08:00
Pedrito
e70d0205ca feat: add support for more esrgan models & x2 & x1 models (#855) 2025-10-12 22:53:31 +08:00
leejet
02af48a97f chore: fix vulkan ci (#878) 2025-10-11 00:40:57 +08:00
leejet
e12d5e0aaf fix: ensure directory iteration results are sorted by filename (#858) 2025-10-11 00:18:39 +08:00
Serkan Sahin
940a2018e1 chore: fix dockerfile libgomp1 dependency + improvements (#852) 2025-10-11 00:17:45 +08:00
Sharuzzaman Ahmat Raslan
b451728b2f docs: update README.md (#866) 2025-10-11 00:11:10 +08:00
stduhpf
11f436c483 feat: add support for Flux Controls and Flex.2 (#692) 2025-10-11 00:06:57 +08:00
leejet
35843c77ea fix: optimize the handling of embedding weight (#859) 2025-09-25 23:09:59 +08:00
leejet
6ad46bb700 sync: update ggml 2025-09-25 21:57:43 +08:00
leejet
1ba30ce005 sync: update ggml 2025-09-25 00:38:38 +08:00
leejet
2abe9451c4 fix: optimize the handling of CLIP embedding weight (#840) 2025-09-25 00:28:20 +08:00
Wagner Bruna
f3140eadbb fix: tensor loading thread count (#854) 2025-09-25 00:26:38 +08:00
Stefan-Olt
98ba155fc6 docs: HipBLAS / ROCm build instruction fix (#843) 2025-09-25 00:03:05 +08:00
Wagner Bruna
513f36d495 docs: include Vulkan compatibility for LoRA quants (#845) 2025-09-25 00:01:10 +08:00
rmatif
1e0d2821bb fix: correct tensor deduplication logic (#844) 2025-09-24 23:22:40 +08:00
leejet
fd693ac6a2 refactor: remove unused --normalize-input parameter (#835) 2025-09-18 00:12:53 +08:00