Model Specifications

Technical specifications for all supported diffusion models.

Summary Table

Model	Pipeline	Steps	CFG	Neg. Prompt	Architecture	VRAM
Z-Image Turbo	ZImagePipeline	9	0.0	No	zimage	8 GB
Flux.1 Dev	FluxPipeline	28	3.5	No	flux1	24 GB
Qwen-Image-2512	QwenImagePipeline	50	4.5	Yes	qwen	24 GB
SDXL Turbo	AutoPipelineForText2Image	4	0.0	No	sdxl	6 GB
Juggernaut XL v9	StableDiffusionXLPipeline	30	7.0	Yes	sdxl	8 GB
DreamShaper XL Lightning	StableDiffusionXLPipeline	4	2.0	No	sdxl	8 GB
Realistic Vision v5.1	StableDiffusionPipeline	30	5.0	Yes	sd15	4 GB
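
As an illustration of how these defaults might be consumed, the sketch below mirrors the table as data and builds a request from it. `ModelDefaults`, `MODEL_DEFAULTS`, `build_request`, and the request keys are hypothetical names chosen for this example, not the project's actual API; only the values come from the table.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelDefaults:
    pipeline: str
    steps: int
    cfg: float
    negative_prompt: bool  # whether the model accepts a negative prompt
    architecture: str
    vram_gb: int


# Rows copied from the summary table above.
MODEL_DEFAULTS = {
    "Z-Image Turbo": ModelDefaults("ZImagePipeline", 9, 0.0, False, "zimage", 8),
    "Flux.1 Dev": ModelDefaults("FluxPipeline", 28, 3.5, False, "flux1", 24),
    "Qwen-Image-2512": ModelDefaults("QwenImagePipeline", 50, 4.5, True, "qwen", 24),
    "SDXL Turbo": ModelDefaults("AutoPipelineForText2Image", 4, 0.0, False, "sdxl", 6),
    "Juggernaut XL v9": ModelDefaults("StableDiffusionXLPipeline", 30, 7.0, True, "sdxl", 8),
    "DreamShaper XL Lightning": ModelDefaults("StableDiffusionXLPipeline", 4, 2.0, False, "sdxl", 8),
    "Realistic Vision v5.1": ModelDefaults("StableDiffusionPipeline", 30, 5.0, True, "sd15", 4),
}


def build_request(model: str, prompt: str, negative_prompt: Optional[str] = None) -> dict:
    """Fill in table defaults; drop the negative prompt where it is unsupported."""
    spec = MODEL_DEFAULTS[model]
    request = {"prompt": prompt, "steps": spec.steps, "cfg": spec.cfg}
    if spec.negative_prompt and negative_prompt:
        request["negative_prompt"] = negative_prompt
    return request
```

For example, `build_request("SDXL Turbo", "a cat", "blurry")` omits the negative prompt, while the same call for `"Juggernaut XL v9"` keeps it.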

Z-Image Turbo

Slug	`zimageturbo`
HuggingFace	`Tongyi-MAI/Z-Image-Turbo`
Architecture	`zimage`
Scheduler	FlowMatchEulerDiscreteScheduler
Dtype	bfloat16
Default Size	1024 x 1024 (max 1 MP)
Token Window	512
Prompt Handling	Custom (LoRA trigger word appending)
LoRA Support	Yes (zimage architecture)

Behavior flags: force_default_guidance — ignores user CFG, always uses 0.0.

Fast turbo model with 9-step generation. Best for rapid prototyping and previews.
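
The effect of force_default_guidance can be sketched as a small resolver. This is a minimal illustration, assuming the flag simply overrides whatever the user passed; `resolve_guidance` is a hypothetical helper, not a function from the codebase.

```python
from typing import Optional


def resolve_guidance(
    user_cfg: Optional[float],
    default_cfg: float,
    force_default_guidance: bool = False,
) -> float:
    """Return the CFG value to use for a generation request."""
    # With the flag set, the user's value is ignored entirely.
    if force_default_guidance or user_cfg is None:
        return default_cfg
    return user_cfg
```

With Z-Image Turbo's default of 0.0, `resolve_guidance(7.5, 0.0, force_default_guidance=True)` returns 0.0 regardless of the requested value.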

Flux.1 Dev

Slug	`flux1_dev`
HuggingFace	`black-forest-labs/FLUX.1-dev`
Architecture	`flux1`
Scheduler	FlowMatchEulerDiscreteScheduler
Dtype	bfloat16
Default Size	1024 x 1024 (max 2 MP)
Token Window	512 (`max_sequence_length`)
Prompt Handling	Native (no truncation)
LoRA Support	Yes (flux1 architecture)

High-quality generation with 28 steps. Supports long prompts up to 512 tokens without truncation. Best for production-quality images.

Qwen-Image-2512

Slug	`qwen_image`
HuggingFace	`Qwen/Qwen-Image-2512`
Architecture	`qwen`
Scheduler	FlowMatchEulerDiscreteScheduler
Dtype	bfloat16
Default Size	1328 x 1328 (max 14 MP)
Token Window	512
Prompt Handling	Native (requires non-empty negative prompt)
LoRA Support	No

Optional flag: load_in_8bit — enables 8-bit quantization to reduce VRAM usage.

Highest-resolution model, with 50-step generation. Supports negative prompts and VAE slicing for memory efficiency. Best for high-resolution outputs.

SDXL Turbo

Slug	`sdxl_turbo`
HuggingFace	`stabilityai/sdxl-turbo`
Architecture	`sdxl`
Scheduler	EulerAncestralDiscreteScheduler
Dtype	float16
Default Size	512 x 512 (max 256 KP)
Token Window	77 (CLIP)
Prompt Handling	CLIPTokenLimitMixin (77-token truncation)
LoRA Support	Yes (sdxl architecture)

Behavior flags: force_default_guidance — ignores user CFG, always uses 0.0.

Ultra-fast 4-step generation at 512px. Best for quick iterations and testing LoRA effects.

Juggernaut XL v9

Slug	`juggernaut_xl`
HuggingFace	`RunDiffusion/Juggernaut-XL-v9`
Architecture	`sdxl`
Scheduler	DPMSolverMultistepScheduler
Dtype	float16
Default Size	1024 x 1024 (max 1 MP)
Token Window	77 (CLIP)
Prompt Handling	CLIPTokenLimitMixin (77-token truncation)
LoRA Support	Yes (sdxl architecture)

Full SDXL model with 30-step generation and high CFG (7.0). Supports negative prompts. Best for photorealistic outputs and character work.

DreamShaper XL Lightning

Slug	`dreamshaper_xl`
HuggingFace	`Lykon/dreamshaper-xl-lightning`
Architecture	`sdxl`
Scheduler	DPMSolverMultistepScheduler
Dtype	float16
Default Size	1024 x 1024 (max 1 MP)
Token Window	77 (CLIP)
Prompt Handling	CLIPTokenLimitMixin (77-token truncation)
LoRA Support	Yes (sdxl architecture)

Distilled SDXL model with 4-step generation at full 1024px resolution. Best for fast SDXL-quality output without SDXL Turbo's 512px default size.

Realistic Vision v5.1

Slug	`realistic_vision`
HuggingFace	`SG161222/Realistic_Vision_V5.1_noVAE`
Architecture	`sd15`
Scheduler	DPMSolverMultistepScheduler
Dtype	float16
Default Size	512 x 768 (max 384 KP)
Token Window	77 (CLIP)
Prompt Handling	CLIPTokenLimitMixin (77-token truncation)
LoRA Support	Yes (sd15 architecture)

SD 1.5 model with 30-step generation. Supports negative prompts. Lowest VRAM requirement (4 GB). Best for photorealistic portraits and scenes at standard definition.
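
The "max" figures in the Default Size rows appear to use binary units, since 512 x 512 = 262,144 pixels is listed as 256 KP and 512 x 768 = 393,216 pixels as 384 KP. Under that assumption (1 KP = 1,024 pixels, 1 MP = 1,024 KP), a size check might look like the sketch below. `KP`, `MP`, and `within_pixel_budget` are hypothetical names used only for illustration.

```python
KP = 1024          # assumed: one "KP" is 1,024 pixels
MP = 1024 * KP     # assumed: one "MP" is 1,048,576 pixels


def within_pixel_budget(width: int, height: int, max_pixels: int) -> bool:
    """Check a requested size against a model's maximum pixel count."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return width * height <= max_pixels
```

Under these assumptions, `within_pixel_budget(512, 768, 384 * KP)` holds for Realistic Vision's default size, while 768 x 768 would exceed the same budget.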

Architecture Compatibility

LoRAs must match the model’s base_architecture:

Architecture	Models	LoRA Compatibility
`sdxl`	SDXL Turbo, Juggernaut XL, DreamShaper XL	SDXL LoRAs only
`sd15`	Realistic Vision	SD 1.5 LoRAs only
`flux1`	Flux.1 Dev	Flux LoRAs only
`zimage`	Z-Image Turbo	Z-Image LoRAs only
`qwen`	Qwen-Image-2512	No LoRA support
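
The compatibility rule in the table can be sketched as a lookup from model slug to architecture. The slugs and architecture names are taken from the model sections above; `MODEL_ARCHITECTURE`, `LORA_ARCHITECTURES`, and `lora_compatible` are hypothetical names for this example.

```python
# Slug -> architecture, as listed in the per-model sections.
MODEL_ARCHITECTURE = {
    "zimageturbo": "zimage",
    "flux1_dev": "flux1",
    "qwen_image": "qwen",
    "sdxl_turbo": "sdxl",
    "juggernaut_xl": "sdxl",
    "dreamshaper_xl": "sdxl",
    "realistic_vision": "sd15",
}

# Architectures that accept LoRAs at all; "qwen" is deliberately absent.
LORA_ARCHITECTURES = {"sdxl", "sd15", "flux1", "zimage"}


def lora_compatible(model_slug: str, lora_architecture: str) -> bool:
    """True when a LoRA trained for lora_architecture can load on the model."""
    arch = MODEL_ARCHITECTURE[model_slug]
    return arch in LORA_ARCHITECTURES and arch == lora_architecture
```

An SDXL LoRA therefore passes for `juggernaut_xl` but is rejected for `realistic_vision`, and every LoRA is rejected for `qwen_image`.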

Token Handling

CLIP-based models (SDXL, SD15) use CLIPTokenLimitMixin, which truncates prompts to 77 tokens. When a LoRA is active, trigger words take priority: the base prompt is trimmed to make room for them within the 77-token limit.
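
The trigger-word priority described above can be sketched as follows. This is not the CLIPTokenLimitMixin implementation: whitespace-separated words stand in for CLIP tokens, trigger words are assumed to be appended after the base prompt, and `fit_prompt` is a hypothetical helper. A real implementation would count tokens with the model's tokenizer.

```python
from typing import List


def fit_prompt(prompt: str, trigger_words: List[str], limit: int = 77) -> str:
    """Trim the base prompt so that trigger words always fit within the limit.

    Assumption: one whitespace-separated word counts as one token.
    """
    # Reserve room for the trigger words first.
    trigger_tokens = " ".join(trigger_words).split()[:limit]
    budget = limit - len(trigger_tokens)
    # The base prompt only gets whatever space remains.
    base_tokens = prompt.split()[:budget]
    return " ".join(base_tokens + trigger_tokens)
```

With a limit of 3, `fit_prompt("a b c d e", ["trig"], limit=3)` keeps the trigger word and trims the base prompt to its first two words.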

Flux and Qwen models accept prompts of up to 512 tokens natively, without truncation.

See Compel Prompt Weighting for advanced prompt weighting syntax on CLIP-based models.

Device Optimizations

All models apply device-specific optimizations:

Apple Silicon (MPS): Sequential CPU offload, attention slicing, VAE kept in float32
CUDA: Model-level CPU offload (or sequential CPU offload when the use_sequential_cpu_offload flag is set)
CPU: Fallback with no optimizations
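
The device rules above can be sketched as a dispatch over the standard diffusers pipeline methods `enable_sequential_cpu_offload`, `enable_attention_slicing`, and `enable_model_cpu_offload`. The helper names `optimization_calls` and `apply_optimizations` are hypothetical, and whether each call succeeds on a given backend depends on the installed diffusers and accelerate versions. Keeping the VAE in float32 on MPS is left as a comment so the sketch does not need torch.

```python
from typing import List


def optimization_calls(device: str, use_sequential_cpu_offload: bool = False) -> List[str]:
    """Return the names of the pipeline methods to call for a device."""
    if device == "mps":
        # The VAE is also kept in float32 on MPS, e.g. pipe.vae.to(torch.float32).
        return ["enable_sequential_cpu_offload", "enable_attention_slicing"]
    if device == "cuda":
        if use_sequential_cpu_offload:
            return ["enable_sequential_cpu_offload"]
        return ["enable_model_cpu_offload"]
    # CPU fallback: no optimizations.
    return []


def apply_optimizations(pipe, device: str, use_sequential_cpu_offload: bool = False):
    """Call each selected method on an already-loaded pipeline object."""
    for name in optimization_calls(device, use_sequential_cpu_offload):
        getattr(pipe, name)()
    return pipe
```

Separating the decision (`optimization_calls`) from the side effects (`apply_optimizations`) keeps the device logic testable without loading a model.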