Library Modules
Supporting library modules under cw.lib.
Diffusion Models
Model Factory
Generative Creative Lab - Model Implementations. Modular model classes for different diffusion pipelines.
- class cw.lib.models.BaseModel(model_config, model_path)[source]
Bases:
ABC
Abstract base class for all diffusion models.
This class implements the Template Method pattern, providing a common framework for pipeline loading and image generation. Subclasses only need to override _create_pipeline() and optionally customize behavior through hooks.
- Template Methods (do not override):
load_pipeline(): Handles device setup, pipeline creation, and optimizations
generate(): Handles parameter resolution, prompt building, and generation
- Required Abstract Method:
_create_pipeline(): Return the specific pipeline instance
- Optional Hooks (override for customization):
_build_prompts(): Customize prompt processing
_build_pipeline_kwargs(): Add model-specific pipeline parameters
_apply_device_optimizations(): Custom device optimizations
_handle_special_prompt_requirements(): Special prompt handling
- Configuration Flags (set in presets.json settings):
force_default_guidance: Always use default guidance_scale (turbo models)
use_sequential_cpu_offload: Use sequential vs model CPU offload
enable_vae_slicing: Enable VAE slicing for memory efficiency
enable_debug_logging: Enable verbose debug output
max_sequence_length: Maximum sequence length for text encoder
- config
Model configuration dictionary from presets.json
- model_path
Path to model weights (local or HuggingFace ID)
- pipeline
The loaded diffusion pipeline (None until load_pipeline called)
- device
Torch device (mps, cuda, or cpu)
- current_lora
Currently loaded LoRA configuration (or None)
- settings
Model settings from config
- dtype
Torch dtype for model weights
- load_pipeline(progress_callback=None)[source]
Template method for pipeline loading (common flow)
Subclasses should NOT override this method. Instead, override _create_pipeline() to specify how to load the pipeline.
- Parameters:
progress_callback – Optional callback for progress updates
- Return type:
- Returns:
Status message
- generate(prompt, negative_prompt=None, steps=None, guidance_scale=None, width=None, height=None, seed=None, clip_skip=None, scheduler=None, progress_callback=None, **extra_params)[source]
Template method for image generation (common flow)
Subclasses should NOT override this method. Instead, override the hooks (_build_prompts, _build_pipeline_kwargs, etc.) to customize behavior.
- Parameters:
prompt (str) – Text prompt
negative_prompt (Optional[str]) – Negative prompt (if supported)
clip_skip (Optional[int]) – Number of CLIP layers to skip (if supported)
scheduler (Optional[str]) – Scheduler class name to use (overrides default)
progress_callback – Optional callback for progress updates
**extra_params – Additional parameters passed through to hooks (e.g., control_image, conditioning_scale for ControlNet)
- Return type:
- Returns:
Tuple of (generated image, metadata dict)
- get_lora_negative_prompt_suffix()[source]
Get negative prompt suffix from current LoRA
- Return type:
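The Template Method flow above can be sketched with a toy subclass. This is an illustrative stand-in, not the real cw.lib.models classes: the simplified `load_pipeline()` body and the string pipeline are assumptions made so the sketch is self-contained.

```python
from abc import ABC, abstractmethod


class ToyBaseModel(ABC):
    """Minimal analogue of BaseModel's Template Method pattern (illustrative only)."""

    def load_pipeline(self):
        # Template method: common flow (device setup), then delegate
        # pipeline creation to the required hook.
        self.device = "cpu"
        self.pipeline = self._create_pipeline()
        return f"Loaded pipeline on {self.device}"

    @abstractmethod
    def _create_pipeline(self):
        """Required hook: return the specific pipeline instance."""


class ToySDXLModel(ToyBaseModel):
    def _create_pipeline(self):
        return "sdxl-pipeline"  # stand-in for a real diffusers pipeline
```

Subclasses never touch `load_pipeline()`; they only supply `_create_pipeline()`, which is the essence of the pattern described above.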
- class cw.lib.models.CLIPTokenLimitMixin[source]
Bases:
object
Mixin for models that need CLIP 77-token limit handling
Used by SDXL, SDXLTurbo, and SD15 models to ensure prompts fit within CLIP’s token limit while prioritizing LoRA trigger words.
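The trimming strategy described here (fit within the token limit while prioritizing LoRA trigger words) might look like the greedy sketch below. The function name, signature, and injected `count_tokens` callable are assumptions; the real mixin uses the CLIP tokenizer itself.

```python
def fit_prompt_to_token_limit(prompt, trigger_words, count_tokens, limit=77):
    """Greedy truncation sketch: trigger words are kept unconditionally,
    then comma-separated prompt fragments are appended while they still fit."""
    kept = list(trigger_words)
    for part in (p.strip() for p in prompt.split(",") if p.strip()):
        candidate = ", ".join(kept + [part])
        if count_tokens(candidate) <= limit:
            kept.append(part)
    return ", ".join(kept)
```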
- class cw.lib.models.CompelPromptMixin(*args, **kwargs)[source]
Bases:
object
Mixin for models that use Compel for long prompt handling and prompt weighting
Compel enables:
- Prompts longer than 77 tokens (breaks into chunks, concatenates embeddings)
- Prompt weighting syntax: (word:1.3) for emphasis, (word:0.8) for de-emphasis
- Proper handling of LoRA trigger words without truncation
Used by SDXL, SDXLTurbo, and SD15 models as an alternative to CLIPTokenLimitMixin.
- class cw.lib.models.DebugLoggingMixin[source]
Bases:
object
Mixin for models that need debug print statements
Used by ZImageTurbo and SDXLTurbo for diagnostic output.
- class cw.lib.models.ZImageTurboModel(model_config, model_path)[source]
Bases:
DebugLoggingMixin, BaseModel
Z-Image Turbo implementation
- class cw.lib.models.FluxModel(model_config, model_path)[source]
Bases:
BaseModel
Flux.1-dev implementation
- class cw.lib.models.QwenImageModel(model_config, model_path)[source]
Bases:
BaseModel
Qwen-Image-2512 implementation
- class cw.lib.models.SDXLTurboModel(model_config, model_path)[source]
Bases:
CLIPTokenLimitMixin, DebugLoggingMixin, BaseModel
SDXL Turbo implementation
- class cw.lib.models.SDXLModel(model_config, model_path)[source]
Bases:
CLIPTokenLimitMixin, BaseModel
Generic SDXL implementation (Juggernaut XL, DreamShaper XL, etc.)
- class cw.lib.models.SDXLControlNetModel(model_config, model_path)[source]
Bases:
CompelPromptMixin, BaseModel
SDXL + ControlNet implementation for structure-guided generation.
- class cw.lib.models.SD15Model(model_config, model_path)[source]
Bases:
CLIPTokenLimitMixin, BaseModel
Stable Diffusion 1.5 implementation (Realistic Vision, etc.)
- class cw.lib.models.SD15ControlNetModel(model_config, model_path)[source]
Bases:
CompelPromptMixin, BaseModel
SD 1.5 + ControlNet implementation for structure-guided generation.
Base Model
Base model class for diffusion pipelines.
This module implements the Template Method pattern for diffusion model loading
and image generation. All concrete model implementations inherit from
BaseModel and override specific hooks to customize behavior.
The key methods are:
BaseModel.load_pipeline() - Template method for loading (concrete)
BaseModel.generate() - Template method for generation (concrete)
BaseModel._create_pipeline() - Hook for pipeline creation (abstract)
Example usage:
from cw.lib.models import ModelFactory
# Create model from presets config
model = ModelFactory.create_model(model_config, model_path)
model.load_pipeline()
image, metadata = model.generate("a beautiful sunset")
- class cw.lib.models.base.BaseModel(model_config, model_path)[source]
Bases:
ABC
Abstract base class for all diffusion models.
This class implements the Template Method pattern, providing a common framework for pipeline loading and image generation. Subclasses only need to override _create_pipeline() and optionally customize behavior through hooks.
- Template Methods (do not override):
load_pipeline(): Handles device setup, pipeline creation, and optimizations
generate(): Handles parameter resolution, prompt building, and generation
- Required Abstract Method:
_create_pipeline(): Return the specific pipeline instance
- Optional Hooks (override for customization):
_build_prompts(): Customize prompt processing
_build_pipeline_kwargs(): Add model-specific pipeline parameters
_apply_device_optimizations(): Custom device optimizations
_handle_special_prompt_requirements(): Special prompt handling
- Configuration Flags (set in presets.json settings):
force_default_guidance: Always use default guidance_scale (turbo models)
use_sequential_cpu_offload: Use sequential vs model CPU offload
enable_vae_slicing: Enable VAE slicing for memory efficiency
enable_debug_logging: Enable verbose debug output
max_sequence_length: Maximum sequence length for text encoder
- config
Model configuration dictionary from presets.json
- model_path
Path to model weights (local or HuggingFace ID)
- pipeline
The loaded diffusion pipeline (None until load_pipeline called)
- device
Torch device (mps, cuda, or cpu)
- current_lora
Currently loaded LoRA configuration (or None)
- settings
Model settings from config
- dtype
Torch dtype for model weights
- load_pipeline(progress_callback=None)[source]
Template method for pipeline loading (common flow)
Subclasses should NOT override this method. Instead, override _create_pipeline() to specify how to load the pipeline.
- Parameters:
progress_callback – Optional callback for progress updates
- Return type:
- Returns:
Status message
- generate(prompt, negative_prompt=None, steps=None, guidance_scale=None, width=None, height=None, seed=None, clip_skip=None, scheduler=None, progress_callback=None, **extra_params)[source]
Template method for image generation (common flow)
Subclasses should NOT override this method. Instead, override the hooks (_build_prompts, _build_pipeline_kwargs, etc.) to customize behavior.
- Parameters:
prompt (str) – Text prompt
negative_prompt (Optional[str]) – Negative prompt (if supported)
clip_skip (Optional[int]) – Number of CLIP layers to skip (if supported)
scheduler (Optional[str]) – Scheduler class name to use (overrides default)
progress_callback – Optional callback for progress updates
**extra_params – Additional parameters passed through to hooks (e.g., control_image, conditioning_scale for ControlNet)
- Return type:
- Returns:
Tuple of (generated image, metadata dict)
- get_lora_negative_prompt_suffix()[source]
Get negative prompt suffix from current LoRA
- Return type:
Mixins
Mixins for model implementations.
This module provides reusable behaviors that can be composed with
BaseModel via multiple inheritance.
- Available Mixins:
CompelPromptMixin
Long prompt handling (>77 tokens) and prompt weighting syntax (word:1.3) for CLIP-based models. Recommended for SDXL/SD15.
CLIPTokenLimitMixin
Simple 77-token truncation with LoRA suffix preservation. Legacy mixin - prefer CompelPromptMixin for new models.
DebugLoggingMixin
Conditional debug print statements via the enable_debug_logging flag.
Usage Example:
from cw.lib.models.base import BaseModel
from cw.lib.models.mixins import CompelPromptMixin
class MySDXLModel(CompelPromptMixin, BaseModel):
    def _create_pipeline(self):
        return StableDiffusionXLPipeline.from_pretrained(...)
Note
Mixins must be listed before BaseModel in the inheritance order (MRO) to properly override BaseModel methods.
- class cw.lib.models.mixins.CLIPTokenLimitMixin[source]
Bases:
object
Mixin for models that need CLIP 77-token limit handling
Used by SDXL, SDXLTurbo, and SD15 models to ensure prompts fit within CLIP’s token limit while prioritizing LoRA trigger words.
- class cw.lib.models.mixins.CompelPromptMixin(*args, **kwargs)[source]
Bases:
object
Mixin for models that use Compel for long prompt handling and prompt weighting
Compel enables:
- Prompts longer than 77 tokens (breaks into chunks, concatenates embeddings)
- Prompt weighting syntax: (word:1.3) for emphasis, (word:0.8) for de-emphasis
- Proper handling of LoRA trigger words without truncation
Used by SDXL, SDXLTurbo, and SD15 models as an alternative to CLIPTokenLimitMixin.
Adaptation Pipeline
State & Initialization
State helpers bridging VideoAdUnit (ORM) and PipelineState (in-memory).
Functions here handle the translation between Django models and the
LangGraph PipelineState TypedDict used by pipeline nodes.
- cw.lib.pipeline.state.resolve_pipeline_models(video_ad_unit)[source]
Resolve LLM model config for each pipeline node.
- Return type:
- Fallback chain per non-writer node:
1. AdUnit.pipeline_model_config[node_key] (per-adaptation override)
2. PipelineSettings.<node_key>_default_model (app setting)
3. PipelineSettings.global_default_model (app fallback)
4. Language primary model (ultimate fallback)
- Writer fallback chain:
1. AdUnit.pipeline_model_config["writer"] (per-adaptation override)
2. AdUnit.llm_model (existing FK) (legacy override)
3. Language primary model (language default)
4. PipelineSettings.global_default_model (app fallback)
Returns dict of {node_key: {“model_id”: str, “load_in_4bit”: bool}}.
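The fallback chain reduces to "first configured value wins". A minimal sketch, where the function name and all parameter names are illustrative (the real implementation reads these from the ORM):

```python
def resolve_node_model(node_key, ad_unit_overrides, node_defaults,
                       global_default, language_primary):
    """Return the first non-empty model in the fallback chain."""
    for candidate in (
        ad_unit_overrides.get(node_key),   # per-adaptation override
        node_defaults.get(node_key),       # PipelineSettings.<node_key>_default_model
        global_default,                    # PipelineSettings.global_default_model
        language_primary,                  # language primary model
    ):
        if candidate:
            return candidate
    return None
```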
- cw.lib.pipeline.state.build_initial_state(video_ad_unit)[source]
Build the initial PipelineState dict from a VideoAdUnit.
Mirrors the data-fetching logic used by the adaptation generator, providing pipeline nodes with the necessary origin data and target context.
- Return type:
PipelineState
Pipeline Nodes
Node functions for the LangGraph adaptation pipeline.
Each node follows the signature (state: PipelineState) -> dict and returns
a partial state update. ORM imports are at function level to avoid circular
imports and to match the pattern used in cw.tvspots.tasks.
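The node contract — take the full state, return only the keys you update — can be sketched as below. The state keys (`origin_script`, `concept_brief`) are assumptions for illustration, not the real PipelineState fields:

```python
def toy_concept_node(state: dict) -> dict:
    """Toy node: read from state, return a partial state update."""
    script = state["origin_script"]            # assumed input key
    brief = {"core_message": script[:60]}      # placeholder analysis
    return {"concept_brief": brief}            # merged into state by LangGraph
```

Returning a partial dict (rather than mutating `state`) is what lets LangGraph merge updates from each node deterministically.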
- cw.lib.pipeline.nodes.concept_node(state)[source]
Analyse the original script and produce a ConceptBrief.
- Return type:
- Parameters:
state (PipelineState)
- cw.lib.pipeline.nodes.culture_node(state)[source]
Produce a CulturalBrief for the target market.
- Return type:
- Parameters:
state (PipelineState)
- cw.lib.pipeline.nodes.writer_node(state)[source]
Generate (or revise) the adapted script.
- Return type:
- Parameters:
state (PipelineState)
- cw.lib.pipeline.nodes.format_eval_node(state)[source]
Evaluate format and language compliance of the adapted script.
- Return type:
- Parameters:
state (PipelineState)
- cw.lib.pipeline.nodes.cultural_eval_node(state)[source]
Evaluate cultural compliance of the adapted script.
- Return type:
- Parameters:
state (PipelineState)
Graph Construction
LangGraph state-graph definition for the adaptation pipeline.
Defines the seven-node graph with conditional routing for evaluation loops and revision retries.
- cw.lib.pipeline.graph.route_after_format_eval(state)[source]
Route after format/language compliance evaluation.
Passed -> cultural_eval
Failed but retries remain -> writer (revision loop)
Exhausted retries -> fail (END)
- Return type:
- Parameters:
state (PipelineState)
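The three-way routing rule (pass, retry, fail) might be implemented as below. The state keys `format_eval_passed`, `revision_count`, and `max_revisions` are assumed names for this sketch:

```python
def route_after_format_eval_sketch(state: dict) -> str:
    """Return the next node name, or "fail" to end the graph."""
    if state.get("format_eval_passed"):
        return "cultural_eval"
    if state.get("revision_count", 0) < state.get("max_revisions", 3):
        return "writer"  # revision loop
    return "fail"        # retries exhausted -> END
```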
- cw.lib.pipeline.graph.route_after_cultural_eval(state)[source]
Route after cultural evaluation.
Passed -> concept_eval
Failed but retries remain -> writer (revision loop)
Exhausted retries -> fail (END)
- Return type:
- Parameters:
state (PipelineState)
- cw.lib.pipeline.graph.route_after_concept_eval(state)[source]
Route after concept fidelity evaluation.
Passed -> brand_eval
Failed but retries remain -> writer (revision loop)
Exhausted retries -> fail (END)
- Return type:
- Parameters:
state (PipelineState)
Model Loader
Shared model loader for the adaptation pipeline.
Extracts model loading logic from AdaptationGenerator into a reusable class that can serve Outlines generators for any Pydantic schema.
- class cw.lib.pipeline.model_loader.PipelineModelLoader(model_id='Qwen/Qwen2.5-3B-Instruct', device=None, load_in_4bit=False)[source]
Bases:
object
Loads and caches an LLM, producing Outlines generators for arbitrary schemas.
Designed as a shared resource: the model loads once, and get_generator() can be called with different Pydantic schemas cheaply.
- property tokenizer
Access the tokenizer (loads model if needed).
- get_generator(output_schema)[source]
Return an Outlines generator bound to a specific Pydantic schema.
- Parameters:
output_schema – A Pydantic BaseModel class defining the output format.
- Returns:
An outlines.Generator that produces JSON matching the schema.
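The caching design — one expensive model load, cheap per-schema generator lookups — can be illustrated with a toy analogue. This is not the real Outlines API; the tuple stands in for an `outlines.Generator`:

```python
class ToyModelLoader:
    """Toy analogue of PipelineModelLoader's caching behavior."""

    def __init__(self, load_model):
        self._load_model = load_model
        self._model = None
        self._generators = {}

    @property
    def model(self):
        if self._model is None:              # lazy: load on first access only
            self._model = self._load_model()
        return self._model

    def get_generator(self, schema):
        if schema not in self._generators:
            # stand-in for building an Outlines generator bound to `schema`
            self._generators[schema] = (self.model, schema)
        return self._generators[schema]
```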
Prompt Enhancement
Prompt enhancement for diffusion models.
Expands simple text prompts into detailed, high-quality prompts optimized for diffusion model image generation. Supports three enhancement strategies:
Rule-based (PromptEnhancer) - No external dependencies - Fast, deterministic - Uses predefined quality tags and style descriptors
Local LLM (HFPromptEnhancer) - Uses local HuggingFace models (Qwen2.5-3B recommended) - Optimized for Apple Silicon (MPS) - More creative and context-aware
Anthropic API (LLMPromptEnhancer) - Uses Claude API - Highest quality but requires API key - Best for production use
- Supported Styles:
auto: Auto-detect from prompt keywords
photography: DSLR, bokeh, lighting terms
artistic: Concept art, illustration terms
realistic: Photorealistic, hyperrealistic
cinematic: Movie still, dramatic lighting
coloring-book: Line art, clean outlines
CLI Usage:
# Rule-based (no dependencies)
python prompt_enhancer.py "a cat"
# Local HuggingFace model (recommended)
python prompt_enhancer.py "a cat" --use-hf
# Anthropic API
python prompt_enhancer.py "a cat" --use-llm --api-key sk-xxx
Programmatic Usage:
from cw.lib.prompt_enhancer import HFPromptEnhancer
enhancer = HFPromptEnhancer(style="photography")
result = enhancer.enhance_prompt("a cat on a windowsill")
print(result["enhanced_prompt"])
- class cw.lib.prompt_enhancer.PromptEnhancer(style='auto', creativity=0.7, trigger_words=None)[source]
Bases:
object
Rule-based prompt enhancer.
Adds quality tags, style descriptors, and technical enhancements to simple prompts without requiring any external models.
The enhancer automatically detects style from keywords and adds appropriate tags. For coloring-book style, special line-art focused enhancements are applied.
- style
Enhancement style (auto, photography, artistic, etc.)
- creativity
Creativity level (0.0-1.0) affects number of tags
- trigger_words
Optional LoRA trigger words to prepend
Example:
enhancer = PromptEnhancer(style="photography", creativity=0.8)
result = enhancer.enhance_prompt("a sunset over mountains")
# result = {
#     "original": "a sunset over mountains",
#     "enhanced_prompt": "masterpiece, best quality, ...",
#     "negative_prompt": "blurry, low quality, ...",
#     "detected_style": "photography"
# }
- QUALITY_TAGS = ['masterpiece', 'best quality', 'high quality', 'highly detailed', 'professional', 'award-winning', 'stunning', 'exceptional detail']
- STYLE_DESCRIPTORS = {'artistic': ['concept art', 'digital art', 'illustration', 'artwork', 'trending on artstation', 'by renowned artist', 'gallery quality', 'fine art', 'artistic composition'], 'cinematic': ['cinematic', 'movie still', 'film grain', 'wide angle', 'dramatic lighting', 'atmospheric', 'moody', 'epic composition', 'depth of field'], 'coloring-book': ['line art', 'clean lines', 'clear outlines', 'black and white', 'simple shapes', 'well-defined edges', 'bold outlines', 'coloring page style', 'no shading', 'flat design', 'easy to color', 'distinct sections'], 'photography': ['professional photography', 'DSLR', 'sharp focus', 'bokeh', 'perfectly composed', 'rule of thirds', 'golden hour lighting', 'studio lighting', 'natural lighting', 'cinematic lighting'], 'realistic': ['photorealistic', 'hyperrealistic', 'lifelike', 'realistic details', 'accurate proportions', 'natural colors', 'true to life', '8k resolution', 'ultra HD']}
- TECHNICAL_ENHANCEMENTS = ['intricate details', 'sharp focus', 'crisp details', 'perfect composition', 'balanced colors', 'rich colors', 'vibrant', 'dynamic range', 'high contrast', 'well-lit', 'professional grade']
- NEGATIVE_PROMPT_DEFAULTS = ['blurry', 'low quality', 'worst quality', 'low resolution', 'jpeg artifacts', 'compression artifacts', 'distorted', 'deformed', 'ugly', 'duplicate', 'mutilated', 'poorly drawn', 'bad anatomy', 'bad proportions', 'watermark', 'signature', 'text']
- COLORING_BOOK_NEGATIVE_PROMPTS = ['colored', 'shaded', 'shading', 'gradient', 'soft edges', 'blurry lines', 'unclear outlines', 'complex details', 'texture', 'photorealistic', 'detailed rendering', 'colored pencil', 'watercolor', 'painted', 'low quality', 'messy lines', 'incomplete outlines', 'bad anatomy', 'poorly drawn', 'distorted', 'watermark', 'signature', 'text']
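How a rule-based enhancer might compose these tag lists can be sketched as follows. The function body, the `seed` parameter, and the abbreviated tag lists are assumptions for illustration; the real class uses the full constants above:

```python
import random

QUALITY_TAGS = ["masterpiece", "best quality", "highly detailed"]
STYLE_DESCRIPTORS = {"photography": ["DSLR", "sharp focus", "bokeh"]}


def enhance_prompt(prompt, style="photography", creativity=0.7,
                   trigger_words=None, seed=0):
    rng = random.Random(seed)  # seeded here so the sketch is reproducible
    # Creativity scales how many quality tags get sampled
    n_tags = max(1, round(creativity * len(QUALITY_TAGS)))
    tags = rng.sample(QUALITY_TAGS, n_tags)
    style_tags = rng.sample(STYLE_DESCRIPTORS[style], 2)
    parts = ([trigger_words] if trigger_words else []) + [prompt] + tags + style_tags
    return ", ".join(parts)
```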
- class cw.lib.prompt_enhancer.HFPromptEnhancer(model_id='Qwen/Qwen2.5-3B-Instruct', style='auto', creativity=0.7, trigger_words=None, device=None)[source]
Bases:
PromptEnhancer
Enhanced version using local HuggingFace models (optimized for Apple Silicon).
- Parameters:
- class cw.lib.prompt_enhancer.LLMPromptEnhancer(api_key, model='claude-3-5-sonnet-20241022', style='auto', creativity=0.7, trigger_words=None)[source]
Bases:
PromptEnhancer
Enhanced version using LLM API for more sophisticated expansions.
- cw.lib.prompt_enhancer.process_prompts_from_file(filepath, enhancer)[source]
Process prompts from a text file (one prompt per line).
- Return type:
- Parameters:
filepath (Path)
enhancer (PromptEnhancer)
- cw.lib.prompt_enhancer.main()[source]
CLI entry point for prompt enhancement.
Refactored to reduce complexity by extracting:
- Argument parser setup to _setup_argument_parser()
- Model recommendations display to _show_recommended_models()
- Enhancer initialization to _initialize_enhancer()
- Output formatting to _output_results()
LoRA Management
LoRA Manager for Generative Creative Lab. Handles LoRA filtering, loading, and compatibility checking.
- class cw.lib.loras.manager.LoRAManager[source]
Bases:
object
Manages LoRA adapters and their compatibility with models
- get_lora_choices(model_slug)[source]
Get list of LoRA labels for UI dropdown, filtered by model compatibility
Configuration
Configuration loader and validator for Generative Creative Lab. Loads presets.json and provides access to model and LoRA configurations.
- class cw.lib.config.PresetsConfig(presets_path='presets.json')[source]
Bases:
object
Loads and validates presets.json configuration
- Parameters:
presets_path (str)
CivitAI Integration
CivitAI integration for downloading LoRA files via AIR URN.
AIR format: urn:air:{ecosystem}:{type}:civitai:{modelId}@{versionId}
Download endpoint: GET https://civitai.com/api/download/models/{versionId}
Metadata endpoint: GET https://civitai.com/api/v1/model-versions/{versionId}
- cw.lib.civitai.parse_air(air_urn)[source]
Extract model ID and version ID from a CivitAI AIR URN.
- Parameters:
air_urn (str) – e.g. "urn:air:zimageturbo:lora:civitai:2344335@2636956"
- Return type:
- Returns:
(model_id, version_id) as strings
- Raises:
ValueError – If the AIR URN cannot be parsed
- cw.lib.civitai.fetch_model_version_metadata(version_id, api_key=None)[source]
Fetch metadata for a model version from CivitAI API.
- Parameters:
- Return type:
- Returns:
Dictionary containing model version metadata
- Raises:
RuntimeError – If API request fails
- cw.lib.civitai.extract_lora_metadata(metadata)[source]
Extract relevant LoRA fields from CivitAI model version metadata.
- cw.lib.civitai.download_lora(air_urn, dest_path, api_key)[source]
Download a LoRA file from CivitAI using its AIR URN.
- Parameters:
- Return type:
- Returns:
The dest_path string on success
- Raises:
ValueError – If AIR cannot be parsed
RuntimeError – If download fails
Insights Composition
Insights composition for multi-level adaptation guidance.
Aggregates insights from region → country → language into a single hierarchical guidance document for LLM-based adaptations.
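The region → country → language aggregation can be sketched as simple ordered concatenation. The section-heading format and parameter names are assumptions for illustration:

```python
def compose_insights(region=None, country=None, language=None):
    """Concatenate guidance from broadest to most specific scope,
    skipping levels that have no insights."""
    sections = []
    for scope, text in (("Region", region), ("Country", country), ("Language", language)):
        if text:
            sections.append(f"{scope} insights:\n{text}")
    return "\n\n".join(sections)
```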
Storyboard Generation
Storyboard generation from TV spot scripts.
This module generates image prompts from TV spot script visual descriptions and creates DiffusionJobs for storyboard frame generation.
- Key Features:
Extracts visual elements from script descriptions
Optional LLM enhancement for more detailed prompts
Creates linked DiffusionJob records for each frame
Supports visual style prefixes for consistency
- Classes:
StoryboardGeneratorGenerates image prompts from script rows with optional LLM enhancement
- Functions:
create_storyboard_jobs()Creates DiffusionJob and StoryboardImage records from prompts
- Workflow:
Extract visual elements from the script row's visual_text
Add visual style prefix and cinematic quality keywords
Optionally enhance with LLM (HFPromptEnhancer)
Create DiffusionJob records linked to StoryboardJob
Usage:
from cw.lib.storyboard import StoryboardGenerator, create_storyboard_jobs
# Generate prompts
generator = StoryboardGenerator(use_llm=True)
prompts = generator.generate_prompts_for_version(tv_spot_version)
# Create DiffusionJobs
jobs = create_storyboard_jobs(storyboard_job, prompts)
Note
Generated storyboard frames use 1280x720 (16:9) dimensions by default.
- class cw.lib.storyboard.StoryboardGenerator(use_llm=True, model_id='Qwen/Qwen2.5-3B-Instruct', device=None)[source]
Bases:
object
Generates storyboard image prompts from TV spot script rows.
Can use either simple prompt building (combines visual description with style prefix) or LLM-enhanced prompts for more detailed generation.
- __init__(use_llm=True, model_id='Qwen/Qwen2.5-3B-Instruct', device=None)[source]
Initialize the storyboard generator.
- cw.lib.storyboard.create_storyboard_jobs(storyboard, prompts)[source]
Create DiffusionJobs and StoryboardImages for a storyboard.
- class cw.lib.storyboard.WireframePromptBuilder(style_prompt='')[source]
Bases:
object
Builds image prompts for wireframe/line-drawing storyboard generation.
Unlike StoryboardGenerator, which creates prompts from script text, this builder creates simple style-focused prompts designed to work with ControlNet structural guidance from video keyframes.
- Parameters:
style_prompt (str)
- DEFAULT_STYLE = 'clean line drawing, wireframe storyboard cel, black and white, professional illustration, architectural sketch style'
- DEFAULT_NEGATIVE = 'photorealistic, photograph, blurry, low quality, watermark, text overlay, color photograph, 3D render'
- __init__(style_prompt='')[source]
- Parameters:
style_prompt (str) – Custom style prompt. Falls back to DEFAULT_STYLE if empty.
- cw.lib.storyboard.create_wireframe_storyboard_jobs(storyboard)[source]
Create ControlNet-guided DiffusionJobs from video keyframes.
Matches keyframes (by scene_number) to script rows and creates one DiffusionJob per keyframe per images_per_row. Each DiffusionJob is configured with the storyboard's ControlNet settings and the keyframe image as the reference input.
- Parameters:
storyboard – Storyboard with source_type="keyframe" and a controlnet_model set.
- Return type:
- Returns:
List of created DiffusionJob instances.
- Raises:
ValueError – If the VideoAdUnit has no source media or keyframes.
Video Analysis
Video analysis utilities for extracting scenes, transcription, and insights.
This package provides tools for analyzing uploaded video files to automatically generate structured scripts for TV spot campaigns.
- cw.lib.video_analysis.extract_video_metadata(video_path)[source]
Extract metadata from a video file using PyAV.
- Parameters:
video_path (str) – Path to the video file
- Returns:
{
    "duration": float (seconds),
    "width": int,
    "height": int,
    "frame_rate": float,
    "audio_channels": int,
    "sample_rate": int,
    "file_size": int (bytes)
}
- Return type:
Dictionary containing video metadata
- Raises:
FileNotFoundError – If video file doesn’t exist
av.AVError – If file cannot be opened or is not a valid video
- cw.lib.video_analysis.detect_scenes(video_path, threshold=27.0, min_scene_len=15)[source]
Detect scene boundaries using content-based detection.
Uses PySceneDetect’s ContentDetector which analyzes changes in content between frames to identify scene boundaries.
- Parameters:
- Returns:
[
    {
        "scene_number": 1,
        "start_time": 0.0,
        "end_time": 3.5,
        "duration": 3.5,
        "start_frame": 0,
        "end_frame": 105
    }
]
- Return type:
List of scene dictionaries
- Raises:
FileNotFoundError – If video file doesn’t exist
Exception – If scene detection fails
- cw.lib.video_analysis.transcribe_audio(audio_path, model_size='large-v3', language=None)[source]
Transcribe audio using Whisper.
- Parameters:
- Returns:
{
    "language": "en",
    "confidence": 0.95,
    "segments": [
        {
            "start": 0.5,
            "end": 3.2,
            "text": "Transcribed text…",
            "speaker": "narrator",
            "confidence": 0.96
        }
    ]
}
- Return type:
Transcription dictionary
- Raises:
FileNotFoundError – If audio file doesn’t exist
Exception – If transcription fails
- cw.lib.video_analysis.detect_objects(image_path, conf_threshold=0.5, model_name='yolov8x.pt')[source]
Detect objects in a single image using YOLO v8.
- Parameters:
- Returns:
[
    {
        "class": "person",
        "class_id": 0,
        "confidence": 0.95,
        "bbox": [x1, y1, x2, y2]  # Coordinates in pixels
    }
]
- Return type:
List of detected objects
- Raises:
FileNotFoundError – If image file doesn’t exist
Exception – If detection fails
- cw.lib.video_analysis.detect_objects_batch(image_paths, conf_threshold=0.5, model_name='yolov8x.pt')[source]
Detect objects in multiple images (batch processing).
- Parameters:
- Returns:
{
    "frame_001.jpg": [{"class": "person", …}, …],
    "frame_002.jpg": [{"class": "car", …}, …],
    …
}
- Return type:
Dictionary mapping image paths to detection lists
- Raises:
Exception – If batch detection fails
- cw.lib.video_analysis.summarize_objects(detections)[source]
Summarize object detections with counts and confidence scores.
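A summary of "counts and confidence scores" per class might have the shape below. The output keys (`count`, `avg_confidence`) are assumptions about the real return format:

```python
from collections import defaultdict


def summarize_objects_sketch(detections):
    """Aggregate detection dicts into per-class counts and mean confidence."""
    counts = defaultdict(int)
    conf_totals = defaultdict(float)
    for det in detections:
        counts[det["class"]] += 1
        conf_totals[det["class"]] += det["confidence"]
    return {
        cls: {"count": n, "avg_confidence": round(conf_totals[cls] / n, 3)}
        for cls, n in counts.items()
    }
```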
- cw.lib.video_analysis.analyze_camera_work(scenes)[source]
Analyze camera work based on scene metadata.
- cw.lib.video_analysis.analyze_lighting(image_path)[source]
Analyze lighting characteristics of an image.
- Parameters:
image_path (str) – Path to image file
- Returns:
{
    "brightness": 0.65,       # 0-1 scale (average luminance)
    "contrast": 0.42,         # 0-1 scale (std deviation)
    "exposure": "normal",     # underexposed | normal | overexposed
    "lighting_style": "soft"  # harsh | soft | dramatic
}
- Return type:
Lighting analysis dictionary
- Raises:
FileNotFoundError – If image file doesn’t exist
Exception – If analysis fails
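The brightness/contrast metrics documented above reduce to mean and standard deviation of luminance. A sketch over a pre-decoded grayscale array (the exposure thresholds are assumptions; the real function also classifies lighting style):

```python
import numpy as np


def analyze_lighting_array(gray):
    """gray: 2-D array of luminance values in [0, 1]."""
    brightness = float(gray.mean())   # average luminance
    contrast = float(gray.std())      # spread of luminance
    if brightness < 0.25:
        exposure = "underexposed"
    elif brightness > 0.75:
        exposure = "overexposed"
    else:
        exposure = "normal"
    return {"brightness": round(brightness, 2),
            "contrast": round(contrast, 2),
            "exposure": exposure}
```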
- cw.lib.video_analysis.analyze_visual_style(image_paths)[source]
Analyze overall visual style across multiple frames.
- Parameters:
image_paths (List[str]) – List of image file paths (keyframes)
- Returns:
{
    "dominant_colors": ["#FF5733", "#3357FF", …],
    "avg_brightness": 0.58,
    "avg_contrast": 0.45,
    "lighting_distribution": {"soft": 3, "harsh": 1, "dramatic": 1},
    "exposure_distribution": {"normal": 4, "overexposed": 1}
}
- Return type:
Visual style summary
- Raises:
Exception – If analysis fails
- cw.lib.video_analysis.extract_dominant_colors(image_path, n_colors=5)[source]
Extract dominant colors from an image using k-means clustering.
- cw.lib.video_analysis.analyze_sentiment(transcription, visual_style, objects_summary)[source]
Comprehensive sentiment analysis combining audio and visual data.
- Parameters:
- Returns:
{
    "overall_sentiment": "positive",
    "overall_score": 0.65,  # -1.0 to 1.0
    "confidence": 0.72,
    "text_sentiment": {…},
    "visual_sentiment": {…}
}
- Return type:
Combined sentiment analysis
- cw.lib.video_analysis.categorize_scenes(scenes, keyframe_detections, transcription)[source]
Categorize all scenes in a video.
- Parameters:
- Returns:
[
    {
        "scene_number": 1,
        "start_time": 0.0,
        "end_time": 5.0,
        "categories": ["people", "lifestyle"],
        …
    }
]
- Return type:
List of scene dictionaries with added “categories” field
- cw.lib.video_analysis.summarize_categories(categorized_scenes)[source]
Summarize category distribution across all scenes.
- cw.lib.video_analysis.generate_audience_insights(script, visual_style, sentiment, transcription, categories, model_id='Qwen/Qwen2.5-3B-Instruct', load_in_4bit=False)[source]
Generate audience targeting insights using LLM analysis.
- Parameters:
script (Dict) – Generated script data with scenes
visual_style (Dict) – Visual style analysis results
sentiment (Dict) – Sentiment analysis results
transcription (Dict) – Audio transcription data
categories (Dict) – Scene categorization summary
model_id (str) – LLM model to use for generation
load_in_4bit (bool) – Whether to use 4-bit quantization
- Return type:
- Returns:
Audience insights dictionary matching AudienceInsights schema
- Raises:
Exception – If LLM generation fails (falls back to rule-based)
File Security
File upload security validation for video files.
Provides comprehensive security checks for uploaded video files including:
- File size validation
- Extension whitelisting
- MIME type verification
- File header (magic bytes) verification
- Filename sanitization
- Path traversal prevention
- Malicious content detection
All validation failures are logged for security auditing.
- class cw.lib.security.file_validation.FileSecurityValidator(enabled=True)[source]
Bases:
object
Base class for file security validators.
All validators should inherit from this class and implement the validate() method. Validation failures are logged with audit information.
- Parameters:
enabled (bool)
- __init__(enabled=True)[source]
Initialize validator.
- Parameters:
enabled (bool) – Whether this validator is enabled. Disabled validators always pass.
- validate(file, **kwargs)[source]
Validate the uploaded file.
- Parameters:
file (UploadedFile) – The uploaded file to validate
**kwargs – Additional validator-specific parameters
- Raises:
ValidationError – If validation fails
- Return type:
- class cw.lib.security.file_validation.FileSizeValidator(max_size_bytes=None, min_size_bytes=1, enabled=True)[source]
Bases:
FileSecurityValidator
Validates file size against configured limits.
Prevents denial-of-service attacks via oversized uploads and ensures storage constraints are respected.
- class cw.lib.security.file_validation.FileExtensionValidator(allowed_extensions=None, enabled=True)[source]
Bases:
FileSecurityValidator
Validates file extension against a whitelist.
Prevents upload of executable files or files with misleading extensions. Case-insensitive comparison.
- class cw.lib.security.file_validation.MimeTypeValidator(allowed_mime_types=None, verify_content=True, enabled=True)[source]
Bases:
FileSecurityValidator
Validates MIME type from file content (not just declared content-type).
Uses python-magic (libmagic) to detect actual file type from content, preventing extension-based spoofing attacks.
- class cw.lib.security.file_validation.FileHeaderValidator(enabled=True)[source]
Bases:
FileSecurityValidator
Validates file headers (magic bytes) to prevent file type spoofing.
Checks that file headers match expected video file signatures.
- Parameters:
enabled (bool)
- class cw.lib.security.file_validation.FilenameSanitizer(remove_path_components=True, replace_spaces=True, lowercase=False, enabled=True)[source]
Bases:
FileSecurityValidator
Sanitizes filenames to prevent path traversal and other attacks.
Removes or escapes dangerous characters and patterns from filenames. Does not raise ValidationError - instead modifies the filename in-place.
- DANGEROUS_PATTERNS = ['\\.\\.', '[<>:\\"|?*]', '[\\x00-\\x1f]', '^\\..*']
- MAX_FILENAME_LENGTH = 255
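Using the DANGEROUS_PATTERNS and MAX_FILENAME_LENGTH constants documented above, sanitization might look like the sketch below. The replace-with-empty-string policy and space handling are assumptions (the real class modifies the filename in place and has more options):

```python
import re
from pathlib import PurePosixPath

DANGEROUS_PATTERNS = [r"\.\.", r'[<>:"|?*]', r"[\x00-\x1f]", r"^\..*"]
MAX_FILENAME_LENGTH = 255


def sanitize_filename(name, replace_spaces=True):
    name = PurePosixPath(name).name          # drop path components (traversal defense)
    for pattern in DANGEROUS_PATTERNS:
        name = re.sub(pattern, "", name)     # strip dangerous sequences
    if replace_spaces:
        name = name.replace(" ", "_")
    return name[:MAX_FILENAME_LENGTH]        # enforce length limit
```

Note that the `^\..*` pattern rejects dotfiles entirely; a production sanitizer would likely rename rather than empty them.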
- class cw.lib.security.file_validation.VideoFileValidator(max_size_bytes=None, allowed_extensions=None, allowed_mime_types=None, verify_mime_content=True, validate_headers=True, sanitize_filenames=True, enable_all=True)[source]
Bases:
object
Composite validator for video file uploads.
Chains multiple validators together and provides a single validation interface. Use this as the main entry point for video file validation.
- Parameters:
- __init__(max_size_bytes=None, allowed_extensions=None, allowed_mime_types=None, verify_mime_content=True, validate_headers=True, sanitize_filenames=True, enable_all=True)[source]
Initialize composite video file validator.
- Parameters:
allowed_extensions (Optional[Set[str]]) – Set of allowed extensions (e.g., {'.mp4', '.mov'})
allowed_mime_types (Optional[Set[str]]) – Set of allowed MIME types
verify_mime_content (bool) – Use libmagic to verify MIME from content
validate_headers (bool) – Check file headers (magic bytes)
sanitize_filenames (bool) – Sanitize filenames for security
enable_all (bool) – Enable all validators (can be overridden per-validator)
- validators: List[FileSecurityValidator]
- validate(file)[source]
Run all validators on the uploaded file.
- Parameters:
file (UploadedFile) – The uploaded file to validate (UploadedFile for new uploads, FieldFile for existing files)
- Raises:
ValidationError – If any validator fails
- Return type: