Philosophy & Design Principles
Why this project exists and the principles that guide its development.
Vision
Generative Creative Lab is a platform for exploring how generative AI can augment human creativity in advertising production. The focus is on TV spot adaptation — taking a single commercial and producing culturally appropriate variations for global markets, using AI to handle the mechanical parts of localization while keeping creative direction in human hands.
Principles
Creative AI Augments, Not Replaces
AI handles the repetitive, structured parts of the production workflow: generating image prompts from script descriptions, producing initial visual frames, and evaluating scripts against checklists. Creative direction, cultural judgment, and final approval remain with human operators.
Every automated step is reviewable in the admin interface before proceeding.
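The review-before-proceeding rule can be sketched as a small state gate: an automated step parks its output in a review state, and only a human approval unlocks the next step. The state names and the function are illustrative assumptions, not the project's actual workflow code.

```python
# Illustrative review gate: automated output must pass through human review
# before the pipeline continues. State names here are assumptions.
ALLOWED_TRANSITIONS = {
    "generated": {"pending_review"},              # automation hands off
    "pending_review": {"approved", "rejected"},   # human decision only
    "approved": {"next_step"},                    # pipeline may proceed
}

def transition(state: str, target: str) -> str:
    """Move a job to a new state, refusing any skip past human review."""
    if target not in ALLOWED_TRANSITIONS.get(state, set()):
        raise ValueError(f"cannot move {state} -> {target}")
    return target
```

The point of the sketch is that there is no edge from "generated" straight to "next_step": automation cannot bypass the admin review.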
Multi-Model by Design
No single model is best for everything. The architecture supports 7+ diffusion models with different strengths:
Turbo models (4–9 steps) for rapid iteration
Full models (28–50 steps) for production quality
Specialized models (Realistic Vision, Juggernaut XL) for specific visual styles
The same principle applies to LLMs: each pipeline node can use a different model optimized for its task.
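Per-node model selection can be sketched as a lookup with a fallback: each node resolves its own model from admin-managed overrides, defaulting when none is set. The class and function names, and the model identifiers, are illustrative assumptions rather than the project's API.

```python
# Hypothetical sketch of per-node model resolution; names are illustrative.
from dataclasses import dataclass

DEFAULT_MODEL = "default-llm"

@dataclass
class NodeModelConfig:
    node_name: str          # e.g. "cultural_eval"
    model_id: str           # model assigned to this node in admin
    temperature: float = 0.7

def resolve_model(node_name: str, overrides: dict[str, NodeModelConfig]) -> str:
    """Return the model configured for a pipeline node, else the default."""
    cfg = overrides.get(node_name)
    return cfg.model_id if cfg else DEFAULT_MODEL
```

A node doing cultural evaluation can thus run on a heavier model than a node doing mechanical prompt rewriting, with no code change.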
Cultural Sensitivity as a First-Class Concern
The adaptation pipeline treats cultural context as structured data, not an afterthought:
Regions, Countries, Languages form a hierarchical knowledge base with insights at each level
Audience segments and personas feed demographic, behavioral, and psychographic context into adaptation
Four evaluation gates (format, cultural, concept, brand) validate adapted scripts before acceptance
Brand guidelines are checked against adapted content to maintain voice and identity across markets
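The four gates named above (format, cultural, concept, brand) can be sketched as a chain of independent checks whose conjunction decides acceptance. The gate names come from this document; the code shape and the placeholder rules inside each gate are assumptions.

```python
# Sketch of the four evaluation gates; the real checks are LLM-backed,
# these lambdas are stand-ins to show the acceptance logic.
from typing import Callable, NamedTuple

class GateResult(NamedTuple):
    gate: str
    passed: bool

def evaluate_script(script: str,
                    gates: dict[str, Callable[[str], bool]]) -> tuple[bool, list[GateResult]]:
    """Run every gate; the adapted script is accepted only if all pass."""
    results = [GateResult(name, check(script)) for name, check in gates.items()]
    return all(r.passed for r in results), results

gates = {
    "format": lambda s: 1 <= len(s.splitlines()) <= 60,  # structural shape
    "cultural": lambda s: "FORBIDDEN" not in s,          # placeholder rule
    "concept": lambda s: "hero shot" in s.lower(),       # core idea survives
    "brand": lambda s: "AcmeCo" in s,                    # brand voice marker
}
```

Running all gates (rather than stopping at the first failure) keeps the full result list available for operator review in admin.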
Configuration Over Code
Wherever possible, changing behavior should not require changing code:
presets.json defines model configurations and settings flags
Prompt templates are database-backed Jinja2 with live editing, versioning, and cache invalidation
Reference data (regions, countries, languages, segments, brands) is managed via JSON import/export
Per-node model selection is configurable through admin without touching pipeline code
PipelineSettings controls defaults from a singleton admin page
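The configuration-over-code idea can be sketched as presets loaded from JSON at runtime, with explicit admin values layered on top. The schema shown is a hypothetical example, not the project's actual presets.json layout.

```python
# Sketch of data-driven model configuration; the JSON schema is an assumption.
import json

PRESETS_JSON = """
{
  "sdxl-turbo": {"steps": 6, "guidance_scale": 0.0},
  "sdxl-base":  {"steps": 40, "guidance_scale": 7.0}
}
"""

def load_presets(raw: str) -> dict:
    """Parse the preset file; in the project this would read presets.json."""
    return json.loads(raw)

def settings_for(model: str, presets: dict, **overrides) -> dict:
    """Preset values apply by default; explicit admin overrides win."""
    return {**presets[model], **overrides}
```

Tuning a model's step count then means editing a JSON value, not redeploying code.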
Template Method for Extensibility
The model architecture uses the Template Method Pattern to keep concrete implementations minimal:
BaseModel provides 90% of the logic (loading, generation, LoRA management, caching)
Concrete models override only what differs (typically 20–70 lines)
Mixins add optional behaviors (CLIP token handling, debug logging, prompt weighting)
Adding a new model requires one file with one method override
This eliminates code duplication while remaining easy to understand.
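A condensed sketch of the pattern: the base class owns the fixed skeleton, subclasses override only the hooks that differ. The real BaseModel also handles loading, LoRA management, and caching; the class and method names below are illustrative stand-ins.

```python
# Template Method sketch: fixed skeleton in the base, small overrides in
# concrete models. Names and return values are illustrative.
class BaseModel:
    model_id: str = ""

    def generate(self, prompt: str) -> str:
        """Template method: the skeleton never changes."""
        pipe = self._load()                      # shared: loading + caching
        cleaned = self._prepare_prompt(prompt)   # hook: override if needed
        return self._run(pipe, cleaned)          # hook: override if needed

    def _load(self) -> str:
        return f"pipeline({self.model_id})"

    def _prepare_prompt(self, prompt: str) -> str:
        return prompt.strip()

    def _run(self, pipe: str, prompt: str) -> str:
        return f"{pipe} -> {prompt}"

class TurboModel(BaseModel):
    """Concrete model: overrides only what differs (few-step sampling)."""
    model_id = "sdxl-turbo"

    def _run(self, pipe: str, prompt: str) -> str:
        return f"{pipe} -> {prompt} (steps=6)"
```

A new model is exactly one such subclass: set an identifier, override one hook.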
Sequential Execution for GPU Safety
The Celery worker uses a solo pool (single-threaded) to prevent concurrent model loading. This is a deliberate architectural choice:
Only one model occupies VRAM at a time
Models are warm-cached between tasks for efficiency
VRAM is explicitly freed when switching between model types (enhancer → diffusion → pipeline LLM)
This trades throughput for reliability: a single worker handles all tasks sequentially
Database as Source of Truth
All operational data lives in the database, not the filesystem:
Prompt templates are stored and versioned in the database (no filesystem fallback)
Job status, metadata, and results are tracked in Django ORM
Configuration is synced from JSON files to database via management commands
The data/ directory is the bootstrap source; the database is the runtime authority
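The sync direction can be sketched as an upsert: JSON records seed or update database rows, while fields edited at runtime and absent from the JSON survive a re-sync. The in-memory dict stands in for Django models; the real project does this through management commands and the ORM.

```python
# Sketch of JSON -> database sync: files bootstrap, the database stays
# authoritative. The dict "db" is a stand-in for ORM-backed rows.
import json

def sync_reference_data(raw_json: str, db: dict) -> dict:
    """Upsert records keyed by 'code'; only fields present in the
    incoming JSON overwrite what is already in the database."""
    for rec in json.loads(raw_json):
        existing = db.get(rec["code"], {})
        db[rec["code"]] = {**existing, **rec}
    return db
```

After the initial import, admin edits happen against the database, and re-running the sync does not wipe fields the JSON never carried.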
Technical Foundations
The project is built on:
Django 5 + Unfold — Admin-first UI, no custom frontend needed
Celery + Valkey — Async task processing with solo pool
HuggingFace Diffusers — Diffusion model abstraction
LangGraph — Multi-agent pipeline orchestration
Outlines — Structured LLM generation (Pydantic schema constraints)
Grafana + Loki — Log aggregation and search
Sphinx + Mermaid — Documentation with diagrams