Objectives
- Let authors transform short narratives into illustrated comic pages without drawing skills.
- Automate story → panels → images → layout, while preserving visual consistency of characters/style.
Architecture
- Modules: story_divider (LLM → structured panels JSON), prompt_generator, panel_cutter, page_fitter (layout).
- Data models: Pydantic/dataclasses for Project → Pages → Panels with text/image, serializable to JSON and DB‑friendly.
- Orchestration: threaded/async workers and a queue to manage GPU workloads and parallel panel generation.
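As a rough illustration of the data models and the queue-based orchestration described above, the following sketch uses standard-library dataclasses and threads. The field names (text, prompt, image_path), the render_all helper, and the render callable are assumptions made for this example, not the project's actual code; in the real pipeline the render step would call the image model.

```python
import json
import queue
import threading
from dataclasses import asdict, dataclass, field
from typing import Callable, List, Optional


@dataclass
class Panel:
    text: str                         # caption or dialogue for the panel
    prompt: str = ""                  # image prompt (hypothetical field)
    image_path: Optional[str] = None  # set once the panel has been rendered


@dataclass
class Page:
    panels: List[Panel] = field(default_factory=list)


@dataclass
class Project:
    title: str
    pages: List[Page] = field(default_factory=list)

    def to_json(self) -> str:
        # asdict() recurses into nested dataclasses and lists.
        return json.dumps(asdict(self), ensure_ascii=False)

    @classmethod
    def from_json(cls, raw: str) -> "Project":
        data = json.loads(raw)
        pages = [Page([Panel(**p) for p in pg["panels"]]) for pg in data["pages"]]
        return cls(title=data["title"], pages=pages)


def render_all(project: Project, render: Callable[[Panel], str], workers: int = 2) -> None:
    """Fill image_path for every panel using a fixed pool of worker threads."""
    jobs: "queue.Queue[Optional[Panel]]" = queue.Queue()

    def worker() -> None:
        while True:
            panel = jobs.get()
            if panel is None:  # sentinel: no more work for this worker
                break
            panel.image_path = render(panel)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for page in project.pages:
        for panel in page.panels:
            jobs.put(panel)
    for _ in threads:
        jobs.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
```

Keeping the worker count small is one simple way to bound concurrent GPU use; a single-GPU setup would likely run with workers=1 for the render step and parallelize only CPU-side work.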
Generation & Post‑processing
- Images via Diffusers (Stable Diffusion) with style templates; TorchVision/OpenCV for composition and typography.
- Quality: face restoration with GFPGAN/CodeFormer, supported by facexlib, onnxruntime, and insightface, to improve fidelity.
- Consistency: character descriptors reused across prompts; optional face matching to enforce continuity.
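The descriptor reuse mentioned above can be sketched as a small prompt builder. This is a minimal example under assumptions: the Character class, the build_prompt function, and the wording of the style templates are hypothetical; only the style names (realistic/cartoon/manga) come from this document.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Character:
    name: str
    descriptor: str  # fixed visual description, reused verbatim in every prompt


# Hypothetical template text; the real templates are not given in the source.
STYLE_TEMPLATES: Dict[str, str] = {
    "manga": "black and white manga, screentone shading, clean line art",
    "cartoon": "flat colors, bold outlines, cartoon style",
    "realistic": "photorealistic, natural lighting",
}


def build_prompt(scene: str, cast: Iterable[Character], style: str) -> str:
    """Combine the scene, each character's fixed descriptor, and the style."""
    parts = [scene.strip()]
    parts += [f"{c.name}: {c.descriptor}" for c in cast]
    parts.append(STYLE_TEMPLATES[style])
    return ", ".join(parts)


fox = Character("Fox", "small red fox with a green scarf")
print(build_prompt("Fox opens a door", [fox], "manga"))
print(build_prompt("Fox runs through the forest", [fox], "manga"))
```

Because the descriptor string is identical in every panel prompt, the image model receives the same character cues each time; the optional face matching would then act as a check on the output rather than on the prompt.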
Workflow Intelligence
- Feedback loop: quality checks (vision/OCR) can trigger prompt refinements or add missing panels.
- Optional dialogue generation to enrich panels when the source text is mostly narration with little direct speech.
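The feedback loop above could look roughly like the following retry routine. It is a sketch, not the project's implementation: generate and check are hypothetical callables standing in for the image generator and the vision/OCR quality check, and appending the checker's hint to the prompt is just one assumed refinement strategy.

```python
from typing import Callable, Tuple


def generate_with_feedback(
    prompt: str,
    generate: Callable[[str], str],
    check: Callable[[str], Tuple[bool, str]],
    max_attempts: int = 3,
) -> Tuple[str, str, int]:
    """Regenerate a panel until the check passes or attempts run out.

    Returns (image, final_prompt, attempts_used).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        image = generate(prompt)
        ok, hint = check(image)
        if ok or attempt == max_attempts:
            return image, prompt, attempt
        prompt = f"{prompt}, {hint}"  # refine the prompt with the checker's hint
    raise AssertionError("unreachable")
```

Capping the number of attempts keeps a panel that never passes the check from occupying the GPU queue indefinitely; the last image is returned either way so the caller can decide whether to keep it or flag it.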
Results
- Working prototype producing coherent pages with configurable styles (realistic/cartoon/manga).
- Demonstrates an end‑to‑end creative AI pipeline from text structuring to page layout.
- Sample: Generated comic (PDF)
Skills Demonstrated
Multimodal generation, structured prompting, GPU job orchestration, image restoration, and programmatic layout design.