Comic Creator App — Text‑to‑Comics

Python · Diffusion · LLM · Vision

Experimental pipeline that turns a written scenario into comic pages: an LLM divides the story into panels, Stable Diffusion generates the images, and a compositor assembles the pages and adds speech bubbles.

[Figure: storyboard transforming into a comic page with speech bubbles]

Objectives

Let authors transform short narratives into illustrated comic pages without drawing skills.
Automate the story → panels → images → layout pipeline while keeping characters and style visually consistent.

Architecture

Modules: story_divider (LLM → structured panels JSON), prompt_generator, panel_cutter, page_fitter (layout).
Data models: Pydantic/dataclasses for Project → Pages → Panels with text/image, serializable to JSON and DB‑friendly.
Orchestration: threaded/async workers and a queue to manage GPU workloads and parallel panel generation.
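
As a rough illustration of the data models and queue-based orchestration described above, the sketch below uses plain dataclasses and `asyncio`. Everything in it is an assumption made for the example, not the project's actual code: the field names, the `{"panels": [...]}` reply schema, the helper names `panels_from_llm` and `render_panels`, and the `fake_generate` stub that stands in for the diffusion call.

```python
import asyncio
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class Panel:
    description: str                                   # what the image should show
    dialogue: list[str] = field(default_factory=list)  # speech-bubble lines
    image_path: Optional[str] = None                   # set once the image exists


@dataclass
class Page:
    panels: list[Panel] = field(default_factory=list)


@dataclass
class Project:
    title: str
    pages: list[Page] = field(default_factory=list)

    def to_json(self) -> str:
        # asdict() recurses through nested dataclasses and lists.
        return json.dumps(asdict(self), indent=2)


def panels_from_llm(raw: str) -> list[Panel]:
    """Parse an LLM reply; the {"panels": [...]} schema is assumed."""
    data = json.loads(raw)
    return [
        Panel(description=p["description"], dialogue=p.get("dialogue", []))
        for p in data["panels"]
    ]


async def fake_generate(panel: Panel, index: int) -> str:
    """Stand-in for the diffusion call; a real one would use the GPU."""
    await asyncio.sleep(0.01)
    return f"panel_{index:03d}.png"


async def render_panels(panels: list[Panel], gpu_slots: int = 1) -> None:
    """Drain a queue of panels with a fixed number of concurrent workers."""
    queue: asyncio.Queue = asyncio.Queue()
    for item in enumerate(panels):
        queue.put_nowait(item)

    async def worker() -> None:
        while True:
            try:
                index, panel = queue.get_nowait()
            except asyncio.QueueEmpty:
                return  # nothing left to render
            panel.image_path = await fake_generate(panel, index)

    await asyncio.gather(*(worker() for _ in range(gpu_slots)))


reply = '{"panels": [{"description": "A fox enters the city"}, {"description": "The fox finds a bakery", "dialogue": ["Fresh bread!"]}]}'
panels = panels_from_llm(reply)
asyncio.run(render_panels(panels, gpu_slots=2))
print(Project(title="Fox in the City", pages=[Page(panels=panels)]).to_json())
```

Capping the number of workers at `gpu_slots` is one simple way to keep several generations from competing for GPU memory; a threaded variant with `queue.Queue` would follow the same shape.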

Generation & Post‑processing

Images via Diffusers (Stable Diffusion) with style templates; TorchVision/OpenCV for composition and typography.
Quality: GFPGAN/CodeFormer + facexlib/onnxruntime/insightface to restore faces and improve fidelity.
Consistency: character descriptors reused across prompts; optional face matching to enforce continuity.
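
One possible way to reuse character descriptors across prompts, combined with style templates, is sketched below. The template wording, the `Character` class and the `build_prompt` helper are hypothetical and only illustrate the idea; the resulting string is the kind of text that would be passed as the prompt to a Stable Diffusion pipeline.

```python
from dataclasses import dataclass

# Hypothetical style templates; the wording is illustrative only.
STYLE_TEMPLATES = {
    "manga": "{scene}, black and white manga, screentone shading, clean line art",
    "cartoon": "{scene}, flat colors, bold outlines, cartoon style",
    "realistic": "{scene}, photorealistic, natural lighting, detailed textures",
}


@dataclass(frozen=True)
class Character:
    name: str
    descriptor: str  # fixed visual description reused in every prompt


def build_prompt(scene: str, characters: list[Character], style: str) -> str:
    """Expand each character name to 'name (descriptor)', then apply the style."""
    for c in characters:
        scene = scene.replace(c.name, f"{c.name} ({c.descriptor})")
    return STYLE_TEMPLATES[style].format(scene=scene)


mira = Character("Mira", "a young woman with short red hair and a green raincoat")
print(build_prompt("Mira runs through the rain", [mira], "manga"))
print(build_prompt("Mira orders a coffee", [mira], "manga"))
```

Because the same descriptor string is injected into every panel prompt, the model receives identical wording for a character each time. That nudges the output toward a consistent look but does not guarantee it, which is where the optional face matching would come in.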

Workflow Intelligence

Feedback loop: quality checks (vision/OCR) can trigger prompt refinements or add missing panels.
Optional dialogue generation to enrich panels when the source text is mostly narration.
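
The feedback loop could look roughly like the sketch below, assuming the quality check returns either nothing (the image passes) or a short hint to append to the prompt. The function name `generate_with_feedback`, its parameters and the stub callables are hypothetical; a real check might run a vision model or OCR on the image.

```python
from typing import Callable, Optional


def generate_with_feedback(
    prompt: str,
    generate: Callable[[str], object],
    check: Callable[[object], Optional[str]],
    max_attempts: int = 3,
) -> tuple[object, str, int]:
    """Regenerate until `check` passes or the attempt budget is used up.

    `check` returns None when the image is acceptable, otherwise a short
    hint that is appended to the prompt for the next attempt.
    Returns (image, prompt that produced it, number of attempts).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        image = generate(prompt)
        hint = check(image)
        if hint is None or attempt == max_attempts:
            return image, prompt, attempt
        prompt = f"{prompt}, {hint}"  # refine and try again
    raise AssertionError("unreachable")


# Stubs for illustration: the "image" is just a dict recording its prompt.
def fake_generate(prompt: str) -> dict:
    return {"prompt": prompt}


def fake_check(image: dict) -> Optional[str]:
    if "two characters" in image["prompt"]:
        return None
    return "two characters visible"


image, final_prompt, attempts = generate_with_feedback(
    "a cafe scene", fake_generate, fake_check
)
print(attempts, final_prompt)
```

Bounding the loop with `max_attempts` keeps a stubborn panel from consuming GPU time indefinitely; the last image is returned even if it still fails the check, so the caller can decide whether to accept it or flag it.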

Results

Working prototype producing coherent pages with configurable styles (realistic/cartoon/manga).
Demonstrates an end‑to‑end creative AI pipeline from text structuring to page layout.
Sample: Generated comic (PDF)

Skills Demonstrated

Multimodal generation, structured prompting, GPU job orchestration, image restoration, and programmatic layout design.