Projects

Comic Creator App — Text‑to‑Comics

Python · Diffusion · LLM · Vision

An experimental pipeline that turns a written scenario into comic pages: an LLM divides the story into panels, Stable Diffusion generates the images, and a compositor assembles the pages with speech bubbles.

Figure: storyboard transforming into a comic page with speech bubbles.

Objectives

  • Let authors transform short narratives into illustrated comic pages without drawing skills.
  • Automate the story → panels → images → layout chain while keeping characters and style visually consistent.

Architecture

  • Modules: story_divider (LLM → structured panels JSON), prompt_generator, panel_cutter, page_fitter (layout).
  • Data models: Pydantic/dataclasses for the Project → Pages → Panels hierarchy, with text and image fields, serializable to JSON and DB‑friendly.
  • Orchestration: threaded/async workers and a queue to manage GPU workloads and parallel panel generation.
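
The data-model and orchestration bullets above can be illustrated with a minimal sketch. It assumes plain dataclasses rather than Pydantic, and every name in it (Panel, Page, Project, generate_panels, fake_render) is hypothetical rather than taken from the project's code. The render callable stands in for the actual Stable Diffusion call, and the worker count is a rough proxy for how many generations the GPU can handle at once.

```python
import json
import queue
import threading
from dataclasses import asdict, dataclass, field
from typing import Callable, List, Optional


@dataclass
class Panel:
    index: int
    text: str                          # narration or dialogue for this panel
    prompt: str = ""                   # image prompt, filled in by a prompt generator
    image_path: Optional[str] = None   # set once an image has been generated


@dataclass
class Page:
    number: int
    panels: List[Panel] = field(default_factory=list)


@dataclass
class Project:
    title: str
    pages: List[Page] = field(default_factory=list)

    def to_json(self) -> str:
        # asdict() recurses through nested dataclasses and lists.
        return json.dumps(asdict(self), ensure_ascii=False)

    @classmethod
    def from_json(cls, raw: str) -> "Project":
        data = json.loads(raw)
        pages = [
            Page(number=p["number"], panels=[Panel(**pn) for pn in p["panels"]])
            for p in data["pages"]
        ]
        return cls(title=data["title"], pages=pages)


def generate_panels(project: Project, render: Callable[[Panel], str], workers: int = 2) -> None:
    """Fill image_path on every panel using a fixed pool of worker threads."""
    jobs: "queue.Queue[Optional[Panel]]" = queue.Queue()

    def worker() -> None:
        while True:
            panel = jobs.get()
            if panel is None:          # sentinel: no more work for this worker
                break
            panel.image_path = render(panel)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for page in project.pages:
        for panel in page.panels:
            jobs.put(panel)
    for _ in threads:
        jobs.put(None)                 # one sentinel per worker
    for t in threads:
        t.join()


def fake_render(panel: Panel) -> str:
    # Stand-in for an image-generation call; returns a hypothetical file name.
    return f"panel_{panel.index:03d}.png"
```

Keeping the queue separate from the data classes means the same Project object can be serialized before and after generation, which is one way to make runs resumable.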

Generation & Post‑processing

  • Images via Diffusers (Stable Diffusion) with style templates; TorchVision/OpenCV for composition and typography.
  • Quality: GFPGAN/CodeFormer + facexlib/onnxruntime/insightface to restore faces and improve fidelity.
  • Consistency: character descriptors reused across prompts; optional face matching to enforce continuity.
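
As a rough illustration of reusing character descriptors and style templates across prompts, the sketch below builds each panel prompt from the action text, the fixed descriptor of every character named in it, and a shared style string. The template wording, the Character class, and build_prompt are assumptions made for this example; the project's actual prompt_generator may work differently.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical style templates; the wording is illustrative only.
STYLE_TEMPLATES: Dict[str, str] = {
    "manga": "black and white manga, screentone shading, clean line art",
    "cartoon": "flat colors, bold outlines, cartoon style",
    "realistic": "photorealistic, natural lighting, detailed textures",
}


@dataclass(frozen=True)
class Character:
    name: str
    descriptor: str  # fixed visual description, reused verbatim in every prompt


def build_prompt(action: str, cast: List[Character], style: str) -> str:
    """Combine the panel action, descriptors of named characters, and a style."""
    # Naive substring match on the name; a real system would need something
    # sturdier (e.g. the LLM listing which characters appear in each panel).
    present = [c for c in cast if c.name.lower() in action.lower()]
    parts = [action.strip()]
    parts += [f"{c.name}: {c.descriptor}" for c in present]
    parts.append(STYLE_TEMPLATES[style])
    return ", ".join(parts)
```

Because the descriptor string is identical in every prompt that mentions a character, the diffusion model receives the same visual cues from panel to panel, which helps consistency but does not guarantee it; that is presumably why the optional face matching step exists.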

Workflow Intelligence

  • Feedback loop: quality checks (vision/OCR) can trigger prompt refinements or add missing panels.
  • Optional dialogue generation to enrich panels when the source text is mostly narration with little dialogue.
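
The feedback loop described above could look roughly like the following retry loop. This is a sketch under stated assumptions: generate, check, and refine are hypothetical callables standing in for the diffusion call, the vision/OCR quality check, and the prompt refinement step, and the cap of three attempts is arbitrary.

```python
from typing import Any, Callable, List, Tuple


def generate_with_feedback(
    prompt: str,
    generate: Callable[[str], Any],
    check: Callable[[Any], List[str]],
    refine: Callable[[str, List[str]], str],
    max_attempts: int = 3,
) -> Tuple[Any, str, int]:
    """Regenerate an image until the check reports no issues or attempts run out.

    Returns (last image, prompt that produced it, number of attempts used).
    """
    image = None
    for attempt in range(1, max_attempts + 1):
        image = generate(prompt)
        issues = check(image)          # an empty list means the image passed
        if not issues:
            return image, prompt, attempt
        if attempt < max_attempts:
            prompt = refine(prompt, issues)
    # Out of attempts: return the last image even though it still has issues.
    return image, prompt, max_attempts
```

Bounding the number of attempts keeps GPU cost predictable; a caller can inspect the returned attempt count to flag panels that never passed the check.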

Results

  • Working prototype producing coherent pages with configurable styles (realistic/cartoon/manga).
  • Demonstrates an end‑to‑end creative AI pipeline from text structuring to page layout.
  • Sample: Generated comic (PDF)

Skills Demonstrated

Multimodal generation, structured prompting, GPU job orchestration, image restoration, and programmatic layout design.