Demo · BookForge — AI book (PDF) publishing platform

This document treats BookForge — a platform that turns a customer's brief into a publication-quality book as a PDF as a single Azure architecture case study, explaining which services it uses, how they fit together, and why it was designed this way, from the perspective of an engineer who wants to rebuild it. (As of 2026-06-17; regions: Container Apps = Korea Central, Azure OpenAI = Korea Central)

⚠️ Disclaimer: Books produced by this demo are AI-generated. Facts and figures are for reference only; verify against original sources before any important decision.

At a glance

BookForge turns "topic / format / content / references / length → an 8-stage quality pipeline → a typeset book PDF with cover, table of contents, body, and bibliography." This demo has exactly one success metric: the quality of the book's content. So it is designed to refine in stages rather than "write it all at once."

Not a single shot, but 8 stages: plan → grounding → outline → per-chapter writing → self-critique & revision → consistency → copyedit → typesetting. It mirrors how a human writes a book (design → research → draft → revise → proofread → print).
Grounding to prevent hallucination: it builds a corpus from uploaded files (PDF/txt/md), URLs, and web search, then extracts "verified facts" per chapter. Concrete numbers and quotes are allowed only from that evidence.
Self-critique & revision loop (the key quality stage): a "demanding book editor" persona finds the weaknesses in each chapter draft and rewrites it — more concrete, deeper, leaner — without inventing numbers not in the evidence.
Publication-grade typesetting: WeasyPrint's CSS Paged Media produces a gradient cover, copyright page, auto table of contents (with page numbers), serif body, and bibliography — a PDF that looks like a book, not just text.
gpt-5.4 called keyless (Managed Identity / AAD token) — no API keys stored in code, environment, or image.
Language: Korean-first + English. Length: user-specified (chapters/pages) or model-chosen optimal length if unspecified.
🎨 Storybook mode (separate): the same platform also produces illustrated children's books. It adapts vocabulary, sentence length, and page count to a child's age (1–12), lets you pick an art style from 8 previewed swatches (watercolor, colored pencil, crayon, cut-paper, soft 3D, flat vector, soft anime, claymation), and draws one story-matched illustration per page with gpt-image-1 (keyless). gpt-5.4 writes the text.

🎨 Storybook Mode (illustrated, per-page art)

Unlike adult "books," a storybook lives or dies on art quality + age-fit + character consistency, so it uses a dedicated pipeline.

4 stages: ① story design → ② per-page text → ③ per-page illustration → ④ storybook PDF (landscape A4).
Age-fit (age): maps the age into 4 buckets (0–3 / 4–6 / 7–9 / 10–12) to tune sentence length, vocabulary, page count, and theme difficulty (e.g. 1 sentence + refrain for toddlers, short paragraphs for ages 10–12).
Style picker (art_style): 8 styles previewed on the same subject (a fox cub) as swatches (web/styles/*.png, /api/styles); the chosen style's prompt fragment is applied to every page image.
Character consistency (key): the design stage pins the hero's appearance as an English art_anchor (species/color/ clothing/traits) and prepends it to every page's image prompt — since image models have no seed, the text anchor keeps the same character/palette across pages. Images carry no text (no text); the body copy is laid over in render.
Output: cover (title overlay) → title page → spreads (large art on top + large body text below) → ending (moral).

Architecture

  Internet ──HTTPS──▶  Azure Container Apps (FastAPI + static front, Korea Central)
                        bookforge-api  (api/server.py, ingress 8000, single replica)
                        - GET  /                     input form + live progress polling UI (web/index.html)
                        - POST /api/books            (multipart) start job → {job_id}, pipeline on a daemon thread
                        - GET  /api/books/{id}       poll status (stage / progress / per-chapter / logs)
                        - GET  /api/books/{id}/pdf   download finished PDF
                        - System-assigned identity (keyless)
                              │
            ┌─────────────────┼──────────────────────────────────┐
            ▼                 ▼                                    ▼
   app/grounding.py    app/pipeline.py (8-stage orchestration)  app/render.py
    files/URL/web        plan→ground→outline→write→critique       Markdown→HTML→PDF
    → corpus + per-       →consistency→copyedit→render             WeasyPrint (cover/TOC/
    chapter facts                │                                 body/bibliography, Nanum)
                                ▼
                          app/llm.py — gpt-5.4 (keyless AAD)
                          DefaultAzureCredential → AzureOpenAI
                          foundry-uzrz5ojtsjvae · gpt-5.4
                          (reasoning model: max_completion_tokens)
                              │
                              ▼
                          app/store.py — job state & artifacts
                          container-local (demo) / Blob+PE (production)

Services and roles

Service	Role	Key point
Azure Container Apps (FastAPI)	Web backend + static front `bookforge-api`	Runs the 8-stage pipeline on a daemon thread, status polling. Single replica (consistent create/poll/download). System-assigned identity (keyless)
Azure OpenAI / Foundry — gpt-5.4	Reasoning for every stage (plan/write/critique/consistency)	Keyless (AAD token). Reasoning model → uses `max_completion_tokens`
Azure OpenAI — gpt-image-1 (storybook)	Per-page illustration generation	Keyless (AAD). No image models in koreacentral, so deployed separately in eastus2 (`foundry-bookimg-mty`). Cover 1024×1536 (high) · pages 1024×1024 (medium)
Grounding (grounding.py)	Single source of truth for evidence	uploads/URL/web → corpus → per-chapter "verified facts". Numbers/quotes confined to evidence (hallucination prevention)
WeasyPrint (render.py)	Book typesetting (HTML→PDF)	CSS Paged Media: cover, copyright, auto TOC (page numbers), serif body, bibliography. Korean fonts (Nanum/Noto) installed in the container
Storage (store.py)	Job state & artifacts	Demo uses container-local (simple, governance-independent). Production scales to Blob + Private Endpoint
Azure Container Registry (Basic)	Image storage	`az acr build` builds an image with WeasyPrint runtime + Korean fonts

Why it is designed this way (key decisions)

Quality is the goal → multi-stage refinement, not a single shot. Asking an LLM to "write a good book in one go" yields average, flat prose. So we fix the direction (thesis) in planning, design a non-redundant, escalating outline, write each chapter, then have a separate editor persona critique and rewrite it. Splitting "writer" and "editor" roles — even on the same model — visibly raises density.
Grounding to prevent hallucination. A book's credibility collapses on one "specific but wrong number." So we build a corpus from references (files/URLs/web search) and extract only the facts each chapter needs, injecting them. The writing/revision prompts insist "concrete numbers/quotes only from this evidence." With no references, the model writes from its own knowledge but is told to beware of numeric hallucination.
Continuous writing via rolling summaries. Writing chapters independently creates duplication and contradictions. After each chapter we summarize it in three sentences and feed it as context to the next chapter's prompt, so the book reads as one continuous work.
Keyless (zero secrets). The subscription blocks Azure OpenAI key auth (disableLocalAuth=true). So we obtain an AAD token via DefaultAzureCredential and, in the deployed environment, grant only a Managed Identity the Cognitive Services OpenAI User role. There is no key to leak.
A storage choice that doesn't fight governance (demo). We first tried Blob (managed identity), but central governance immediately forced storage publicNetworkAccess to Disabled, so writes from ACA (Consumption, no VNet) failed with AuthorizationFailure. For the demo we simplified to container-local storage + single replica, working reliably regardless of governance. (For persistent, multi-replica production, see below.)
Investing in book-like typesetting. Great content fails to land if it looks like "notepad text." WeasyPrint produces a cover, an auto TOC (real page numbers via CSS target-counter), a serif body, and a bibliography, so it reads as a book from first glance. Body is serif (readability), headings sans-serif (hierarchy).

Pitfalls hit and fixed (so you don't repeat them)

Debian font package name: fonts-nanum-coding does not exist on Debian trixie → build failed; install only fonts-nanum + fonts-noto-cjk, and list Nanum/Noto in the font families so Korean doesn't break.
Korean in HTTP headers → latin-1 error (500): putting Korean directly in Content-Disposition filename / X-Book-Title breaks Starlette's latin-1 encoding → fixed with RFC 5987 filename*=UTF-8''<percent-encoded> + urllib.parse.quote.
Local vs container layout difference: local is api/ + sibling app/, the container is flat under /srv → server.py auto-detects app/ (here if present, else parent) by adjusting sys.path and WEB_DIR.
Beware silent failure: status saving swallowed the Blob error and polling looked fine from the in-memory cache, which masked the real artifact-write failure → removed by simplifying the storage layer (local).

Build it yourself (summary)

# Local
cd demos/bookforge
python3 -m venv .venv && source .venv/bin/activate
pip install -r api/requirements.txt        # WeasyPrint needs system pango (macOS: brew install pango)
az login
./run-local.sh                              # → http://localhost:8000

# Deploy to Azure (Container Apps, keyless)
az login && az account set --subscription <SUB_ID>
./deploy.sh                                 # ACR build → ACA create → grant AOAI role to the managed identity
# Cleanup: az group delete -n rg-bookforge --yes --no-wait

Production scaling: for persistence and multiple replicas, move artifact storage to Blob + Private Endpoint. MCAPS governance periodically locks public storage access, so you need the VNet-injected Container Apps + Private Endpoint pattern (see this portal's "governance-immune storage" demo).

Customer value (business view)

Quality: critique/revision loop + grounded evidence → a readable book, not "plausible text" — the core differentiator for a content business.
Security: keyless (managed identity) — no API keys in code/env/image. Entra ID-based least privilege.
Speed & cost: reuse the existing Foundry (gpt-5.4) + Container Apps for fast launch. Delete the RG after the demo to control cost.
Governance: regulated customers can harden with Blob + Private Endpoint, CMK, and Customer Lockbox.

← All demos Portal home