Five expert agents debate over two rounds → a chair synthesizes a vote + consensus
This document treats the AI Investment Committee — a demo where multiple AI agents debate to analyze a stock — as one Azure architecture case study, explaining which services were used, how, and why, from the perspective of an engineer who wants to rebuild it. (As of 2026-06-15; regions: SWA = East Asia, Azure OpenAI = Korea Central)
⚠️ Disclaimer: All output is "a simulated opinion from AI agents, for demo purposes only — not investment advice." It is not a solicitation or guarantee of any return; data is an as-of cache.
The AI Investment Committee shows "type a stock/ETF → five expert agents debate over two rounds → a chair synthesizes a verdict (vote + consensus %)." The key themes are multi-agent debate · function-calling-based hallucination prevention · keyless security · real data (quotes + qualitative) with source honesty.
[Now — live input demo: type any ticker, get an instant analysis]
Internet ──HTTPS──▶ Azure Static Web Apps (Free, East Asia)
ai-invest-committee-web
- single index.html (input box + example chips + KO/EN toggle)
- on submit → calls backend /api/analyze → renders debate timeline
│ fetch(JSON)
▼
Azure Container Apps (FastAPI, Korea Central)
aic-committee-api (api/server.py, ingress 8000)
- GET /api/analyze?ticker=&lang= → live snapshot + debate
- minReplicas 0 (scale-to-zero), system-assigned managed identity
                      │
         ┌────────────┴─────────────────────────────┐
         ▼                                          ▼
app/live_data.py (live quotes)              app/llm.py (gpt-5.4 debate)
- KR: Naver Finance (unofficial)            DefaultAzureCredential (keyless AAD)
- US: Yahoo chart (no key) + AlphaVantage   foundry-uzrz5ojtsjvae · gpt-5.4
- macro: FRED · Bank of Korea ECOS          5 roles × (R1 → R2 rebuttal → Chair)
         │ returns deterministic numbers            │ function calling
         └──────────────▶ app/tools.py ◀────────────┘
         (single source of truth = hallucination-prevention point)
[Target — Azure-native (next roadmap, some preview)]
Static Web Apps ──▶ Foundry Connected Agents (orchestration)
├─ 5 agents = Foundry prompt/tool agents
├─ search_filings → Azure AI Search (agentic retrieval, real RAG)
└─ response grounding check → Content Safety Groundedness (preview)
Auth: Managed Identity everywhere (keyless)
| Service | Role | Why it matters |
|---|---|---|
| Azure Static Web Apps (Free) | Public frontend hosting | Single input page; on submit calls backend /api/analyze → $0 to run |
| Azure Container Apps (FastAPI) | Live analysis backend aic-committee-api | Collects live quotes + runs the gpt-5.4 debate on demand. minReplicas 0 (scale-to-zero) for ~$0 idle. System-assigned managed identity (keyless) |
| Azure OpenAI / Foundry — gpt-5.4 | Reasoning for 5 role agents | Keyless (AAD token). A reasoning model → use max_completion_tokens, give a generous budget |
| Function calling (tools.py) | Single source of truth for numbers/filings | Agents fetch directly → no invented numbers (the essence of hallucination prevention) |
| Live data (live_data.py) | Real-time quotes/metrics | KR=Naver (unofficial) · US=Yahoo (no key)+AlphaVantage · macro=FRED·ECOS — all free |
| (Target) Azure AI Search | Filing RAG (agentic retrieval) | search_filings is simulated by snapshot filtering today → swap to a real index later |
| (Target) Content Safety Groundedness | Grounding check (preview) | Auto-verify that agent answers are actually grounded in tool-provided evidence |
Multi-agent debate, not a single LLM. Value, technical, and risk are conflicting lenses. Rather than asking one model to "answer in a balanced way," separating roles and letting them debate/rebut reduces blind spots and sharpens evidence. The chair separates agreements from disagreements, so the conclusion also conveys its uncertainty (consensus %).
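As a rough illustration of the chair step, the sketch below tallies one vote per role and reports the share of agents behind the majority as the consensus %. The `Opinion` type, the vote labels, the role names beyond value/technical/risk, and the consensus formula are all assumptions made for this example; the project's actual chair logic may differ.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Opinion:
    role: str  # e.g. "value", "technical", "risk"
    vote: str  # hypothetical labels: "buy", "hold", "sell"


def synthesize(opinions: list[Opinion]) -> dict:
    """Chair step: majority vote plus the share of agents that agree with it."""
    if not opinions:
        raise ValueError("need at least one opinion")
    tally = Counter(o.vote for o in opinions)
    verdict, count = tally.most_common(1)[0]
    return {
        "verdict": verdict,
        # Assumed definition: percentage of agents voting with the majority.
        "consensus_pct": round(100 * count / len(opinions)),
        "agree": [o.role for o in opinions if o.vote == verdict],
        "dissent": [o.role for o in opinions if o.vote != verdict],
    }
```

Keeping the dissenting roles in the result is what lets the verdict carry its own uncertainty rather than a bare label.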
Pin numbers with function calls to prevent hallucination. LLMs are good at inventing plausible
numbers, so no figure (P/E, moving averages, rates) is pre-baked into the prompt; the agent
calls a get_* tool whenever it needs one. Evidence chips such as [get_fundamentals] next to each number
show that the value came from a tool. 👉 Live proof: when a Samsung filing section isn't found, the agent
answers "data unavailable" instead of fabricating.
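A minimal sketch of this tool-dispatch idea follows, assuming the model's tool call arrives as a name plus a JSON argument string. Only `get_fundamentals` is named in the text; `SNAPSHOT`, `TOOLS`, `dispatch`, the `DEMO` ticker, and its numbers are placeholders invented for illustration.

```python
import json

# Placeholder data for illustration only; real values would come from live sources.
SNAPSHOT = {"DEMO": {"fundamentals": {"pe": 10.0}}}


def get_fundamentals(ticker: str) -> dict:
    """Return stored fundamentals, or an explicit gap marker if none exist."""
    data = SNAPSHOT.get(ticker, {}).get("fundamentals")
    if data is None:
        # The agent sees this status and reports the gap instead of guessing.
        return {"tool": "get_fundamentals", "status": "data unavailable"}
    return {"tool": "get_fundamentals", "status": "ok", **data}


# Registry of callable tools (hypothetical name).
TOOLS = {"get_fundamentals": get_fundamentals}


def dispatch(name: str, arguments_json: str) -> str:
    """Run the tool the model asked for and return JSON for the tool message."""
    fn = TOOLS.get(name)
    if fn is None:
        return json.dumps({"tool": name, "status": "unknown tool"})
    return json.dumps(fn(**json.loads(arguments_json)))
```

Because every number reaches the model through `dispatch`, the tool name attached to each result is also what an evidence chip could display.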
Keyless (zero secrets). The subscription disables key auth for Azure OpenAI
(disableLocalAuth=true). So we obtain an AAD token via DefaultAzureCredential, and in deployment
use Managed Identity + the Cognitive Services OpenAI User role. There is simply no key to leak.
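The keyless setup might look like the sketch below, which uses `DefaultAzureCredential` and `get_bearer_token_provider` from `azure-identity` together with the `AzureOpenAI` client from the `openai` package. The helper name `make_client` and the choice to read the API version from `OPENAI_API_VERSION` are assumptions for this example, not details taken from the project.

```python
import os

# Token scope used for Azure OpenAI when authenticating with an AAD token.
SCOPE = "https://cognitiveservices.azure.com/.default"


def make_client():
    """Build an Azure OpenAI client that authenticates with an AAD token, not a key."""
    # Imported lazily so the module still loads where the Azure SDKs are absent.
    from azure.identity import DefaultAzureCredential, get_bearer_token_provider
    from openai import AzureOpenAI

    # Locally this resolves to a developer login (e.g. az login); in Container Apps
    # it resolves to the managed identity, which needs the
    # "Cognitive Services OpenAI User" role on the resource.
    token_provider = get_bearer_token_provider(DefaultAzureCredential(), SCOPE)
    return AzureOpenAI(
        azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
        azure_ad_token_provider=token_provider,
        api_version=os.environ["OPENAI_API_VERSION"],  # assumed to be set by the deployer
    )
```

No `api_key` argument appears anywhere, which matches a resource where `disableLocalAuth=true` rejects key-based calls.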
Real data + source honesty. Quotes and metrics for the typed ticker are collected on the fly; fields the
free sources can't fill (e.g., some KR filing risk factors, or US valuation when the AlphaVantage limit is hit)
are shown as "no data" and the agents never fabricate them. Per-field source labels reveal where each value came from.
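One way to model per-field source labels is sketched below: each field carries its value together with the source that supplied it, and a missing value renders as "no data" rather than a guess. The `Field` type and `first_available` helper are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Field:
    value: Optional[float]
    source: str  # e.g. "Yahoo", "AlphaVantage", or "none" when nothing answered

    def render(self) -> str:
        """Show the value with its source label, or an explicit gap."""
        if self.value is None:
            return "no data"
        return f"{self.value} [{self.source}]"


def first_available(candidates: list[tuple[str, Optional[float]]]) -> Field:
    """Take the first source that returned a value; otherwise record the gap."""
    for source, value in candidates:
        if value is not None:
            return Field(value, source)
    return Field(None, "none")
```

For example, if a rate-limited source returns nothing and a second source answers, the field is labeled with the second source; if neither answers, the gap stays visible to both the user and the agents.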
Live, on-demand analysis (scale-to-zero). Calling the LLM from the browser would expose keys and complicate cost/CORS control. Instead the backend (ACA + FastAPI) collects quotes and runs the gpt-5.4 debate, with minReplicas 0 so idle cost is ~zero (only the first request pays a ~12s cold start). Users analyze any ticker live.
MOCK fallback. With AZURE_OPENAI_ENDPOINT unset, a rule-based MOCK renders the same screen — useful
offline, in CI, or for teaching — and since the output schema is identical, the frontend changes nothing.
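The fallback switch could be as small as the sketch below, assuming the presence of `AZURE_OPENAI_ENDPOINT` is the only signal. The function names, the result keys, and the fixed "hold" rule are placeholders for illustration; the real MOCK rules and output schema are not shown in this document.

```python
import os


def mock_debate(ticker: str) -> dict:
    """Rule-based stand-in. The fixed verdict is a placeholder, not the project's rule."""
    return {"ticker": ticker, "mode": "mock", "verdict": "hold", "consensus_pct": 100}


def live_debate(ticker: str) -> dict:
    """Placeholder for the real gpt-5.4 debate path."""
    raise NotImplementedError("would run the gpt-5.4 debate here")


def run_debate(ticker: str) -> dict:
    """Use the live path only when an endpoint is configured; otherwise fall back."""
    if os.environ.get("AZURE_OPENAI_ENDPOINT"):
        return live_debate(ticker)
    return mock_debate(ticker)
```

As long as both paths return the same keys, the frontend can render either result without knowing which one produced it.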