
Multi-Model Routing + 3-Pass Generation
Classifier
Phi-3 Mini SLM
Query intent → retrieval strategy. Confidence gate ≥ 0.85. Gradual rollout via SHA-256 routing (0% → 100% traffic control).
Fallback
Phi-4-mini
If confidence < 0.85 or the classifier times out, the request falls back to Phi-4-mini-instruct via Azure OpenAI. A classification_source flag is tracked on each result.
Pass 1
Claude Haiku
Tool calls + agentic loop. Data synthesis. temperature=0 for format fidelity.
Pass 2
Claude Sonnet
Platform formatting (Facebook/Instagram/Threads/Blog). Applies Layer 2 voice/tone from Cosmos.
Pass 3
Claude Sonnet
Reels script generation. HOOK + SEGMENT_1 + CTA + NARRATION. Word count enforced.
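One way the Pass 3 word-count enforcement could look is sketched below. It assumes the script arrives as plain text with `LABEL:` markers for HOOK, SEGMENT_1, CTA and NARRATION. The marker syntax, the `MAX_WORDS` limits and both function names are illustrative assumptions, not details from the pipeline.

```python
import re

# Hypothetical per-section word limits; the real limits are not stated in the source.
MAX_WORDS = {"HOOK": 12, "SEGMENT_1": 40, "CTA": 15, "NARRATION": 120}

# Assumed marker format: a section label at the start of a line, followed by a colon.
SECTION_RE = re.compile(r"^(HOOK|SEGMENT_1|CTA|NARRATION):\s*", re.MULTILINE)


def split_sections(script: str) -> dict[str, str]:
    """Split a script into {label: body} using the assumed 'LABEL:' markers."""
    parts = SECTION_RE.split(script)
    # With one capture group, re.split yields [preamble, label, body, label, body, ...].
    return {label: body.strip() for label, body in zip(parts[1::2], parts[2::2])}


def word_count_violations(script: str) -> list[str]:
    """Return human-readable problems; an empty list means the script passes."""
    sections = split_sections(script)
    problems = []
    for label, limit in MAX_WORDS.items():
        if label not in sections:
            problems.append(f"{label}: missing")
            continue
        count = len(sections[label].split())
        if count > limit:
            problems.append(f"{label}: {count} words exceeds limit of {limit}")
    return problems
```

A non-empty result could be fed back to the model as a retry instruction or used to reject the draft; which of these the pipeline does is not stated here.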
Gradual SLM Rollout + Model Override Architecture
SHA-256 routing enables 0–100% traffic shift to SLM classification without code deploys. Fallback audit tracks slm_fallback_low_confidence and slm_fallback_error events. Model ID env-var overrides per pass allow hot-swapping models.
SHA-256 rollout (0% → 100%)
classification_source flag
slm_fallback audit events
CLAUDE_MODEL_ID env overrides
temperature=0 (format fidelity)
3-layer prompt isolation
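The rollout, fallback and override mechanics described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the pipeline's actual code: the routing key, the 100-bucket scheme, the `"slm"`/`"fallback"` flag values, the `CLAUDE_MODEL_ID_PASS_<n>` variable names and the placeholder default model IDs are all hypothetical. Only the 0.85 gate, the two audit event names and the `CLAUDE_MODEL_ID` prefix come from the text.

```python
import hashlib
import os

CONFIDENCE_GATE = 0.85  # threshold stated in the text

# Placeholder defaults per pass (hypothetical values, not real model IDs).
DEFAULT_MODEL_IDS = {1: "haiku-default", 2: "sonnet-default", 3: "sonnet-default"}


def rollout_bucket(routing_key: str) -> int:
    """Map a key to a stable bucket in [0, 100) using SHA-256."""
    digest = hashlib.sha256(routing_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def use_slm(routing_key: str, rollout_percent: int) -> bool:
    """True if this key falls inside the current rollout percentage."""
    return rollout_bucket(routing_key) < rollout_percent


def classify(query, routing_key, rollout_percent, slm, fallback):
    """Return (intent, classification_source, audit_event_or_None).

    `slm` is assumed to return (intent, confidence); `fallback` returns an intent.
    """
    if not use_slm(routing_key, rollout_percent):
        return fallback(query), "fallback", None
    try:
        intent, confidence = slm(query)
    except Exception:  # includes timeouts raised by the SLM client
        return fallback(query), "fallback", "slm_fallback_error"
    if confidence < CONFIDENCE_GATE:
        return fallback(query), "fallback", "slm_fallback_low_confidence"
    return intent, "slm", None


def model_id_for_pass(pass_number: int, env=os.environ) -> str:
    """Resolve the model ID for a pass, preferring an env-var override."""
    return env.get(f"CLAUDE_MODEL_ID_PASS_{pass_number}", DEFAULT_MODEL_IDS[pass_number])
```

Because the bucket depends only on the hash of the key, raising `rollout_percent` from 0 to 100 moves traffic to the SLM monotonically: a key that is routed to the SLM at 10% stays on it at 50%. Reading the percentage and model IDs from configuration at request time is what lets both change without a code deploy.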
4 AI models in pipeline
3 Generation passes
10 Pipeline types
5+ Publishing platforms
Stack
Python
Claude Haiku 4.5 (Pass 1)
Claude Sonnet 4.5 (Pass 2–3)
Phi-3 Mini (classifier)
Phi-4-mini (fallback)
AWS Bedrock
Azure OpenAI
LangChain
Cosmos DB prompts