A natural-language Q&A interface that routes through the same Azure Function, SLM classifier, and tool library used by the scheduled content pipelines. Ask "What should I rope-drop tomorrow?" and Claude Sonnet pulls live wait patterns, LL sellout history, and the weather forecast from the same PostgreSQL RAG and real-time tools that generate the daily bulletin, then answers in plain conversational text instead of a multi-platform social post.
The agentChat Azure Function (POST /api/agentChat) uses the same SLM classifier, the same 15+ live tools, and the same PostgreSQL RAG as the scheduled content pipelines. What changes is the system prompt and the output format: pipelines use PIPELINE_SYSTEM_PROMPT and get section-marked blog/social/Reels output; the chat agent uses CHAT_SYSTEM_PROMPT and gets plain conversational prose. The tool lookup, the pre-fetch layer config, and the agentic loop are identical code paths.
The only thing separating a scheduled pipeline response from a chat response is the system prompt. Same model, same tools, same data. The system prompt changes what Claude thinks its job is.
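A minimal sketch of that split, assuming a single request builder shared by both callers. The `AgentRequest` type, the `build_agent_request` helper, and the placeholder prompt strings are hypothetical; only the two prompt constant names come from the text above.

```python
from dataclasses import dataclass

# Placeholder bodies: the real prompt text is not shown in this write-up.
PIPELINE_SYSTEM_PROMPT = "<pipeline prompt: section-marked blog/social/Reels output>"
CHAT_SYSTEM_PROMPT = "<chat prompt: plain conversational prose>"


@dataclass(frozen=True)
class AgentRequest:
    """Hypothetical container for what the agentic loop receives."""
    system_prompt: str
    pipeline_type: str
    question: str


def build_agent_request(question: str, pipeline_type: str, chat: bool) -> AgentRequest:
    # Everything except the system prompt is built the same way for both callers.
    prompt = CHAT_SYSTEM_PROMPT if chat else PIPELINE_SYSTEM_PROMPT
    return AgentRequest(system_prompt=prompt, pipeline_type=pipeline_type, question=question)
```

Under this assumption, a scheduled run and a chat turn for the same pipeline type differ in exactly one field.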
_pipeline_from_question() keyword-matches the incoming question to the most relevant pipeline type. That pipeline type determines which tools are available and which knowledge_types filter the RAG query, using the same config Cosmos stores for the scheduled pipeline.
| Question Keywords | Routed Pipeline | Tools Activated |
|---|---|---|
| weather, rain, storm, forecast, heat, indoor | weather_midday | WEATHER_TOOLS: get_weather_forecast, get_weather_intelligence, get_rides_at_risk, get_ride_sensitivity… |
| rope drop, ride first, first ride, opening, early entry | rope_drop_strategy | ROPE_DROP_TOOLS: currentWaits, lightningLane, parkSchedule, purchaseOptions, entityContext… |
| lightning lane, ll, multipass, sellout, sold out, return time | ll_intelligence_report | LL-focused subset: lightningLane, selloutEvents, topSellouts, releaseEvents… |
| top ten, ranking, historical, trend, average, best time, how often | daily_trend_report | TREND_TOOLS: waitSnapshots, weeklyAvgWait, monthlyAvgWait, dailyWaitSummary, query_intelligence… |
| anomaly, predict, correlation, why | ai_deep_insights | INSIGHTS_TOOLS: heavier query_intelligence queries + structured weekly aggregates |
| wait, line, down, closed, status, reliable (default) | operations_bulletin | BULLETIN_TOOLS: currentWaits, currentDown, lightningLane, shows, advisories, entityContext… |
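The routing in the table can be sketched as an ordered first-match lookup. This is an illustration under stated assumptions, not the project's actual `_pipeline_from_question()`: the match order, the hyphen normalization (so "rope-drop" matches "rope drop"), and the omission of the bare "ll" keyword (which would match inside words like "will" under plain substring matching) are all choices made for this example.

```python
# Ordered (keywords, pipeline) pairs; the first pair with a matching keyword wins.
# The ordering is an assumption; the source only lists the keyword groups.
ROUTES = [
    (("weather", "rain", "storm", "forecast", "heat", "indoor"), "weather_midday"),
    (("rope drop", "ride first", "first ride", "opening", "early entry"), "rope_drop_strategy"),
    # Bare "ll" is left out here because substring matching would hit "will", "tell", etc.
    (("lightning lane", "multipass", "sellout", "sold out", "return time"), "ll_intelligence_report"),
    (("top ten", "ranking", "historical", "trend", "average", "best time", "how often"), "daily_trend_report"),
    (("anomaly", "predict", "correlation", "why"), "ai_deep_insights"),
]
DEFAULT_PIPELINE = "operations_bulletin"


def pipeline_from_question(question: str) -> str:
    """Return the pipeline type whose keyword group first matches the question."""
    # Lowercase and treat hyphens as spaces (an assumption for this sketch).
    text = question.lower().replace("-", " ")
    for keywords, pipeline in ROUTES:
        if any(keyword in text for keyword in keywords):
            return pipeline
    return DEFAULT_PIPELINE
```

Because several groups share plausible words, a first-match design makes the row order part of the routing behavior, so it is worth keeping that order explicit in one place.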
The knowledge retrieval, SLM classifier, and tool definitions already existed for the scheduled pipelines. A separate function would mean duplicating the RAG query logic, the tool wrappers, the SLM client with its confidence gate and fallback path, and the Bedrock LangChain setup — approximately 800 lines of tested code that would then need to stay in sync whenever the pipeline infrastructure changed.
Using the same function also means the chat agent benefits automatically from improvements to the pipeline layer — when a new tool is added to ROPE_DROP_TOOLS, the rope-drop question routing in chat gets it for free. When the Phi-3 Mini confidence threshold is tuned for better classification, both pipelines and chat benefit simultaneously.
_needs_live_tools() checks the question for historical keywords ("last week", "average", "trend", "how often", "best time", "total downtime") and short-circuits to RAG-only mode. This isn't a cost optimization — it's accuracy. If someone asks "what's the average wait for Space Mountain on weekday mornings?" and the agent calls get_current_wait_times, it gets today's snapshot and might report a number that's uncharacteristically high or low. The PostgreSQL knowledge base holds precomputed day-of-week/hour aggregates specifically for this question type; they're more accurate for the question being asked.
Live tools are activated only when the question has explicit live-data indicators: "current", "right now", "today", "status", "available", "weather". For operations_bulletin and weather pipeline routes, any of these triggers the live tool set. For trend/insights routes, even those keywords fall back to RAG because aggregated patterns are more reliable than a single current reading.
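The two rules above (historical keywords short-circuit to RAG, live indicators enable tools, trend/insights routes always stay on RAG) can be sketched as one predicate. The keyword lists come from the text; the function signature, the set name `RAG_ONLY_PIPELINES`, and the handling of routes the text does not mention (rope-drop and LL) are assumptions.

```python
HISTORICAL_KEYWORDS = ("last week", "average", "trend", "how often", "best time", "total downtime")
LIVE_KEYWORDS = ("current", "right now", "today", "status", "available", "weather")

# Routes that stay on RAG even when a live indicator is present (per the text above).
RAG_ONLY_PIPELINES = {"daily_trend_report", "ai_deep_insights"}


def needs_live_tools(question: str, pipeline_type: str) -> bool:
    """Return True only when the question should trigger live tool calls."""
    text = question.lower()
    if pipeline_type in RAG_ONLY_PIPELINES:
        return False  # aggregated patterns beat a single current reading
    if any(keyword in text for keyword in HISTORICAL_KEYWORDS):
        return False  # historical question: answer from precomputed aggregates
    # Assumption: all remaining routes follow the same live-indicator rule.
    return any(keyword in text for keyword in LIVE_KEYWORDS)
```

Checking the historical keywords before the live ones means a mixed question such as "what's the average wait today?" resolves to RAG, which matches the accuracy argument made above.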
When the assembled context (system prompt + RAG results + tool definitions) exceeds 1200 characters, the agent adds cache_control: ephemeral to the system message. Bedrock caches the processed form of those blocks and bills the cached tokens at a lower rate on subsequent calls within the same cache window.
For a multi-turn conversation where the user asks follow-up questions ("What if it rains?" after asking about rope-drop strategy), the tool definitions and RAG context don't change — the entire input is the same except for the new question and history. Without caching, each follow-up incurs the full input token cost for the tool definitions (~2000 tokens) even though they're identical to the previous call. With prompt caching, those tokens are billed as cache reads at ~10% of the standard input rate.
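A rough sketch of the threshold check and the billing arithmetic. The `{"type": "ephemeral"}` block shape follows the Anthropic Messages content-block convention; how the project's LangChain/Bedrock setup actually passes it is not shown in the source, so the helper names and the exact block layout here are assumptions. The 10% cache-read ratio and the ~2000-token figure are taken from the paragraph above.

```python
CACHE_THRESHOLD_CHARS = 1200  # threshold stated in the text


def build_system_blocks(system_prompt: str, rag_context: str, tool_definitions: str) -> list:
    """Build the system content block, marking it cacheable above the threshold."""
    assembled_length = len(system_prompt) + len(rag_context) + len(tool_definitions)
    block = {"type": "text", "text": system_prompt + "\n\n" + rag_context}
    if assembled_length > CACHE_THRESHOLD_CHARS:
        # Anthropic-style cache marker; an assumption about the wire format used here.
        block["cache_control"] = {"type": "ephemeral"}
    return [block]


def effective_input_tokens(cached_tokens: int, fresh_tokens: int, cache_read_ratio: float = 0.10) -> float:
    """Input tokens in 'full-price equivalents' when cached tokens bill at a fraction."""
    return cached_tokens * cache_read_ratio + fresh_tokens
```

With these numbers, a follow-up turn carrying 2000 cached tokens plus a 50-token new question bills like roughly 250 full-price input tokens instead of 2050.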
The interface's right panel is a "Reply Blocks" artifact composer — the intended use case is a social media manager or community manager asking "what should I reply to this guest question?" and copying the answer into Instagram comments or Twitter DMs. That use case has specific failure modes that CHAT_SYSTEM_PROMPT alone doesn't prevent.
Without SOCIAL_REPLY_PROMPT, Sonnet tends to format replies as labeled outlines: "Top Priority: ...", "Pro move: ...", "TL;DR: ...". Those labels look like AI output in a social reply and get deleted by any human editor before posting. SOCIAL_REPLY_PROMPT explicitly bans those patterns and enforces: plain text only, 1-3 paragraphs, bullets only when they make the reply paste-ready, no markdown, no metadata. The output arrives pre-formatted for social copy-paste with no editing required.
The chat UI is served by Azure Static Web Apps (Park-Whisperer-Web). Direct browser calls to the Function App would expose the API key in client-side JavaScript. Instead, all API calls go to /api/pipeline/proxy/api/agentChat — a SWA API function (pipelineProxy) that holds PARK_KNOWLEDGE_KEY in its environment and proxies the request to the Function App with the key injected server-side.
The ?agentBackend=staging URL parameter (or the PW_AGENT_BACKEND localStorage key) redirects traffic to /api/agentChat?agentTarget=staging, which the proxy resolves to the staging Function App. This lets the production UI test against a staging backend without deploying a separate frontend, which is useful for validating new tool definitions or routing changes before promoting to production.
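The proxy's two jobs, resolving the backend and injecting the key server-side, can be sketched as a pure function. This is written in Python for consistency with the other examples and is not the actual pipelineProxy code. The backend hostnames are placeholders, the `build_upstream_request` helper is hypothetical, and the use of the `x-functions-key` header (the standard Azure Functions key header) is an assumption about how PARK_KNOWLEDGE_KEY is sent.

```python
PROXY_PREFIX = "/api/pipeline/proxy"

# Placeholder hostnames; the real Function App URLs are not given in the source.
BACKENDS = {
    "production": "https://prod-func.example.net",
    "staging": "https://staging-func.example.net",
}


def build_upstream_request(path: str, query: dict, env: dict) -> tuple:
    """Return (upstream_url, headers) for a request arriving at the proxy."""
    # Unknown or missing agentTarget values fall back to production.
    target = query.get("agentTarget", "production")
    base_url = BACKENDS.get(target, BACKENDS["production"])

    # Strip the proxy prefix so /api/pipeline/proxy/api/agentChat -> /api/agentChat.
    upstream_path = path[len(PROXY_PREFIX):] if path.startswith(PROXY_PREFIX) else path

    headers = {
        "Content-Type": "application/json",
        # The key is read from the server environment and never reaches the browser.
        "x-functions-key": env["PARK_KNOWLEDGE_KEY"],
    }
    return base_url + upstream_path, headers
```

Falling back to production for unrecognized targets keeps a mistyped URL parameter from producing a broken request, at the cost of silently ignoring the typo.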