An experiment using n8n as the orchestration layer for a multi-agent Park Whisperer system. A central host agent named "Walt" — inspired by Walt Disney — coordinated a cast of seven specialized sub-agents, each named after a Disney character and owning a domain of park knowledge. Two custom microservices (a Gemini AI agent service and a BigQuery log search service) ran alongside n8n in Docker Compose, with Redis caching, pgvector RAG, and Caddy as the SSL reverse proxy.
The agent hierarchy was inspired by Disney storytelling: Walt as the gracious host, orchestrating specialists who each owned a narrow domain. Sub-agents were designed to be called by Walt via n8n's sub-workflow invocation — not exposed directly to users.
The full stack ran locally (Windows host) and was designed to deploy to a self-hosted server
at jedha.sippelfamily.net. Seven Docker Compose versions iterated on service configuration,
networking, and SSL setup before the project was shelved.
Alongside the park-facing agent system, a separate SRE-mode agent was built using the same infrastructure. It provided internal operational awareness: runbook search, GCP incident monitoring, and per-service log analysis backed by the BigQuery log search microservice. A two-step log retrieval pattern (analyze → cache → retrieve) kept responses fast.
| Tool | Category | Description |
|---|---|---|
| get_runbooks | Knowledge Base | Hybrid search (BM25 keyword + vector semantic) over internal runbooks stored in pgvector. Inputs: query (semantic) + optional keyword_query (exact match). |
| check_gcp_status | GCP | Queries Google Cloud Status Dashboard JSON feed. Identifies active incidents (null end field). No input required. |
| get_log_insights | Logs | Focused keyword search across all GCP Cloud Logging entries in BigQuery. Inputs: keyword_query, time_range_minutes (default 60), max_log_entries (default 100). |
| analyze_weather_updater_logs | Logs | Step 1 of two-step log retrieval for the weather-updater service. Returns cache_key + log_count. Inputs: time_window_hours (default 24). |
| analyze_lightning_whisperer_logs | Logs | Step 1 of two-step retrieval for lightning-whisperer service. |
| analyze_park_updater_logs | Logs | Step 1 of two-step retrieval for park-updater service. |
| analyze_ride_whisperer_logs | Logs | Step 1 of two-step retrieval for ride-whisperer service. |
| analyze_weather_whisperer_logs | Logs | Step 1 of two-step retrieval for weather-whisperer service. |
| analyze_park_endpoint_logs | Logs | Step 1 of two-step retrieval for park-endpoint service. |
| analyze_weather_endpoint_logs | Logs | Step 1 of two-step retrieval for weather-endpoint service. |
| get_cached_logs | Logs | Step 2: retrieves actual log entries from Redis cache using the cache_key returned by any analyze_* tool call. |
| debug_agent_workflow | Debug | Meta-tool for debugging the agent's own tool-calling behavior. Inputs: debug_message describing the issue (e.g., "Gemini node not calling tool"). |
analyze_[service]_logs(time_window_hours): BigQuery query runs, logs stored in Redis, returns cache_key + log_count
Step 2 — get_cached_logs(cache_key): retrieve pre-fetched log entries from Redis — no repeat BQ query
Agent instructed: if log_count = 0 after step 1, do not call step 2 — state "no logs found" and stop
Pattern prevents expensive BigQuery re-queries for the same data within a conversation turn
get_runbooks: vector semantic search + optional BM25 keyword filter on pgvector store
Two input fields: query (semantic) + keyword_query (exact term match)
Agent uses keyword_query when user explicitly mentions specific error codes or exact phrases
Runbooks indexed as documents — PostgreSQL + pgvector in same Docker Compose stack
n8n's visual workflow builder is genuinely excellent for rapid agentic prototyping. The ability to wire an LLM node to a set of HTTP tool nodes, configure the system prompt inline, and immediately test the tool-call flow in the UI without writing deployment code compressed the iteration loop significantly. Sub-workflow invocation — calling one n8n workflow from another — worked cleanly as the mechanism for Walt delegating to sub-agents.
What didn't scale: n8n's workflow state is stored in its own database (SQLite or Postgres depending on config), and each workflow update requires saving through the UI. There's no git-native workflow format that makes code review natural. The configuration-as-code problem that the Azure OpenAI Assistants version solved with JSON files is harder in n8n — exported workflows are large JSON blobs with embedded credentials references. For a project that needed rapid iteration on agent behavior, this was a significant friction point compared to editing a Python system prompt in a text file.
The log search service (log_search_service/main.py) is the most re-usable component from this project. It's a FastAPI app that queries BigQuery for GCP Cloud Logging entries across two table patterns (run_googleapis_com_stdout_* and run_googleapis_com_stderr_*), caches results in Redis, and exposes two endpoints: /get_log_insights for focused keyword searches and /analyze_logs for full-service log dumps within a time window.
The two-step pattern (analyze → cache → retrieve) was specifically designed for LLM agents: the first call runs the expensive BigQuery query, caches the results in Redis, and returns only the metadata (log_count, cache_key). The agent then decides whether to retrieve the full logs based on log_count. This avoids sending potentially thousands of log lines to the LLM in the first tool response — the LLM only requests the full content if it's relevant.
The AI agent service (ai_agent_service/app.py) was a FastAPI container running Google Gemini with custom tool definitions — a separate LLM backend that n8n workflows could call via HTTP node rather than using n8n's built-in LLM nodes. This was built specifically to test Gemini's function calling compared to the OpenAI-compatible models n8n natively supports.
The service defined 5 tools: get_runbooks, check_gcp_status, get_log_insights, analyze_logs, and debug_agent_workflow. Each was a genai.FunctionDeclaration passed to Gemini's chat session, with the agentic loop handled inside the service rather than in n8n. This let Gemini's multi-turn function calling run to completion before returning a final response to n8n — simpler from n8n's perspective since it just got a complete answer back from an HTTP call, rather than managing the tool-call loop itself.
Seven Docker Compose versions over the project's lifetime is the clearest signal of where the effort went. Each version addressed a configuration problem: SSL termination in Caddy vs directly in n8n, Redis queue mode vs embedded mode, network addressing between containers, Windows vs Linux volume mount paths. The infrastructure was never wrong for more than a few hours, but it was never fully automated either — each deploy required manual steps.
The fundamental tradeoff that killed the project: self-hosted n8n + stack requires a running server with a public domain name. The test deployment used a personal domain (jedha.sippelfamily.net) on a home server. When the server went down, the agent went down. For a product designed to respond to social media interactions in real time, this was unacceptable uptime. Moving to a cloud-hosted n8n instance (n8n.cloud or a managed VM) would have cost money and removed the control that made self-hosting appealing. The Azure Functions-based production system that replaced this runs on Flex Consumption with essentially zero idle cost and no server to manage.
The persona-based naming was more than cosmetic. Giving each sub-agent a Disney character identity that mapped to a real capability helped clarify what each agent's scope should be. "Kronk knows food" is more memorable and constraining than "Sub-agent 4 handles dining." When designing the system prompt for each agent, the character's personality traits from Disney lore provided natural constraints: Jiminy Cricket (Jim) should have a conscience — he remembers the guest's history. Moana/Wayfinder should know the path even when it's unclear. Tink should surface magic that isn't on the main path.
This approach directly influenced the production system's prompt design. The production Park Agent's CHAT_SYSTEM_PROMPT uses a specific voice ("enthusiastic local expert") that creates similar constraint on scope. The lesson: agent personas aren't just branding — they're a design tool for scoping behavior. An agent with a clear character is harder to prompt-inject off-topic than an agent described only by its technical function.