Abandoned · Prophet · BigQuery · GCP · Python

Ride Whisperer
Wait Time Prediction

A per-attraction time-series forecasting engine for all four Walt Disney World parks. One Facebook Prophet model per attraction, trained on 2+ years of hourly wait data from BigQuery, with 50+ external regressors: weather (temperature, precipitation, humidity, thunderstorm probability), rolling windows at 1h/2h/4h, precipitation lags, park operating hours, holidays, and time decomposition features. 24-hour ahead forecasts written back to BigQuery.

⚠️
Abandoned at v8. The project reached v8 iterations before being shelved. The primary blocker was feature imputation at forecast time: rolling mean and lag features are computed from actual historical data and can't be faithfully forward-filled for a 24-hour forecast horizon — the regressors the model relied on most were unavailable at inference time. The dataset infrastructure and feature engineering work directly informed the design of the current PostgreSQL pre-computed aggregates system, which achieves the same analytical goals through a different architecture.
v8
Final Iteration
N×1
Models (per ride)
50+
Regressors
24hr
Horizon
2015
Training Start
4
WDW Parks

Four Stages: Acquire → Train → Forecast → Store

Training Data
BigQuery: prophet_ready_data_view unified feature view 2023-01-01 → 2025-07-01 training window (2+ years) Joined sources: historical wait CSVs (2015-2021), NWS weather, park schedules, holiday metadata Target variable y: actual wait time in seconds Partition key: entity_id (one row per attraction per hour) 50+ columns: time, weather, rolling stats, park context, one-hot categoricals
Per-Ride Training
One Prophet model per entity_id — no global model Prophet config: additive seasonality, daily + weekly + yearly, linear growth changepoint_prior_scale=0.05 — moderate trend flexibility All 50+ numerical features added via m.add_regressor() Categorical features excluded (land_name, attraction_name) — not directly supported as regressors Rides with zero observations still trained (MIN_OBSERVATIONS=0)
Future Context
Weather: weather_forecasts BQ table → temperature, precipitation, dewpoint, humidity, cloud cover, wind, thunderstorm prob Park hours: park_open_status_hourly_v1 BQ table → per-park is_park_open for 24hr horizon Historical regressors (rolling means, lags) imputed to 0 — cannot be computed without actuals Time-based features re-derived from ds: day_of_week, hour_of_day, month, quarter, is_weekend, is_holiday
Prediction
m.predict(future_ride) per ride — returns yhat + confidence interval (yhat_lower, yhat_upper) 24-hour horizon at hourly resolution Output annotated with ride_entity_id, park_name, prediction_generated_at UTC Rides with missing critical regressors skipped with error log
BigQuery Output
prophet_predicted_wait_times table — WRITE_APPEND disposition Schema: prediction_timestamp (TIMESTAMP), ride_entity_id (STRING), predicted_wait_time_seconds (FLOAT64), yhat_lower, yhat_upper, prediction_generated_at Downstream: feed to dashboard, alerts, and routing recommendations

50+ Regressors Across Five Categories

All features were computed in BigQuery and materialized into prophet_ready_data_view before training. The feature set grew from ~10 in v2 to 50+ by v8 as each iteration added a new data source or transformation.

Weather — Raw10
avg_temperature_f
precipitation_inches
dewpoint_f
apparent_temperature_f
humidity_percent
cloud_cover_percent
wind_speed_mph
is_adverse_heavy_rain_1hr
adverse_thunderstorm_prob_3hr
weather_condition_category_* (6 one-hot)
Weather — Derived13
temp_x_precipitation
is_weekend_x_temperature
precipitation_if_open
precipitation_lag_15min
precipitation_lag_30min
temp_f_rolling_mean_1hr / 2hr / 4hr
precip_sum_1hr / 2hr / 4hr
wind_speed_max_1hr / 2hr / 4hr
Wait Time — Rolling9
y_rolling_mean_1hr
y_rolling_std_1hr
y_rolling_mean_2hr
y_rolling_std_2hr
y_rolling_mean_4hr
y_rolling_std_4hr
⚠ Unavailable at inference time — imputed to 0. Known accuracy degrader.
Time Decomposition8
hour_of_day
day_of_week
is_weekend
day_of_year
month_of_year
quarter_of_year
time_of_day_category_* (4 one-hot)
Park & Holiday Context10
is_park_open
is_park_operating_hour
is_ride_open
is_holiday
park_name_magic_kingdom
park_name_epcot
park_name_hollywood_studios
park_name_animal_kingdom

BigQuery as the Central Feature Store

All data was ingested to a GCP BigQuery data warehouse under the park-whisperer.park_data_warehouse project. The training view joined four independent datasets into a single query surface.

Historical Ride Data
Source: public attraction wait-time CSVs, 2015–2021 (~7 years) Columns: datetime, SACTMIN (actual wait), SPOSTMIN (posted wait) Preprocessed: ride ID extracted from filename, missing values (-999) handled, merged with metadata.csv (holiday flags, ticket season, park hours, events count, capacity_lost_percent) Output: long_term_historical_ride_data.parquet → loaded to BQ 130+ attraction CSV files across 4 parks ingested via preprocess.py
Weather Data
Historical: historical_weather.py → NWS hourly observations → BQ Forecast: weather_forecasts BQ table (live NWS feed) Computed in BQ: adverse_heavy_rain_pred_1hr, adverse_thunderstorm_prob_3hr, weather_condition_category
Park Operations
park_open_status_hourly_v1 — hourly is_park_open flag per park Derived from park schedule data: MKOPEN/MKCLOSE, EPOPEN/EPCLOSE, HSOPEN/HSCLOSE, AKOPEN/AKCLOSE is_park_operating_hour distinguishes scheduled hours from actual operation
Feature View
prophet_ready_data_view — SQL view joining all sources Rolling windows and lag features computed in SQL (y_rolling_mean_*hr, precip_lag_*min) One-hot encoding done in SQL: CASE WHEN park_name = 'Magic Kingdom' THEN 1 ELSE 0 Training query reads from view, filters by date range, orders by ds, entity_id

prophet_predicted_wait_times — BigQuery Output Table

ColumnTypeDescription
prediction_timestampTIMESTAMPHourly timestamp for which the wait is predicted (Eastern Time, as UTC)
ride_entity_idSTRINGAttraction identifier matching entity_id in training data (e.g., "seven-dwarfs-mine-train")
predicted_wait_time_secondsFLOAT64Prophet yhat — point forecast in seconds. Divide by 60 for minutes.
yhat_lowerFLOAT6480% confidence interval lower bound
yhat_upperFLOAT6480% confidence interval upper bound
prediction_generated_atTIMESTAMPUTC timestamp when the prediction batch was computed. WRITE_APPEND — each run adds rows.

v2 → v8 — What Changed Each Iteration

v2
First working Prophet model
Basic per-ride Prophet with daily/weekly seasonality. ~10 regressors: temperature, precipitation, is_holiday, day_of_week, hour_of_day. Training data from CSV files. No BigQuery integration yet.
v4
BigQuery integration + weather expansion
Moved training data to BigQuery. Added full NWS weather set: dewpoint, apparent temperature, humidity, cloud cover, wind speed. Added park operating hours as a regressor. First prophet_ready_data_view.
v5
Rolling window features
Added y_rolling_mean/std at 1hr, 2hr, 4hr windows for autocorrelation signal. Added temp_f_rolling_mean, precip_sum, wind_speed_max at same windows. This significantly improved training accuracy but introduced the future-imputation problem.
v6
Interaction terms + lag features
Added cross-feature interactions: temp_x_precipitation, is_weekend_x_temperature, precipitation_if_open. Added precipitation lags at 15min and 30min. Added weather_condition_category and time_of_day_category one-hot encodings.
v7
LightGBM parallel experiment
Parallel track exploring LightGBM (gradient boosting) as an alternative to Prophet. Achieved better in-sample accuracy on rolling features but required more complex feature pipelines. Model artifact: lightgbm_model.joblib.
v8
Full regressor set — final iteration before abandonment
50+ regressor Prophet. Removed debug MAX_RIDES_TO_TRAIN limit — trained on all attractions. Added yearly seasonality. MIN_OBSERVATIONS set to 0 to include all entity_ids. Rolling features explicitly imputed to 0 at forecast time with warning logs. Project stalled on this imputation problem.

Why It Was Abandoned and What It Built Toward

The imputation problem that killed the forecast accuracy+

By v5, the most predictive features were the rolling window statistics: y_rolling_mean_1hr, y_rolling_mean_2hr, y_rolling_mean_4hr. These captured autocorrelation in wait times — if a ride has been running 60-minute waits for the past two hours, that's a strong signal for the next hour. In training, these values are computed from actual historical data and are highly accurate.

At forecast time, there is no "actual" data for the future. The pipeline imputed these to 0, which is demonstrably wrong — 0 means the ride has had zero wait time for the past four hours, a state that doesn't exist during park operating hours. The model trained with real values and predicted with zeroes. Every prediction was effectively generated without the features the model weighted most heavily. Adding more weather features didn't compensate for this structural gap.

Why one-model-per-ride didn't scale well in practice+

Prophet is computationally heavy: fitting a full Prophet model with 50 regressors and 2+ years of hourly data takes 30–120 seconds per attraction. With 130+ attractions across 4 parks, a single training run was measured in hours on commodity hardware. Retraining every night to keep models fresh on new data was not practical without significant infrastructure — a Cloud Run or Dataflow job with parallelism across all rides.

A global model (one model for all rides with entity_id as a categorical feature) was considered but Prophet doesn't natively support entity-level models — that's more LightGBM/XGBoost territory, which is why the v7 LightGBM parallel track existed. Neither approach reached production readiness.

What the data infrastructure enabled after abandonment+

The BigQuery warehouse, the NWS weather ingestion, the historical CSV preprocessing pipeline, and the prophet_ready_data_view feature view all survived the Prophet experiment and fed directly into the next architecture. The feature categories that proved most important — weather condition, time-of-day category, park operating hours — became the basis for the knowledge extractor taxonomy in the current PostgreSQL RAG system.

Instead of serving predictions from a live model, the current system precomputes day-of-week and hour-of-day aggregates from the same historical dataset and stores them as knowledge documents. "What's the typical wait for Space Mountain on a Saturday morning?" is answered directly from those precomputed aggregates — no real-time inference required, and the answer is based on 7+ years of actual data rather than a forecast from a model with imputed zero-value features.

What would need to change to revive it+

The imputation problem is solvable — but not with Prophet. A model architecture that treats the rolling features as autoregressive inputs natively (LSTM, N-BEATS, or a tree-based model with an explicit lag feature pipeline) would handle the forecast horizon correctly. The BigQuery feature store is largely ready; the main engineering work would be:

  • Replace Prophet with a global LightGBM or XGBoost model trained on all rides simultaneously with entity embeddings
  • Build an online feature pipeline that computes the rolling/lag features from a streaming wait-time source (the existing Cosmos DB selloutEventv2 / wait-time records) at inference time rather than from precomputed training data
  • Move training to Vertex AI or a scheduled Cloud Run job to parallelize per-ride training
  • The output schema is already defined — prophet_predicted_wait_times just needs a better upstream producer
Successor: Knowledge RAG → Digital Twin Platform ↗ All Projects ↗