GCP · BigQuery ML · Production

Weather Whisperer
BigQuery ML for Ride Operations

A production GCP pipeline that ingests real-time METAR surface observations, NWS forecasts, and upper-air radiosonde soundings into BigQuery — then runs ML-based ride impact scoring to generate per-ride OPERATE / MONITOR / CLOSE recommendations for Central Florida theme parks. Published as CORS-enabled JSON to GCS every 15 minutes and consumed directly by Azure-hosted dashboards.

3
Data Sources
5
BQ Tables
5
Curated Views
5
Cloud Run Jobs
15min
Data Freshness
$35
Max Monthly Cost

5-Layer GCP Stack

External weather APIs → Cloud Run ingestion jobs → BigQuery raw tables → curated views → JSON publisher → GCS public bucket → Azure web dashboards. Each layer is independently scalable and separately schedulable via Cloud Scheduler.

Data Sources
aviationweather.gov (METAR) University of Wyoming (Radiosonde) api.weather.gov (NWS) 3 external APIs
Ingestion
Cloud Run Jobs metar-ingestion-job atmospheric-ingestion-job forecast-ingestion-job Cloud Scheduler
Analytics / ML
BigQuery weather_analytics dataset 5 raw tables (partitioned + clustered) 5 curated views Ride impact ML models
Publishing
weather-json-publisher insights-generator-job (daily) manifest.json generation 15-min publish cycle
Storage / CDN
GCS public bucket CORS-enabled station/{id}/latest.json rides/{id}/impact_latest.json CDN-ready
Consumption
Azure Static Web Apps Azure Web Apps Direct HTTPS from GCS Cross-cloud JSON fetch
IaC / Ops
Terraform (15 files) Secret Manager (API keys) 3 service accounts (least-privilege) Preflight + validation scripts

From Raw Observation to Ride Recommendation

How METAR bytes become an OPERATE/MONITOR/CLOSE decision for a roller coaster.

1

Multi-Source Ingestion — 3 Parallel Jobs

METAR job (every 15 min): fetches surface observations from aviationweather.gov — temperature, dewpoint, pressure, wind, precipitation, visibility. Derives rolling 3hr/6hr precip, 1hr pressure change, 1hr wind change, and boolean adverse-condition flags (thunderstorm, fog, high wind, heavy rain).

Atmospheric job (every 12 hr): fetches University of Wyoming radiosonde upper-air soundings — CAPE, CIN, Lifted Index, K-Index, Total Totals, storm-relative helicity, precipitable water. These instability indices drive the atmospheric stability component of the ride impact score.

Forecast job (every hour): fetches NWS point forecasts — temperature, feels-like, humidity, precip rate, cloud cover, wind, weather condition category.

Cloud Run Jobs Cloud Scheduler Exponential backoff
2

BigQuery Raw Tables — Partitioned + Clustered

All 3 sources land in dedicated BigQuery tables partitioned by time and clustered by station ID. Partition filters prevent full table scans — essential for cost control at scale. The 5 schemas cover 26 fields in metar_observations, 12 in weather_forecasts, and 22 instability metrics in wyoming_atmospheric_profiles.

DAY partitioning station_id clustering 5 tables
3

ML Ride Impact Scoring — Daily Insights Job

The insights-generator-job (runs daily at 06:00 UTC) joins current METAR observations with each ride's weather profile (ride_weather_profiles table: per-ride wind/rain/temperature/lightning thresholds and weather sensitivity score). Produces 5 component impact scores — wind, rain, lightning, temperature, atmospheric stability — normalized 0–1. Aggregates to overall_impact_score and emits a string recommendation: OPERATE, MONITOR, or CLOSE with a natural-language reason.

ride_weather_profiles ride_impact_scores 5 impact components OPERATE / MONITOR / CLOSE
4

Curated BigQuery Views

5 views over the raw tables produce pre-joined, web-ready documents: latest station conditions, latest regional geojson, last-6-hour weather events, latest atmospheric sounding, and next-24-hour forecast. Views serve as the source of truth for the publisher — one query per view, no ad-hoc SQL in the publish job.

curated_station_latest curated_region_latest curated_forecast_next_24h curated_atmosphere_latest
5

JSON Publisher → GCS → Azure Dashboards

The publisher job (every 15 min) queries each curated view and serializes to structured JSON files in the public GCS bucket. CORS configured for web access. Azure Static Web Apps and Web Apps fetch directly from GCS over HTTPS — cross-cloud, no proxy, no CDN cost unless needed. A manifest.json at the bucket root lists all objects with last-modified timestamps for client-side freshness checks.

GCS public bucket CORS enabled manifest.json Azure web apps

5-Component Ride Impact Score

Each ride has a ride_weather_profiles record encoding its physical thresholds and sensitivity. The impact model joins live conditions against these profiles to produce normalized 0–1 component scores and a final OPERATE / MONITOR / CLOSE recommendation.

💨

Wind Impact

Current wind speed (knots) vs. per-ride wind_threshold_kt. Gust detection via 1-hour change flag.

0 → 1 score
🌧️

Rain Impact

Hourly precipitation rate vs. per-ride rain_threshold_in_hr. Rolling 3hr/6hr totals for sustained rain detection.

0 → 1 score

Lightning Impact

Thunderstorm boolean flag from METAR adverse conditions. Per-ride lightning_distance_mi threshold. Atmospheric instability (CAPE, K-Index) as leading indicator.

0 → 1 score
🌡️

Temperature Impact

Current temp vs. per-ride temperature_min_f and temperature_max_f operating envelope. Feels-like temperature from NWS forecasts.

0 → 1 score
📡

Atmospheric Impact

Radiosonde-derived instability indices: CAPE (J/kg), Lifted Index, Showalter Index, Total Totals, storm-relative helicity. Early warning layer before surface conditions change.

0 → 1 score
🎢

Overall Score → Decision

Weighted aggregate of 5 components, scaled by per-ride weather_sensitivity_score (0–1). Threshold bands: OPERATE (low), MONITOR (medium), CLOSE (high).

OPERATE · MONITOR · CLOSE

ride_weather_profiles Schema — Per-Ride Thresholds

FieldTypeDescription
ride_idSTRINGUnique ride identifier
ride_typeSTRINGroller_coaster · water_ride · outdoor_show · …
wind_threshold_ktFLOAT64Max operating wind speed (knots)
lightning_distance_miFLOAT64Closure radius for lightning (miles)
rain_threshold_in_hrFLOAT64Max operating precip rate (in/hr)
temperature_min_fFLOAT64Minimum operating temperature (°F)
temperature_max_fFLOAT64Maximum operating temperature (°F)
weather_sensitivity_scoreFLOAT64Impact weight multiplier (0–1)
outdoor_exposureBOOLSignificant outdoor exposure flag

5 Tables — Partitioned + Clustered

All raw tables use DAY partitioning on the time field and clustering on station_id — queries with time and station filters never scan more than the relevant partition slices.

metar_observations — 26 fields, partitioned by observation_time+
FieldTypeDescription
station_idSTRINGICAO station identifier (e.g. KORL)
observation_timeTIMESTAMPUTC observation time — partition key
temp_f / dewpoint_fFLOAT64Temperature and dewpoint (°F)
relative_humidityFLOAT64Relative humidity (%)
pressure_hpaFLOAT64Station pressure (hPa)
wind_speed_kt / wind_gust_ktFLOAT64Wind speed and gust (knots)
precip_rolling_3hr / 6hrFLOAT64Rolling precipitation totals (derived)
pressure_change_1hrFLOAT641-hour pressure trend (derived)
is_thunderstorm / is_fog / is_mistBOOLCondition flags from METAR present weather
adverse_heavy_rain / thunderstorm / fog / high_windBOOLAdverse event flags for quick filtering
wyoming_atmospheric_profiles — 22 instability fields, partitioned by sounding_timestamp+
FieldTypeDescription
cape_j_kgFLOAT64Convective Available Potential Energy (J/kg) — thunderstorm potential
cin_j_kgFLOAT64Convective Inhibition (J/kg) — cap strength
lifted_indexFLOAT64Lifted Index — negative = unstable, <-6 = severe
k_indexFLOAT64K-Index — thunderstorm probability (>40 = high)
total_totals_indexFLOAT64Total Totals — >55 indicates severe potential
srh_j_kgFLOAT64Storm-Relative Helicity (J/kg) — tornado potential
pwat_inchesFLOAT64Precipitable water — flood potential
lcl / lfc / el pressureFLOAT64Lifted condensation level, free convection level, equilibrium level (hPa)
thickness_1000_500_mFLOAT641000-500mb thickness — rain vs snow discriminator
weather_forecasts — 12 fields, partitioned by forecast_timestamp+
FieldTypeDescription
forecast_timestampTIMESTAMPForecast valid time (UTC) — partition key
temperature_f / apparent_temperature_fFLOAT64NWS temperature and feels-like (°F)
precipitation_intensity_mm_hrFLOAT64Forecast precip rate (mm/hr)
cloud_cover_percentFLOAT64Cloud cover (%)
weather_condition_categorySTRINGNWS categorical condition string

5 Views → 7 GCS JSON Paths

Each curated view is a stable, pre-joined query over the raw tables. The publisher serializes each to a structured JSON file at a predictable GCS path — web apps fetch by path, no API layer needed.

curated_station_latest

Latest METAR for each monitored station — temp, wind, precip, adverse flags, rolling totals.

→ station/{station_id}/latest.json

curated_region_latest

All stations aggregated to a regional GeoJSON feature collection — map layer for dashboard overlays.

→ region/latest.geojson

curated_event_window_last_6h

Weather events (adverse conditions) in the past 6 hours — event log feed for the operational dashboard.

→ events/last_6h.json

curated_atmosphere_latest

Latest radiosonde sounding — CAPE, CIN, instability indices, precipitable water. Severe weather outlook.

→ atmosphere/latest.json

curated_forecast_next_24h

NWS hourly forecast for the next 24 hours — temperature, precipitation, cloud cover, condition category.

→ forecast/next_24h.json

ride_impact_scores

Per-ride ML impact scores — 5 component scores + overall score + OPERATE/MONITOR/CLOSE + reason string.

→ rides/{ride_id}/impact_latest.json

manifest.json

Bucket root manifest listing all objects with last-modified timestamps. Client-side freshness check before fetching.

→ manifest.json (root)

5 Scheduled Jobs

All compute runs as Cloud Run Jobs — no always-on instances, billed per execution. Cloud Scheduler drives the cadence; each job is independently deployable via Terraform.

Every 15 minutes

metar-ingestion-job

Fetches current METAR observations from aviationweather.gov for all monitored Central Florida stations. Derives rolling precip, pressure change, and adverse condition booleans before writing to BigQuery.

Cloud Run→ BigQuery
Every 12 hours

atmospheric-ingestion-job

Fetches University of Wyoming radiosonde upper-air soundings (00Z + 12Z). Parses 22 atmospheric stability indices — CAPE, CIN, Lifted Index, K-Index, helicity, precipitable water. Writes to wyoming_atmospheric_profiles.

Cloud Run→ BigQuery
Every 1 hour

forecast-ingestion-job

Fetches NWS point forecasts for monitored stations. Parses hourly temperature, precip rate, cloud cover, wind, condition categories for the next 24-hour window.

Cloud Run→ BigQuery
Every 15 minutes

weather-json-publisher

Queries all 5 curated BigQuery views and serializes results to structured JSON. Uploads to GCS public bucket with CORS headers. Updates manifest.json. Azure web apps fetch directly from GCS on the next page load.

Cloud RunBigQuery →GCS
Daily — 06:00 UTC

insights-generator-job

Joins current METAR + atmospheric data with ride_weather_profiles. Computes 5 per-component impact scores, aggregates to overall_impact_score, emits OPERATE/MONITOR/CLOSE recommendations with natural-language reason strings to ride_impact_scores table.

Cloud RunBigQuery ML

15 Terraform Files · 6-Step Deployment

Full Terraform IaC with separate tfvars for production. 6 numbered scripts take a fresh GCP project to running production in ~30 minutes.

Terraform

15 IaC Files

main.tf, bigquery.tf, bigquery_views.tf, storage.tf, cloud_run_jobs.tf, cloud_scheduler.tf, service_accounts.tf, iam.tf. Production tfvars template included.

Service Accounts

3 Least-Privilege SAs

weather-ingestion-sa: BigQuery dataEditor + jobUser + SecretManager accessor.
weather-publisher-sa: BigQuery dataViewer + GCS objectAdmin.
weather-ml-trainer-sa: BigQuery dataEditor + GCS objectViewer.

Step 01

01_preflight_check.sh

Validates GCP auth, required APIs enabled, gcloud + Terraform + Docker installed, project billing active. Exits with clear list of blockers before touching any cloud resource.

Step 02–03

Infrastructure + Applications

02_deploy_infrastructure.sh: Terraform apply (~10 min). 03_deploy_applications.sh: Docker build + Cloud Run job deployments (~15 min).

Step 04–05

Validate + Monitor

04_validate_deployment.sh: triggers all jobs, checks BigQuery row counts, verifies GCS objects. 05_setup_monitoring.sh: deploys CloudWatch dashboard JSON + alert policies.

Cost

$12–35/month

BigQuery $5–15 (storage + queries with partition filters). Cloud Run $5–10 (per-execution only). GCS $1–3. Cloud Scheduler $0.10. Alert threshold: >$100/month triggers notification.

Operational SLOs

Key metrics, targets, and alert thresholds. Daily health check runs check_job_status.sh and queries BigQuery for data freshness.

MetricTargetAlert ThresholdSeverity
Job success rate> 99%< 95%P1
METAR data freshness< 20 min> 30 minP1
Job execution duration< 2 min> 5 minP2
BigQuery storage< 10 GB> 50 GBP3
GCS object count~200< 50 or > 1000P2
Monthly cost< $35> $100P2
HTTP 429 rate (external APIs)< 1/hour> 10/hourP2 — exponential backoff active

Key Engineering Choices

Why BigQuery for ML rather than a dedicated model serving layer?+

The ride impact scoring problem has a small feature set (9 input fields per ride × 5 weather variables) and a deterministic scoring function. BigQuery ML handles this as a SQL join + arithmetic, not a trained neural network. Using BigQuery keeps everything in one platform — ingest, store, transform, score, and export — with no model serving infrastructure to maintain. The "ML" here is structured feature engineering and threshold-based scoring, which is exactly the right tool when the physics is well-understood (wind knots vs. a roller coaster's known closure threshold) and explainability is required (the recommendation_reason string must be human-readable for operators).

Why radiosonde data (upper-air soundings) in addition to surface METAR?+

CAPE, CIN, and Lifted Index from upper-air soundings are leading indicators of convective development — they signal thunderstorm potential 3–6 hours before surface conditions deteriorate. METAR alone would only react to thunderstorms after they're already overhead. The atmospheric component of the ride impact score gives operations staff advance warning: a high CAPE + low CIN sounding at 12Z means the afternoon convective risk is elevated even if current surface conditions look fine. The University of Wyoming archives soundings at 00Z and 12Z — the 12-hour ingestion schedule matches this availability window exactly.

Why publish JSON to GCS instead of an API endpoint?+

The consuming application is hosted on Azure. A GCP API Gateway in front of BigQuery would add ~$50–100/month in API call costs and a round-trip to GCP on every dashboard load. Instead, the publisher job runs every 15 minutes and writes pre-computed JSON to a CORS-enabled public GCS bucket. Azure web apps fetch the JSON directly — no GCP API calls at read time. GCS standard storage for ~200 objects costs under $1/month. The manifest.json lets the client skip fetching if objects haven't changed since the last poll. This architecture decouples the compute (GCP) from the consumption (Azure) with no persistent connection required.

Why partition + cluster BigQuery tables? What does it save?+

METAR observations accumulate at ~96 records/day per station (every 15 min). At 10 monitored stations, that's ~960 rows/day, ~350,000 rows/year — small in absolute terms, but the curated views query these tables on every publisher run (every 15 min = 96 times/day). Without partitioning, BigQuery would scan the full table on every query. DAY partitioning on observation_time with clustering on station_id means the curated_station_latest view only scans today's partition (~960 rows) and skips all other stations via clustering — reducing per-query bytes scanned from GB to KB. BigQuery charges $5/TB scanned; partition + clustering drops the query cost to near zero.

Why Cloud Run Jobs instead of Cloud Functions or always-on services?+

Cloud Run Jobs are the right primitive for batch/scheduled workloads that run for seconds to minutes and then exit. Each ingestion job runs for <2 minutes every 15–60 minutes — an always-on container or Cloud Function with a long timeout would be idle >95% of the time. Cloud Run Jobs bill per CPU-second of execution, not per allocation time. Cloud Functions have a 9-minute max timeout — the atmospheric ingestion job parses complex sounding data and could approach this. Cloud Run Jobs have a 1-hour timeout with no per-invocation startup overhead once the container is cached. Cloud Scheduler triggers each job on its own schedule independently.

Digital Twin Premium ↗ EARE on AWS ↗ All Projects ↗