Advisory AI — Azure Deployment

How It Was Built

From Flat Prototype to Reproducible Deployment

The starting state: a working advisory platform deployed manually — no IaC, no CI/CD, 18+ numbered seed scripts, business logic scattered at the root. The goal: package it as a corporate-ready repository another team could clone and deploy from scratch. Kiro generated the entire transformation from a design spec.

What "zero business logic changes" means in practice

function_app.py, governance_routes.py, and code_scanner.py are relocated from the flat root into src/functions/ and otherwise left untouched: not a single line changed. All new code is infrastructure: Bicep templates, pipeline YAML, Python tooling (audit + seed runner), and shell scripts. The design spec stated this as an explicit constraint, and Kiro honored it. The blast radius audit tool was generated to verify the guarantee: after restructuring, it checks file presence, sizes, SHA-256 hashes, Python syntax, and import resolution across the repository.

Spec-driven · Zero logic changes · Managed Identity auth · Bicep modules · Blast radius audit · Hypothesis PBT

Generated 01

Bicep IaC — 4 Modules

Orchestrator main.bicep composing compute, data, storage, and monitoring modules. No state file. Consumption Plan for dev, Premium EP1 for prod. System-assigned Managed Identity on the Function App.

Generated 02

GitHub Actions — 2 Pipelines

deploy.yml: build → deploy-functions → deploy-web → verify. infra.yml: triggers on Bicep/config changes, runs az deployment group create with environment parameter files.

Generated 03

Blast Radius Audit Tool

Standalone scripts/audit.py that generates SHA-256 manifests, checks presence/size of every expected file, validates Python syntax, and checks import resolution. Runnable via make audit. Exit code 0 = all clear.

Generated 04

Unified Seed Runner

18+ numbered migration scripts collapsed into a single SeedRunner class with 4 ordered phases: verify containers, seed frameworks (NIST, ISO, EU AI Act, etc.), seed scan patterns, optional demo data. Idempotent via deterministic IDs + upsert.
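The phase ordering and upsert-based idempotency described above can be sketched as follows. This is a minimal illustration, not the generated SeedRunner: the in-memory container stands in for Cosmos DB, and make_id, the phase method names, and the pattern and project names are assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass, field


def make_id(doc_type: str, name: str) -> str:
    """Same (type, name) in, same ID out; the length prefix keeps the pair unambiguous."""
    raw = f"{len(doc_type)}:{doc_type}{name}".encode("utf-8")
    return hashlib.sha256(raw).hexdigest()[:32]


class InMemoryContainer:
    """Stand-in for a Cosmos DB container: upsert overwrites by document id."""

    def __init__(self) -> None:
        self.docs = {}

    def upsert_item(self, doc: dict) -> None:
        self.docs[doc["id"]] = doc


@dataclass
class SeedReport:
    phases_run: list = field(default_factory=list)
    upserted: int = 0


class SeedRunner:
    # Subset of the 7 containers, enough to show the flow.
    REQUIRED = ("governance-frameworks", "governance-patterns", "governance-projects")

    def __init__(self, containers: dict, with_demo: bool = False) -> None:
        self.containers = containers
        self.with_demo = with_demo

    def run(self) -> SeedReport:
        phases = [self.verify_containers, self.seed_frameworks, self.seed_patterns]
        if self.with_demo:
            phases.append(self.seed_demo)  # the optional fourth phase
        report = SeedReport()
        for phase in phases:  # order is fixed: verification always runs first
            report.upserted += phase()
            report.phases_run.append(phase.__name__)
        return report

    def verify_containers(self) -> int:
        missing = [c for c in self.REQUIRED if c not in self.containers]
        if missing:
            raise RuntimeError(f"missing containers: {missing}")
        return 0

    def _upsert(self, container: str, doc_type: str, names: list) -> int:
        for name in names:
            doc = {"id": make_id(doc_type, name), "type": doc_type, "name": name}
            self.containers[container].upsert_item(doc)
        return len(names)

    def seed_frameworks(self) -> int:
        return self._upsert("governance-frameworks", "framework", ["NIST", "ISO", "EU AI Act"])

    def seed_patterns(self) -> int:
        # Hypothetical pattern names.
        return self._upsert("governance-patterns", "pattern", ["python-eval", "js-innerhtml"])

    def seed_demo(self) -> int:
        return self._upsert("governance-projects", "project", ["demo-project"])
```

Because every document ID is derived from its content key and every write is an upsert, a second run overwrites the same documents instead of adding new ones, which is what makes re-running safe.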

Generated 05

Repository Restructure Map

Old path → new path mapping for every file: business logic into src/functions/, web assets into src/web/, seed scripts into scripts/seed/, docs into docs/reference/ and docs/archive/.

Generated 06

Environment Parameter Files

3 .bicepparam files for dev/staging/prod — controlling Function App SKU, Cosmos DB billing mode (serverless vs provisioned autoscale), storage redundancy (LRS vs GRS), and throughput limits.

Generated 07

Cosmos DB + RBAC

7 containers with partition keys and TTL configured. Managed Identity assigned Cosmos DB Built-in Data Contributor role — no connection strings in app settings. Storage similarly uses Storage Blob Data Contributor.

Generated 08

Smoke Test + Env Validation

smoke-test.sh checks health endpoint, static website, governance API, and Cosmos DB connectivity with retry logic. validate-env.sh checks all required variables before any deployment step runs.

Generated 09

3 Property-Based Tests

Hypothesis tests covering: audit manifest completeness + SHA-256 correctness, import resolution accuracy, and seed document ID determinism. 100+ random iterations each, covering the pure Python logic layers.

Deployment Flow

First-Time Deploy — End to End

How a fresh clone goes from zero to running in a new Azure subscription.

Validate Environment

bash scripts/validate-env.sh checks all required variables: Cosmos connection, AI Foundry endpoint/key, storage account, deployment name. Exits with a clear list of anything missing before any Azure API is called.

validate-env.sh · Pre-flight check
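The real pre-flight check is a shell script; the same idea, collecting every missing variable and reporting them all at once before any Azure call, might look like this in Python. The variable names below are hypothetical placeholders, not the names the script actually checks.

```python
import os

# Hypothetical variable names; the authoritative list lives in validate-env.sh
# and .env.example.
REQUIRED_VARS = (
    "COSMOS_ENDPOINT",
    "AI_FOUNDRY_ENDPOINT",
    "AI_FOUNDRY_KEY",
    "STORAGE_ACCOUNT_NAME",
    "AI_DEPLOYMENT_NAME",
)


def missing_vars(env, required=REQUIRED_VARS) -> list:
    """Return every required variable that is unset or blank, in declaration order."""
    return [name for name in required if not (env.get(name) or "").strip()]


def validate(env=None) -> int:
    """Print all gaps in one pass and return a shell-style exit code."""
    env = os.environ if env is None else env
    missing = missing_vars(env)
    for name in missing:
        print(f"MISSING: {name}")
    return 1 if missing else 0
```

Reporting the full list rather than failing on the first gap saves a fix-and-rerun cycle per variable when configuring a fresh clone.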

Provision Infrastructure — Bicep

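az deployment group create --template-file infra/main.bicep --parameters config/dev.bicepparam, run against the target resource group. Because this is a resource-group-scoped deployment, the resource group itself must already exist; inside it, Bicep creates the Function App + plan, Cosmos DB + 7 containers, Storage Account + static hosting, and Application Insights, and assigns the Managed Identity RBAC roles in the same deployment.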

infra.yml · Bicep nested modules · RBAC assignments

Seed Reference Data

make seed-data runs the unified SeedRunner: verifies all 7 containers exist, upserts 7 framework documents (NIST, ISO, FinOps, ModelOps, AIOps, XDR, EU AI Act), upserts multi-language scan patterns, optionally seeds demo project. Idempotent — safe to re-run.

seed_all.py · Upsert idempotent · Deterministic IDs

Deploy Functions + Web

func azure functionapp publish deploys the Python 3.11 app from src/functions/. az storage blob upload-batch syncs src/web/ to the $web container with correct MIME type metadata for each extension.

deploy.yml · func publish · blob upload-batch
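Serving the static site correctly depends on each blob carrying the right content type. A rough sketch of the extension-to-MIME mapping step, assuming the pipeline groups files by type before uploading; the override table and function names are illustrative and not taken from deploy.yml.

```python
import mimetypes
from collections import defaultdict
from pathlib import PurePosixPath

# Explicit overrides for extensions where platform MIME tables disagree or are
# incomplete. Illustrative values, not the project's actual table.
OVERRIDES = {
    ".js": "text/javascript",
    ".mjs": "text/javascript",
    ".woff2": "font/woff2",
    ".map": "application/json",
}


def content_type_for(path: str) -> str:
    """Pick a content type from the file extension, defaulting to a binary type."""
    suffix = PurePosixPath(path).suffix.lower()
    if suffix in OVERRIDES:
        return OVERRIDES[suffix]
    guessed, _encoding = mimetypes.guess_type(path)
    return guessed or "application/octet-stream"


def group_by_content_type(paths) -> dict:
    """Group relative paths by content type so each group can be uploaded together."""
    groups = defaultdict(list)
    for path in sorted(paths):
        groups[content_type_for(path)].append(path)
    return dict(groups)
```

A blob uploaded as application/octet-stream is typically downloaded rather than rendered by browsers, so the fallback is a safe default for unknown files but a bug for HTML, CSS, or JavaScript.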

Verify + Blast Radius Audit

Post-deploy: smoke-test.sh hits health, static root, governance API, and Cosmos DB connectivity with 3-retry logic. Optionally run make audit to generate SHA-256 manifest of the deployed repo and verify every expected file is present and intact.

smoke-test.sh · audit.py · Exit 0 = all clear
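The 3-retry behavior matters because a freshly deployed Function App may not answer on the first request. smoke-test.sh is a shell script; the retry logic it describes could be expressed in Python roughly as below. The delay value, the function names, and the choice to treat exceptions as failed attempts are assumptions.

```python
import time


def check_with_retry(check, attempts: int = 3, delay_s: float = 2.0, sleep=time.sleep) -> bool:
    """Run a zero-argument check up to `attempts` times; True on the first success."""
    for attempt in range(1, attempts + 1):
        try:
            if check():
                return True
        except Exception:
            pass  # errors such as timeouts count as a failed attempt
        if attempt < attempts:
            sleep(delay_s)  # no sleep after the final attempt
    return False


def run_smoke_tests(checks: dict, **retry_kwargs) -> int:
    """Run every named check and return a shell-style exit code."""
    failed = [name for name, check in checks.items()
              if not check_with_retry(check, **retry_kwargs)]
    for name in failed:
        print(f"FAILED: {name}")
    return 1 if failed else 0
```

In practice each entry in `checks` would wrap one HTTP request (health endpoint, static root, governance API, Cosmos DB connectivity); passing `sleep` as a parameter keeps the logic testable without real delays.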

Blast Radius Audit

Verifying Zero Business Logic Was Lost

After restructuring 40+ files from a flat prototype layout into the new repository structure, the audit tool provides a machine-verifiable guarantee that nothing was accidentally dropped, truncated, or broken.

Check Type

File Presence

Every file in the expected inventory is checked for existence on disk. Missing files → AuditResult.missing[]. Unknown files (on disk but not in inventory) → AuditResult.unknown[].

Check Type

Size Threshold

Source files under 100 bytes are flagged as unexpected_size — a sign of a placeholder or partial write. The 8,600-line governance-app.js was a specific concern after the file loss event that prompted this tool.
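The presence and size checks amount to a four-way classification of paths. A minimal sketch, assuming the expected inventory is a set of repository-relative paths; the AuditResult field names follow the text above, while compare_inventory and MIN_SIZE_BYTES are hypothetical names.

```python
from dataclasses import dataclass, field
from pathlib import Path

MIN_SIZE_BYTES = 100  # threshold from the text: smaller files look like stubs


@dataclass
class AuditResult:
    present: list = field(default_factory=list)
    missing: list = field(default_factory=list)
    unexpected_size: list = field(default_factory=list)
    unknown: list = field(default_factory=list)


def compare_inventory(root: Path, expected: set) -> AuditResult:
    """Classify each expected path, then report anything on disk that was not expected."""
    on_disk = {p.relative_to(root).as_posix(): p
               for p in root.rglob("*") if p.is_file()}
    result = AuditResult()
    for rel in sorted(expected):
        path = on_disk.get(rel)
        if path is None:
            result.missing.append(rel)
        elif path.stat().st_size < MIN_SIZE_BYTES:
            result.unexpected_size.append(rel)
        else:
            result.present.append(rel)
    result.unknown = sorted(set(on_disk) - set(expected))
    return result
```

A size floor cannot detect a large file that lost only its tail, which is one reason the hash manifest exists alongside it.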

Check Type

SHA-256 Manifest

Every file gets a SHA-256 hash recorded in the audit manifest. Output is a JSON document with per-file entries including size, hash, and category — fully diff-able between runs.
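A manifest of this shape can be produced with the standard library alone. The sketch below is one plausible layout, not the actual audit.py output format: the "files" key and the category rules are assumptions, and only the per-file size, hash, and category fields come from the description above.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large assets are never loaded into memory whole."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def categorize(rel_path: str) -> str:
    """Hypothetical category rules based on the restructured layout."""
    if rel_path.startswith("src/functions/"):
        return "business_logic"
    if rel_path.startswith("src/web/"):
        return "web"
    if rel_path.startswith("infra/"):
        return "infrastructure"
    return "other"


def build_manifest(root: Path) -> dict:
    files = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            files[rel] = {
                "size": path.stat().st_size,
                "sha256": sha256_of(path),
                "category": categorize(rel),
            }
    return {"files": files}


def manifest_json(root: Path) -> str:
    # Sorted keys and fixed indentation keep two runs byte-comparable.
    return json.dumps(build_manifest(root), indent=2, sort_keys=True)
```

Stable key ordering is what makes the output diff-able: two manifests of an unchanged tree are identical text, so any diff line points at a real change.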

Check Type

Python Syntax Validation

py_compile runs on all .py files in src/functions/. Syntax errors are recorded in compile_results and cause exit code 1 if they affect critical path files.
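A syntax pass of this kind needs only py_compile from the standard library. In this sketch the compile_all name and the shape of the results mapping are assumptions; the bytecode is written to a scratch directory so the audit leaves the source tree untouched.

```python
import os
import py_compile
import tempfile
from pathlib import Path
from typing import Dict, Optional


def compile_all(src_dir: Path) -> Dict[str, Optional[str]]:
    """Map each .py file under src_dir to None (compiles) or an error message."""
    results: Dict[str, Optional[str]] = {}
    with tempfile.TemporaryDirectory() as scratch:
        target = os.path.join(scratch, "out.pyc")  # throwaway bytecode location
        for path in sorted(src_dir.rglob("*.py")):
            rel = path.relative_to(src_dir).as_posix()
            try:
                py_compile.compile(str(path), cfile=target, doraise=True)
                results[rel] = None
            except py_compile.PyCompileError as exc:
                results[rel] = exc.msg
    return results
```

Compiling catches truncation that lands mid-statement, but a file cut cleanly between two functions still compiles, so this check complements the size and hash checks rather than replacing them.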

Check Type

Import Resolution

Unresolved imports in governance_routes.py and code_scanner.py (the critical path) cause exit code 1. Non-critical import warnings are recorded but non-blocking.
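Import resolution can be approximated statically with the ast module, without executing any project code. The following sketch assumes local modules are flat sibling .py files and treats anything the interpreter can locate as resolved; the function names and those simplifications are mine, not necessarily how audit.py works.

```python
import ast
import importlib.util
from pathlib import Path


def exported_names(module_path: Path) -> set:
    """Top-level functions, classes, and assigned names defined in a module file."""
    tree = ast.parse(module_path.read_text(encoding="utf-8"))
    names = set()
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            names.add(node.name)
        elif isinstance(node, ast.Assign):
            names.update(t.id for t in node.targets if isinstance(t, ast.Name))
        elif isinstance(node, ast.AnnAssign) and isinstance(node.target, ast.Name):
            names.add(node.target.id)
    return names


def installed(module: str) -> bool:
    """True if the top-level package is importable (stdlib or installed)."""
    try:
        return importlib.util.find_spec(module.split(".")[0]) is not None
    except (ImportError, ValueError):
        return False


def unresolved_imports(source_path: Path, module_dir: Path) -> list:
    """List imports that match neither a local module (with the symbol) nor an installed one."""
    tree = ast.parse(source_path.read_text(encoding="utf-8"))
    problems = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                local = module_dir / f"{alias.name.split('.')[0]}.py"
                if not local.exists() and not installed(alias.name):
                    problems.append(alias.name)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            local = module_dir / f"{node.module}.py"
            if local.exists():
                available = exported_names(local)
                problems += [f"{node.module}.{a.name}" for a in node.names
                             if a.name != "*" and a.name not in available]
            elif not installed(node.module):
                problems.append(node.module)
    return problems
```

Checking the imported symbol, not just the module file, is what catches a truncated module: the file still exists, but the function that another file imports from it is gone.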

Output

Exit Code Contract

Exit code 0 = all present, all syntax valid, all critical imports resolved. Exit code 1 = at least one critical gap. The CI/CD pipeline gates deployment on exit code 0 from make audit.
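The contract reduces to a small decision function over the earlier check results. A sketch under assumed input shapes (a list of missing paths, a mapping of file to syntax error, a mapping of file to unresolved imports); the critical file paths combine the file names and the src/functions/ location given in the text.

```python
# Critical-path files named in the text, at their restructured location.
CRITICAL_FILES = frozenset({
    "src/functions/governance_routes.py",
    "src/functions/code_scanner.py",
})


def audit_exit_code(missing: list, syntax_errors: dict, unresolved: dict) -> int:
    """Return 0 only when nothing critical is wrong; otherwise 1.

    missing:       expected paths not found on disk
    syntax_errors: {path: message} for files that failed to compile
    unresolved:    {path: [import, ...]} for imports that did not resolve
    """
    if missing:
        return 1
    if any(path in CRITICAL_FILES for path in syntax_errors):
        return 1
    if any(unresolved.get(path) for path in CRITICAL_FILES):
        return 1
    return 0  # non-critical warnings are recorded elsewhere but do not block
```

Keeping the gate binary means a pipeline step only needs the process exit status, while the details stay in the JSON manifest for a human to read.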

Security Design

Managed Identity — No Secrets in App Settings

The core security decision: system-assigned Managed Identity on the Function App eliminates Cosmos DB and Storage connection strings from every environment configuration.

Managed Identity RBAC

Function App identity → Cosmos DB Built-in Data Contributor on the Cosmos account
Function App identity → Storage Blob Data Contributor on the storage account
Both role assignments provisioned atomically in Bicep — no post-deploy manual steps
No connection strings or storage keys in Function App settings

Secrets Handling

Third-party API keys (NewsAPI, GNews, MediaStack, Alpha Vantage, GitHub Token) stored as Function App settings or Key Vault references
KEY_VAULT_URL optional — when set, app settings use @Microsoft.KeyVault() references
.env.example lists all vars with placeholders — never committed with real values
AI Foundry key parameterized in Bicep — not hard-coded in any module
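The optional KEY_VAULT_URL switch described above could be implemented as a small helper that emits either the plain value or a Key Vault reference. This is a sketch of the idea only: the helper name, the setting name in the test, and the underscore-to-dash mapping are assumptions. The mapping is there because Key Vault secret names allow only alphanumerics and dashes.

```python
from typing import Optional


def setting_value(name: str, plain_value: str, key_vault_url: Optional[str] = None) -> str:
    """Return the app-setting value: a Key Vault reference when a vault URL is set."""
    if not key_vault_url:
        return plain_value
    # Key Vault secret names cannot contain underscores, so map them to dashes.
    secret_name = name.replace("_", "-")
    secret_uri = f"{key_vault_url.rstrip('/')}/secrets/{secret_name}"
    return f"@Microsoft.KeyVault(SecretUri={secret_uri})"
```

With a reference in place, the setting stored on the Function App contains only a pointer to the secret, and the platform resolves it at runtime using the app's identity, which must be granted read access to the vault.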

CI/CD Security

Azure service principal credentials stored as GitHub repository secrets
validate-env.sh pre-checks all required variables before any deployment step
Bicep @allowed(['dev', 'staging', 'prod']) on environment parameter — prevents accidental deployments to unknown environments
Branch protection on main — deploy.yml only runs after PR review

Data Access Boundaries

Static website served directly from Azure Blob Storage static hosting — no proxy
Generated documents in generated-documents blob container — served via Function App pre-signed URLs only
Cosmos DB accessible only from the Function App via Managed Identity — no public key access in prod
CORS rules on storage allow GET/POST/OPTIONS from all origins (required for SPA)

Testing Strategy

3 Correctness Properties — Hypothesis PBT

IaC and pipelines are tested via Bicep validation and actionlint. The two pure Python components — the blast radius audit tool and seed ID generator — are the right target for property-based testing: pure functions with complex input spaces.

Property 1

Audit Manifest Completeness

For any directory tree, the manifest generator records every file with a SHA-256 that matches the actual content and a size that matches the actual file size. Inventory comparison correctly categorizes each file as present (≥100 bytes), missing, unexpected_size, or unknown.

Validates: 1.1, 1.2, 1.4

Property 2

Import Resolution Correctness

For any Python file with import statements and any set of module files, the resolver reports an import as resolved if and only if a corresponding .py file exists and contains the expected exported symbols. No false positives, no false negatives.

Validates: 1.6

Property 3

Seed ID Determinism

For any (type, name) string pair — generate_document_id() returns the same ID on every call, and two different (type, name) pairs never produce the same ID. Tested with arbitrary text() strategies, 100+ iterations.

Validates: 7.4
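The uniqueness half of this property is easy to break: joining type and name with a separator makes ("a:b", "c") and ("a", "b:c") collide. One way to satisfy both halves is to hash an unambiguous encoding of the pair, sketched below. The body of generate_document_id is an assumption (only its name appears above), and a seeded standard-library loop stands in for Hypothesis so the example has no third-party dependency; with Hypothesis, the loop body would sit under @given with four text() arguments.

```python
import hashlib
import json
import random
import string


def generate_document_id(doc_type: str, name: str) -> str:
    """Hash a JSON-encoded pair; JSON quoting keeps the field boundary unambiguous."""
    canonical = json.dumps([doc_type, name], ensure_ascii=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def check_id_properties(iterations: int = 200, seed: int = 0) -> None:
    """Randomized check of determinism and distinctness for generated pairs."""
    rng = random.Random(seed)
    alphabet = string.ascii_letters + ':|," '  # include likely separator characters

    def rand_text() -> str:
        return "".join(rng.choice(alphabet) for _ in range(rng.randint(0, 8)))

    for _ in range(iterations):
        a = (rand_text(), rand_text())
        b = (rand_text(), rand_text())
        # Determinism: repeated calls agree.
        assert generate_document_id(*a) == generate_document_id(*a)
        # Distinctness: different pairs map to different IDs.
        if a != b:
            assert generate_document_id(*a) != generate_document_id(*b)
```

Strictly speaking, no fixed-length hash can guarantee that distinct inputs never collide; with SHA-256 the probability is negligible, and the property test guards against the realistic failure, which is an ambiguous encoding rather than a hash collision.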

Additional Testing — Bicep, Pipeline YAML, Seed Idempotency

Bicep validation: az bicep build on all 5 Bicep files catches syntax errors before deployment. az deployment group what-if against a test resource group validates resource definitions without provisioning.
Pipeline YAML lint: actionlint validates deploy.yml and infra.yml syntax and GitHub Actions expression references.
Seed idempotency: seed_all.py runs twice against a Cosmos DB emulator; document counts are identical after both runs.
Python compile check: py_compile on all .py files in src/functions/ runs as the first build stage — catches syntax regressions before packaging.
Smoke test suite: Post-deploy verification of health, static website, governance API, and Cosmos DB connectivity with 3-retry logic.

Design Decisions

Key Engineering Choices

Why Bicep over Terraform or ARM JSON?

Bicep is Azure-native, produces significantly cleaner templates than ARM JSON (no nested dependsOn boilerplate, real parameter types, module composition), and requires no state file. The team already uses Azure CLI — no new tooling needed. Terraform would add state file management complexity and a HashiCorp layer. ARM JSON is too verbose to maintain. Bicep's module decomposition mirrors the CloudFormation nested stack approach used in the AWS version, making the two deployments structurally symmetrical.

Why one seed runner instead of keeping the 18 numbered scripts?

The 18+ numbered scripts accumulated organically — 01_create_containers.py, 02_seed_frameworks.py, 07_fix_pattern_ids.py, etc. They had overlapping responsibilities, inconsistent error handling, and had to be run in a specific order that wasn't encoded anywhere. The SeedRunner class consolidates everything: ordered phases, idempotent upserts with deterministic IDs, a SeedReport return value, and a single CLI entrypoint. Running it from scratch or re-running it on an existing database produces identical state either way.

Why Managed Identity instead of connection strings?

Connection strings are secrets that rotate, get committed accidentally, appear in logs, and require manual distribution across environments. Managed Identity eliminates all of that: the Function App's system identity is assigned RBAC roles directly on the Cosmos DB account and Storage Account in Bicep, with no key material involved. Rotation is automatic. The only secrets that remain are third-party API keys (NewsAPI, Alpha Vantage, etc.) which have no Managed Identity equivalent — those go in Function App settings or Key Vault references.

Why Consumption Plan for dev and Premium EP1 for prod?

Consumption Plan (Y1) costs nothing when idle — ideal for dev/staging where the app sits unused most of the day. Its cold start behavior (1–3 second delays) is acceptable during testing. Premium EP1 eliminates cold starts entirely via always-warm instances, supports VNet integration for private endpoint access, and provides predictable latency under production load. The trade-off (always-on billing) is justified for prod. The Bicep parameter file makes the switch a single line change: functionAppSku = 'EP1'.

Why the blast radius audit tool? What was the file loss event?

During an earlier development session, a file restructuring operation (moving the flat prototype into the new directory layout) resulted in some files being partially written or silently truncated — specifically large JS files like governance-app.js (8,600 lines). The corruption wasn't immediately visible in the terminal output. The audit tool was designed as a post-restructuring verification step: generate a SHA-256 manifest of the expected inventory, check every file is present and above the minimum size threshold, validate Python syntax, and check import resolution. It runs in under a second and provides a machine-readable exit code that CI/CD can gate on.

How does this Azure deployment compare to the AWS version?

The Azure version is the original: the platform existed on Azure before the AWS deployment was designed. The key structural difference is that the Azure version required no cloud adapter layer, because the business logic was already written against Azure SDKs. The AWS deployment added the full adapter abstraction so that the same 7,500 lines of business logic could stay unchanged while the cloud backend was switched. Both use the same shared seed data (frameworks.py, patterns.py), deterministic document IDs, and idempotent upsert strategy. The Bicep module structure was intentionally mirrored in CloudFormation nested stacks for the AWS version.

7 Cosmos DB Containers

governance-projects

governance-artifacts

governance-frameworks

governance-patterns

governance-audit

stock-data

news-articles
