Specialty Jewelry Retailer is the world's largest jewelry retailer — operating jewelry brand A, jewelry brand B, jewelry brand C, H.Samuel, Ernest Jones, and a portfolio of banner brands with thousands of retail locations across North America and the UK. Their business is one of the most seasonally concentrated in retail: the Oct–Feb window (Holiday Rush through Valentine's Day) represents the majority of their annual revenue, which means their analytics and data platform must be sized for peak retail season, not average load. The engagement was a targeted AWS Professional Services migration for their analytics stack — Hadoop, Solr, Oracle, Informatica ETL, and Mulesoft — not a general estate assessment. This made the pricing exercise uniquely demanding: you can't use average utilization for cluster sizing; you have to model peak compute capacity, peak IO throughput, and peak data volumes simultaneously.
The three-scenario TCO model (Like-for-Like from inventory, Rightsized from actual usage, and Peak Usage for holiday season) reflects exactly this complexity. The Hadoop worker nodes are r5.8xlarge instances — 256GB RAM, 32 vCPUs, IO2 with 20K IOPS — because the data processing workload during holiday shopping season requires both compute headroom and sustained disk throughput. The model also included an important architectural decision: should Mulesoft's ActiveMQ be lifted and shifted to EC2, or should Specialty Jewelry Retailer migrate to Amazon's native managed messaging service (AmazonMQ)? We modeled both options in the same workbook and flagged the trade-off explicitly.
Use this when selling to: Specialty retailers, luxury goods companies, and any retail or e-commerce organization with a Hadoop/Spark analytics platform and strong seasonal demand patterns. Also strong for any client evaluating a targeted analytics platform migration separate from their general estate, or clients at the "should we use managed AWS services vs. lift-and-shift" decision point.
Additional PROD services: AWS WAF ($9,312/mo, 20 Web ACLs × 20 rules each), AWS Transit Gateway (data transfer), IBM Tivoli monitoring (1 × m6i.large). DEV/TEST include parallel stacks of all five platforms plus Solr and messaging layers. Total modeled nodes across all environments: ~70+ instances.
| Stack | Nodes | AWS Instance | On-Demand /mo | 3yr RI /mo | Notes |
|---|---|---|---|---|---|
| Hadoop Workers | 11 | r5.8xlarge (256G / 32 vCPU / 1.7TB IO2) | $106,170 | $96,975 | Peak-season sizing; IO2 20K IOPS per node |
| Hadoop Masters | 3 | m5.8xlarge (128G / 32 vCPU / 1.2TB IO2) | $8,201 | $6,292 | IO2 10K IOPS per node |
| Oracle RDS | 1 (MultiAZ) | db.m5.4xlarge (64G / 16 vCPU / 2.5TB IO1) | $12,094 | $10,568 + $10,512 upfront | IO1 50K IOPS; 2nd node covered by MultiAZ pricing |
| Solr | 4 | m6i.8xlarge (128G / 32 vCPU / 1.7TB GP3) | $5,422 | $2,971 | Search index caching requires high-memory tier |
| AWS EMR | Managed | Managed Hadoop Service | $2,759 | $2,759 | Managed alternative to EC2 Hadoop; no RI option |
| Informatica ETL | 4 | r6i.4xlarge (128G / 16 vCPU / 2TB GP3) | $3,364 | $1,756 | Memory-optimized for large ETL transformations |
| Mulesoft ESB | 6 | m6i.xlarge (16G / 4 vCPU / 90GB GP3) | $1,147 | $688 | 3yr No Upfront RI |
| ActiveMQ | 6 | m6i.xlarge (16G / 4 vCPU / 90GB GP3) | $1,147 | $688 | OR AmazonMQ managed broker — both priced; select one |
| AWS WAF | Service | 20 ACLs × 20 rules/ACL | $9,312 | $9,312 | Web Application Firewall; no RI; $370 one-time setup |
| Tivoli Monitoring | 1 | m6i.large (8G / 2 vCPU / 64GB GP3) | $119 | $81 | IBM Tivoli monitoring node |
| PROD Subtotal (ex-TGW) | ~$149,735/mo | ~$131,130/mo | Transit Gateway data transfer additional (variable by volume) |
the retailer's analytics platform cannot be sized for average load — it must be sized for peak. The Oct–Feb window (Holiday Rush: Christmas, Valentine's Day) plus Back to School represents the high-demand period for jewelry retail. In the TCO model, 150 days/year are designated as peak days. During peak, Hadoop workers process significantly higher data volumes: more transactions, more customer behavioral data, more inventory signals, more search queries feeding the Solr layer. The Hadoop workers are r5.8xlarge instances (256G RAM, IO2 20K IOPS, 1.7TB storage) — not because average utilization requires this, but because the peak workload does.
This creates an interesting cost optimization opportunity: cloud elasticity means you don't have to keep 11 r5.8xlarge workers running 24/7 year-round. The model explicitly identified a $25K/month or $500K+/year savings from automating DEV/UAT/PROD scaling — spinning down or downsizing non-production environments outside peak windows, and implementing auto-scaling for the production Hadoop cluster during off-peak periods. This is the direct counter-argument to "the cloud is more expensive than on-prem" for seasonal workloads: on-prem is always sized for peak; cloud can be sized for average and burst for peak.
The three-scenario pricing model makes this tangible: Like-for-Like (always-on peak sizing) vs. Rightsized (average utilization baseline) vs. Peak Usage (holiday throughput). The delta between the Rightsized and Peak scenarios represents the cloud elasticity dividend — the cost you'd incur on-prem every day but only need to pay in cloud during the ~150 peak days.