Senior Technical Program Manager | Cloud Infrastructure, AI/ML Platforms, and Responsible AI Governance
I lead large, cross-functional infrastructure and platform programs in financial services and high-scale tech. 14+ years turning ambiguous, multi-team technical bets into shipped, measurable outcomes: resilient platforms, zero-downtime migrations, and governance that scales.
Focus: Cloud infrastructure (GCP / GKE) | AI/ML and LLM platforms | Responsible AI governance | Cybersecurity | FinOps and capacity optimization
| Program | Problem solved | Outcome |
|---|---|---|
| Distributed Authentication Platform | Legacy auth failing under peak load and regional outages | Multi-region, zero-downtime design at 7K+ TPS, 99.9% availability |
| Cloud Cost Intelligence Platform | No real-time cost attribution across teams | ~$2.3M annual savings, ~30-35% BigQuery cost reduction, 8+ teams onboarded |
| Responsible AI Governance Framework | Inconsistent, unauditable model risk decisions | Operating model, lifecycle gates, and policy for AI/ML governance |
| LLM Platform Program | No structured path from LLM prototype to production | Reference architecture, eval rubric, and phased rollout plan |
| Model Eval and Release Pipeline | Models shipped without consistent quality gates | Runnable evaluation and release-gating pipeline with CI |
| Cloud Migration Readiness Framework | No visibility into dependencies and go-live risk | Centralized readiness and dependency governance across 40+ components |
- MongoDB Atlas blue/green sharding migration: sub-50ms p99, zero-downtime cutover
- AIOps and Responsible AI governance program: ~40% engineering velocity gain
GCP / GKE | AWS | MongoDB Atlas | BigQuery | Kafka | Terraform | Grafana | Jira / Confluence | GitHub Copilot
Repositories here are program case studies based on real work. Metrics are from production programs; code is illustrative unless a repo states otherwise.