Crusoe Energy
AI & HPC managed services
Building managed infrastructure for GPU-intensive AI training and inference workloads — Kubernetes, Slurm, and the operational tooling around them.
// Senior Staff Software Engineer
Distributed systems · AI infrastructure · Crusoe Energy
I design and operate the distributed systems that keep production running at scale — and the harder the operational problem, the more I want it. If I do my job right, you have no idea I exist.
Distributed systems at scale
Architecting the data, compute, and control planes that run production — and keeping them running when they're under load.
Operating at production scale
On-call, incidents, capacity, observability — the operational discipline that turns a working system into a reliable one.
DevOps & GitOps
Self-service CI/CD, infrastructure-as-code, container delivery, and the deployment automation behind two decades of production systems.
AI & HPC infrastructure
Managed services for GPU-intensive training and inference. Kubernetes, Slurm, and heterogeneous compute at scale.
Workday
Two-phase migration architecture (DataSync for transfer, EMR Spark for transformation) that decoupled data copy from transformation logic, which allowed thorough validation and downstream reuse. Delivered without disrupting production workloads.
Workday
Architected and operated the EKS-based telemetry platform that replaced a sprawl of per-system tooling — single pane of glass across every environment, lower capex, and the operational signal engineers actually trusted.
Symantec
Led the production deployment of Symantec Endpoint Protection Cloud — the company's first SaaS product — and built the self-service CI/CD pipeline behind it from scratch.
| Years | Where | What |
|---|---|---|
| 2025 – now | Crusoe Energy | AI & HPC infrastructure. Managed services for GPU workloads at scale. |
| 2019 – 2024 | Workday | Distributed infrastructure, DevOps tooling, and fleet-wide observability. DataLake migration to AWS. Kubernetes platform for public-cloud delivery with zero-downtime deploys. |
| 2014 – 2019 | Symantec | Cloud security. First SaaS product to production. Established in-house DevOps practice — self-service CI/CD, IaC, and microservice containerization with Docker & Kubernetes. |
| 2008 – 2014 | USC · NASA JPL | MS & PhD coursework. Earth-science data systems at JPL. Built a Git-based assignment-delivery and grading pipeline as a TA: early DevOps instincts. |
| 2004 – 2008 | KFUPM | BS, Computer Engineering. Hardware-software fundamentals. |
Twenty years of building distributed systems that have to stay up — not because of a master plan, but because the harder problems kept being more interesting than the easier ones. Started in security at Symantec, moved into infrastructure engineering and cloud architecture at Workday, now working on AI/HPC at Crusoe.
I tend to be the person on the team who picks up the thing nobody else wants to own — the migration that has to be invisible, the system that has to work for ten different teams with conflicting needs, the launch that can't slip. Patient with detail, allergic to drama, comfortable on the bridge when production is on fire.
Stack
// elsewhere
I don't post much, but this is where I am when I do. The fastest way to reach me is LinkedIn.