About the Role
We’re scaling AI agents that think, plan, and act, and you’ll be the one giving them the firepower. As an ML Infra Engineer, you’ll own the foundations that let our models train faster, deploy more smoothly, and learn more efficiently. This is a high-impact role for someone who lives at the bleeding edge of infra + intelligence.
You’ll be responsible for:
- Building and maintaining the infrastructure behind our ML training and inference stacks
- Optimizing GPU usage, training pipelines, and model serving at scale
- Collaborating with ML engineers to make every training run count
- Keeping our training and inference systems fast, reliable, and transparent
We’d love to see:
- 4+ years of experience in ML infrastructure, MLOps, or performance engineering
- Experience with orchestration tools (e.g., Kubernetes, Ray, Airflow)
- Deep knowledge of GPU tuning, distributed systems, and observability
- A strong desire to unblock others and improve the dev loop