Staff Machine Learning Engineer – ML Platform / AI Infrastructure
AI Healthcare Startup | Clinical AI at Scale | Hybrid (San Francisco)
We’re hiring a Staff Machine Learning Engineer (ML Platform) to join a leading healthcare AI company building production systems that power clinical AI across millions of patient encounters.
This role sits at the intersection of ML and platform engineering. You’ll own the infrastructure that enables fast, reliable iteration on model quality, and build the systems that allow ML teams to ship improvements in days, not weeks.
You’ll work on:
- Evaluation systems, release gates, and automated grading
- Observability, tracing, and debugging for ML workflows
- Chart context retrieval and data pipelines for model inputs
- Feedback loops that convert real-world usage into training signal
- Preference systems for clinician-specific model behavior
- Model serving, performance, and reliability at scale
What You’ll Do:
- Build and scale evaluation and release infrastructure for ML models
- Develop tooling to debug, reproduce, and analyze model regressions
- Design data pipelines for large-scale, unstructured clinical data
- Improve latency, reliability, and observability across ML systems
- Enable faster experimentation and iteration across ML teams
What We’re Looking For:
- 5+ years in ML engineering or software engineering with an ML focus
- Strong backend skills (Python, TypeScript, or similar)
- Experience with ML systems, MLOps, or data infrastructure
- Track record of improving model development velocity or quality
- Comfortable operating across ML, infra, and platform layers
Nice to Have:
- Experience with evaluation systems or ML observability
- Background in healthcare or regulated environments
- Experience with large-scale retrieval or long-context systems
Location and Compensation:
- Hybrid — San Francisco (3 days onsite)
- $250K–$300K base + equity