Senior DevOps Engineer

Lumicity

Santa Rosa, CA

Senior DevOps / Site Reliability Engineer

Location: San Francisco Bay Area (Hybrid)
Level: Senior
Type: Full-Time

The company

We're a Series B healthcare AI company with rapidly growing revenue. More than 100 enterprise healthcare organizations use our platform to automate complex, compliance-critical operational workflows: the kind of work that used to require large manual teams and still carries serious downstream risk if it breaks.

We're about 100 people, well-funded, and at an inflection point: our platform is scaling fast, our engineering team is growing, and reliability is becoming mission-critical. This isn't a company that's been around long enough to accumulate decades of technical debt. You'd be building the right foundation from the start.

The role

We're hiring a Senior DevOps Engineer or Site Reliability Engineer — depending on where your experience and interests land.

Both roles sit within our engineering team, report into engineering leadership, and work closely with backend and ML engineers. The difference is in focus:

DevOps track: Infrastructure as code, CI/CD, deployment systems, developer experience, and platform reliability.
SRE track: Observability, incident management, SLO frameworks, and production reliability across distributed systems.

Whichever track you're on, this is a hands-on, high-ownership role. You'll have real production responsibility and real impact on how the platform performs at scale.

What you'll work on

Design and evolve AWS-based cloud infrastructure using Terraform
Own and improve CI/CD pipelines (GitHub Actions) for fast, safe deployments
Standardize deployment patterns across serverless workloads (Lambda), containerized services (ECS), and workflow orchestration systems
Define observability standards across metrics, logs, and traces using OpenTelemetry, Datadog, Grafana, and Sentry
Build proactive detection for reliability risks, latency regressions, and performance degradation
Partner with backend and ML teams to debug distributed system issues, including Postgres performance
Lead and support incident response and root cause analysis
Automate security and compliance workflows (access controls, audit readiness, vulnerability management)
Participate in the on-call rotation

What we're looking for

Must have:

7+ years in DevOps, SRE, or infrastructure engineering in a B2B SaaS environment
Strong production AWS experience
Deep hands-on Terraform (IaC) experience
CI/CD pipeline ownership (GitHub Actions or equivalent)
Experience with serverless and containerized services in production
Postgres in production (performance, tuning, operations)
Observability tooling: metrics, logs, traces — and the ability to turn signals into action
Scripting fluency (Python, Bash, or similar)
High-ownership mindset: you're not waiting to be assigned an incident; you're already thinking about failure modes

Nice to have:

Experience in healthcare, fintech, or other regulated environments
ClickHouse or high-scale analytics systems
OpenTelemetry and modern observability architecture
ML infrastructure experience

Why join now

Define reliability and infrastructure standards before they calcify
Tight collaboration with product, backend, and ML — no siloed infra team
Meaningful equity in a company with strong investor backing and real market traction
Modern cloud-native stack: AWS, Terraform, GitHub Actions, ECS, Lambda, Aurora Postgres, Datadog, OpenTelemetry
