Director of AI (FDE)

Colossus Technologies Group
San Francisco, CA

Director of Agent Systems Engineering (Forward Deployed Engineering)

Location: Remote (U.S.) — Monthly team meetups

Compensation: Up to ~$300K base + equity


About the Role

We’re partnering with a fast-growing healthtech company building AI agents that operate real clinical workflows — including patient intake, administrative automation, and clinician support.

After the company deployed these systems in production, one thing became clear:

The hard problem isn’t the model — it’s making the system reliable enough to run real workflows.

These workflows are long-running, touch multiple systems (EHRs, APIs, internal tools), and require correctness, traceability, and resilience when things break.

To support this, the company is scaling its Forward Deployed Engineering (FDE) function from ~20 to 50 engineers this year.

We’re hiring a Director of Agent Systems Engineering to lead part of this organization.


What You’ll Do

This role sits at the intersection of AI engineering and real-world deployment.

You will lead teams responsible for turning complex workflows into production-grade agent systems — and ensuring those systems are reliable, observable, and repeatable.

Key responsibilities include:

  • Designing repeatable delivery systems for deploying agent workflows into production
  • Mentoring and scaling engineering pods, driving execution and technical excellence
  • Planning capacity and keeping delivery predictable across multiple concurrent deployments
  • Setting the technical bar for reliability, observability, and system correctness
  • Driving architecture and system design, including debugging multi-step workflows and failure modes
  • Feeding learnings back into the core platform as reusable primitives and abstractions

This is a hands-on leadership role — you’ll be close to architecture, system behavior, and real production issues.


What We’re Looking For

We’re looking for engineers who think in systems, not just models.

Strong candidates will have:

  • Experience building and scaling distributed systems or platform infrastructure
  • Exposure to AI/LLM-based systems, ideally including agent workflows or orchestration
  • A deep understanding of reliability, observability, and failure handling in production systems
  • Experience working with complex, multi-step workflows across multiple services or APIs
  • A track record of turning repeated patterns into reusable platform capabilities
  • Leadership experience managing teams and driving delivery in ambiguous environments

Backgrounds may include:

  • Platform / infrastructure engineering
  • ML platform or AI systems
  • Workflow orchestration / distributed systems
  • Forward deployed or customer-facing engineering roles
