Staff Software Engineer, AI Infrastructure

Harrison Clarke
San Francisco, CA

LLM Infrastructure / Agent Runtime Engineer

San Francisco, CA (Hybrid)


Harrison Clarke is partnering with a well-funded AI company building foundational infrastructure for the next generation of autonomous AI systems.


We're seeking an exceptional LLM Infrastructure Engineer to help design and scale the systems that power production-grade AI agents. This is a rare opportunity to work at the intersection of distributed systems, model orchestration, reasoning frameworks, and developer platforms as AI moves from simple chat interfaces to fully autonomous software.

You'll join a team building the runtime layer that enables agents to plan, reason, execute actions, use tools, manage memory, and operate reliably at scale.


What You'll Work On

  • Design and build highly scalable agent execution runtimes capable of handling millions of model invocations and tool calls
  • Develop orchestration systems for multi-agent workflows, planning, task decomposition, and long-running autonomous processes
  • Build infrastructure for memory management, context retrieval, state persistence, and agent observability
  • Create reliable execution frameworks for tool use, function calling, code execution, and external integrations
  • Optimize latency, throughput, reliability, and cost across large-scale LLM deployments
  • Develop evaluation, monitoring, tracing, and debugging systems for agent performance
  • Collaborate closely with research and applied AI teams to productionize cutting-edge agent architectures
  • Help define the infrastructure layer powering the next generation of AI-native applications


What We're Looking For

  • Strong software engineering fundamentals with expertise in distributed systems and backend infrastructure
  • Experience building large-scale systems in Python, Go, Rust, or similar languages
  • Deep understanding of modern LLM architectures and inference workflows
  • Experience working with agent frameworks, orchestration systems, or AI infrastructure platforms
  • Familiarity with vector databases, retrieval systems, memory architectures, and context management
  • Strong knowledge of cloud infrastructure, Kubernetes, containerization, and production deployment practices
  • Experience designing APIs, SDKs, and developer-facing platforms
  • Ability to operate across infrastructure, platform, and AI application layers


Strongly Preferred

  • Experience with LangGraph, OpenAI Agents SDK, AutoGen, CrewAI, Temporal, Prefect, or similar orchestration frameworks
  • Experience building agent evaluation pipelines and observability tooling
  • Familiarity with model serving frameworks such as vLLM, TensorRT-LLM, TGI, or Ray Serve
  • Knowledge of distributed workflow engines and event-driven architectures
  • Experience scaling AI products from prototype to production


Why This Opportunity?

  • Work on foundational technology defining how autonomous AI systems operate in production
  • Join a high-caliber team of engineers and researchers from leading AI labs and infrastructure companies
  • Significant technical ownership and influence over core platform architecture
  • Competitive compensation package including meaningful equity
  • Backed by top-tier investors with substantial runway


If you're excited about building the infrastructure layer that enables AI agents to become reliable, scalable, and production-ready, we'd love to speak with you.
