This position may be filled at multiple levels depending on the candidate’s experience and qualifications.
We are building a modern, software-defined infrastructure platform designed to support the next generation of applications across the organization.
The platform is built on open-source technologies, commodity hardware, and automation-first principles. It provides the core compute, storage, networking, and container runtime capabilities required to run distributed systems at scale across both data centers and cloud environments.
This is a greenfield engineering effort focused on defining how infrastructure is built and operated using code, APIs, and declarative systems, with reliability, observability, and repeatability designed in from the start.
Our approach emphasizes:
- Linux-first systems design
- Kubernetes as a core abstraction layer
- Infrastructure-as-code and GitOps workflows
- Open observability standards (Prometheus, OpenTelemetry)
- Distributed, software-defined storage and networking
Engineers on this team build and own foundational systems that applications across the organization rely on for secure, reliable operation.
Role Overview
This role is responsible for designing, building, and operating the foundational infrastructure that powers the company’s technology platform: the compute, storage, networking, and container infrastructure that supports enterprise applications, internal platforms, and hybrid cloud environments.
The focus is on delivering reliable, scalable, and automated infrastructure platforms across data centers and cloud environments. Infrastructure engineers operate the platforms that modern workloads depend on, including virtualization platforms, storage systems, networking, and container infrastructure such as Kubernetes clusters.
Key Responsibilities
Infrastructure Engineering
- Design, deploy, and operate infrastructure platforms including compute, storage, networking, and container infrastructure
- Build and maintain scalable infrastructure across on-premises data centers and cloud environments
- Operate and support Kubernetes clusters and their underlying infrastructure
- Ensure high availability, reliability, and performance of infrastructure systems
- Support hybrid infrastructure environments and the platform services that run on them
Automation & Infrastructure as Code
- Develop and maintain infrastructure automation using modern programming languages (Go, Python, Java)
- Implement infrastructure provisioning and configuration through infrastructure-as-code tools such as Terraform
- Standardize infrastructure deployment and lifecycle management
- Build tooling that improves operational efficiency and infrastructure reliability
Platform Integration
- Support infrastructure dependencies for container platforms and distributed systems
- Deploy, upgrade, and maintain Kubernetes clusters and related infrastructure components
- Operate infrastructure services including IaaS platforms and storage systems
- Collaborate with platform engineering teams supporting CI/CD, messaging, observability, and developer platforms
Observability & Reliability
- Implement monitoring and observability using Prometheus, Grafana, and OpenTelemetry
- Participate in incident response and root cause analysis
- Contribute to reliability improvements and operational maturity
Security & Access Management
- Implement infrastructure security best practices
- Support identity and access management (IAM) systems and secrets management systems
- Collaborate with security teams to ensure infrastructure resilience and compliance