AWS SRE Engineer

Cliff Services Inc
Austin, TX

Job Summary

We are seeking a highly skilled AWS Site Reliability Engineer (SRE) with a strong Java background to support and scale cloud-native applications. The ideal candidate will ensure the reliability, performance, and scalability of distributed systems running on AWS.

Key Responsibilities

  • Design, implement, and manage scalable infrastructure on AWS
  • Monitor system performance, availability, and reliability using observability tools
  • Develop automation solutions for deployment, monitoring, and incident response
  • Troubleshoot production issues and perform root cause analysis
  • Collaborate with development teams to improve application reliability and performance
  • Implement CI/CD pipelines and infrastructure as code (IaC) practices
  • Ensure system security, compliance, and cost optimization in AWS environments

Required Skills

  • Strong hands-on experience with AWS services (EC2, S3, Lambda, RDS, CloudWatch, etc.)
  • Solid programming background in Java
  • Experience with containerization tools (Docker, Kubernetes)
  • Expertise in monitoring/logging tools (Prometheus, Grafana, ELK, CloudWatch)
  • Experience with CI/CD tools (Jenkins, GitHub Actions, etc.)
  • Knowledge of Infrastructure as Code (Terraform, CloudFormation)
  • Strong understanding of distributed systems and microservices architecture

Preferred Qualifications

  • Experience working in SRE/DevOps environments
  • Familiarity with incident management and on-call support
  • Exposure to high-availability and fault-tolerant system design
  • Experience with Agile/Scrum methodologies

Experience

  • 5+ years of relevant experience in SRE/DevOps roles
  • Proven experience supporting production systems in AWS

Additional Notes

  • Strong communication and collaboration skills required
  • Ability to work in a fast-paced, high-impact environment