SRE – Application / Platform SRE (USC & GC Only)

Ampstek
Bloomfield, CT

Job Title: SRE – Application / Platform SRE

Location: Bloomfield, CT (Onsite)

Job Type: 12+ Months Contract


Job Description:

Experience in monitoring, troubleshooting, performance tuning, capacity planning, and automation, along with strong exposure to distributed data processing frameworks like Spark, Flink, and Kafka.

Hadoop Cluster Administration & Operations

• Ensure 24x7 system reliability, incident response, and operational readiness for global applications.

• Lead troubleshooting efforts during outages/performance incidents; perform root cause analysis (RCA) and implement preventive actions.

• Define and maintain operational metrics and reliability goals (availability, latency, throughput, resource utilization).

• Improve system stability via proactive monitoring, alerting, and capacity planning.

Big Data & Streaming Support

• Support deployments and operations across AWS Cloud, Kubernetes, and containerized environments.

• Implement and maintain cluster reliability in Kubernetes environments: resource quotas, access control, permissions, and namespace management.


Thanks & Regards


Himanshu Verma | Recruiter – US Staffing

Email: Himanshu.v@ampstek.com | Desk: (609)-527-8914

Ampstek LLC – Global IT Partner | www.ampstek.com
