Job Title: Data Centre Operations Lead
Location: Rockville, MD - Onsite from day 1 - Contract to hire
Duration: 6 months - Contract to hire
Job Description:
• Lead the data center operations team, providing guidance, training, and support to ensure high performance and operational excellence. Act as the primary point of contact for all data center-related issues and escalations.
• Oversee the daily operations of data center facilities, ensuring high availability and reliability of all systems.
• Manage the data center infrastructure technology stack end to end: VMware, VxRail, Citrix, LogicMonitor, Moogsoft, AD, Azure AD SSO, Azure Security Policy, PKI, Windows and Linux servers, vulnerability management, BeyondTrust Password Safe and AD Bridge, and storage and backup tools.
• Ensure adherence to operational standards and best practices.
• Drive major and potential incidents end to end, providing periodic updates to client stakeholders for approvals and recommendations.
• Lead, mentor, and manage a team of data center operations engineers.
• Provide guidance and support for professional development and performance improvement.
• Coordinate and manage the team's daily activities, ensuring alignment with organizational goals and priorities.
• Lead the response to data center incidents, ensuring timely resolution and minimal impact on business operations.
• Perform root cause analysis and implement preventive measures to avoid recurrence of issues.
• Develop and maintain incident management processes and procedures.
• Plan and oversee scheduled maintenance and upgrades of data center infrastructure.
• Ensure that all hardware and software components are up to date and functioning optimally.
• Coordinate with vendors and service providers for maintenance and support activities.
• Monitor and analyze data center resource usage, ensuring efficient utilization and avoiding over-provisioning.
• Conduct capacity planning to support future growth and demand.
• Implement optimization strategies to enhance performance and reduce operational costs.
• Ensure data center infrastructure adheres to security policies, standards, and best practices.
• Implement and maintain security controls to protect data and systems.
• Ensure compliance with regulatory requirements and industry standards (e.g., ISO 27001, HIPAA).
• Develop and implement disaster recovery and business continuity plans for data center operations.
• Ensure regular testing and validation of disaster recovery procedures.
• Ensure data center infrastructure is resilient and can recover quickly from failures or disruptions.
• Work closely with other IT teams, business units, and stakeholders to understand requirements and deliver solutions that meet their needs.
• Collaborate with vendors and service providers to evaluate and integrate new technologies and services.
• Communicate effectively with stakeholders, providing regular updates on data center operations and performance.
• Maintain comprehensive documentation of data center infrastructure, configurations, processes, and procedures.
• Generate regular reports on data center performance, incidents, and operational metrics.
• Ensure documentation is up to date and accessible to relevant stakeholders.
Qualifications we seek in you!
Minimum Qualifications / Skills
• Bachelor’s degree in Computer Science, Information Technology, Electrical Engineering, or a related field. Advanced degrees or relevant professional training are a plus.
• Minimum 10 years of experience in data center operations, with at least 5 years in a leadership or senior technical role.
• Extensive experience in data center operations, with a proven track record of managing large-scale data center environments.
• Strong leadership and team management skills, with the ability to motivate and develop a high-performing operations team.
• In-depth knowledge of data center infrastructure, including servers, storage, networking, power, and cooling systems.
• Excellent problem-solving and analytical skills, with the ability to diagnose and resolve complex technical issues.
• Experience with incident and problem management, change management, and capacity planning.
• Strong understanding of compliance, security, and regulatory requirements related to data center operations.
• Effective communication and interpersonal skills, with the ability to interact with stakeholders at all levels.
• Experience in vendor management and contract negotiations.
• A proactive approach to continuous improvement and innovation in data center operations.
Preferred Qualifications / Skills
• Relevant certifications from Microsoft, VMware, Citrix, and storage vendors are highly desirable.
• Experience with ITIL or other IT service management frameworks.
• Familiarity with cloud computing and hybrid data center environments.
• Excellent communication and collaboration skills, with the ability to effectively interact with technical and non-technical stakeholders at all levels of the organization.
• Strong analytical and problem-solving skills, with the ability to identify root causes of issues and implement effective solutions in a timely manner.
• Proven ability to work independently as well as part of a team, with a proactive and self-motivated attitude towards achieving project goals.
• Expert in Active Directory/Azure AD/SSO/App Proxy/PKI (L3/L4).
• Expert in VxRail/VMware and server hardware management in data centers (L3/L4).
• Expert in networking (DNS/DHCP) and troubleshooting (L3/L4).
• Expert in Windows Server OS troubleshooting (L3/L4); Linux experience is a plus.
• Experience in cloud compute/management (AWS/Azure).
• Experience in observability (monitoring/AIOps) on AWS/Azure.
• Experience in automation with PowerShell and other scripting solutions.
• Good understanding of laptop/Intune management.
• Good understanding of, and experience with, backup and storage.