IBM Tivoli/IBM Workload Scheduler (TWS/IWS)

Sira Consulting, an Inc 5000 company
Warsaw, IN

Job Title: Workload Automation (IBM Tivoli-TWS/IWS)

Location: Warsaw, IN (Onsite)

Job Type: Contract

Duration: 6 Months


Role Overview

We are seeking a Lead Workload Automation Engineer / Architect to define and drive the enterprise architecture, strategy, and operational model for IBM Tivoli/IBM Workload Scheduler (TWS/IWS) across distributed environments (on-prem and cloud). This role sets platform standards and reference designs, leads modernization and major upgrades/migrations, governs reliability and security practices, and serves as the senior technical partner for application, database, and infrastructure organizations to deliver resilient, scalable scheduling services for mission-critical workloads. The role also assists with and supervises two job scheduling teams.

Key Responsibilities

• Own the end-to-end architecture for the TWS/IWS platform (components, topology, environments, integrations), including standards, patterns, and reference implementations.

• Provide technical oversight for additional (3rd-party) job scheduling platforms where used; establish operating standards, integration patterns, and support processes to ensure consistent controls and reliability.

• Lead enterprise-scale installations, upgrades, and migrations; define cutover/rollback strategies, coordinate change windows, and ensure readiness across dependent teams.

• Lead assessments of legacy scheduler instances and batch frameworks to identify candidates for retirement, consolidation, or migration; produce target-state recommendations, sequencing/roadmaps, and risk-based migration plans.

• Define reliability engineering practices for workload automation: availability targets, capacity planning, performance tuning, monitoring/alerting, and continuous improvement.

• Design and validate high-availability and disaster recovery solutions (including DB2 HADR where applicable); plan and execute regular DR tests and remediate gaps.

• Establish governance for workload onboarding and job design: scheduling standards, dependency modeling, naming conventions, calendars, critical path optimization, and SLA/SLO management.

• Architect and productionize automation for platform operations and self-service (e.g., provisioning, reporting, batch controls) using shell/Python/Perl and enterprise tooling.

• Own security and compliance posture: access model (LDAP/SSO), least-privilege controls, audit evidence, vulnerability remediation, and secure configuration baselines.

• Manage and develop two teams (e.g., platform engineering and operations): set priorities and operating rhythms, oversee delivery and support outcomes, coach/mentor team members, and drive performance management in partnership with leadership.

• Be available for major outages and critical events related to job scheduling, including QEND activities up to four (4) times per fiscal year, providing incident leadership, stakeholder communications, and post-incident follow-up.

• Participate in an on-call rotation and provide after-hours/weekend support as needed to maintain scheduling availability and meet business SLAs.

• Support a global operating model by working flexibly across EMEA and US business hours to provide required coverage and stakeholder overlap.

• Serve as escalation point for complex incidents; lead root-cause analysis and drive problem management to prevent recurrence.

• Mentor and guide engineers; lead technical design reviews, documentation/runbook standards, and knowledge sharing across the organization.

• Develop in-depth knowledge of the other job scheduling teams in IT Operations (such as Automate, AS400, and Robot) and assist in supervising them.

Required Qualifications

• High school diploma or equivalent.

• 10+ years of experience in enterprise workload automation, including 7+ years of hands-on IBM TWS/IWS/IWA administration in distributed environments.

• Bachelor’s degree or 10+ years of equivalent IT industry experience.

• For senior/lead equivalent roles, 8+ years of relevant ITSM/major incident operations experience may be required.

• IT certification is a plus.

• Proven experience in a lead/architect capacity: defining platform standards/reference designs, guiding cross-team implementations, and making architecture decisions for reliability, scalability, and security.

• Strong Linux/UNIX engineering and production troubleshooting experience, including performance and availability triage.

• Advanced automation/scripting skills (shell plus Python and/or Perl) with experience building supported, maintainable operational tooling.

• Demonstrated ability to lead complex incident response and root-cause analysis, and to drive preventative action through problem management.

• Strong change leadership in regulated production environments (planning, risk management, implementation, validation, rollback) aligned with ITIL processes.

• Excellent stakeholder communication and ability to influence across applications, database, infrastructure, and security teams.

Preferred Qualifications

• DB2 administration experience, including High Availability Disaster Recovery (HADR); familiarity with Oracle/Postgres and SQL.

• Experience with TWS/IWS integrations and APIs (REST/SOAP), event-based scheduling, and real-time/on-demand workload patterns.

• Experience with Tivoli Dynamic Workload Console (TDWC), Tivoli Dynamic Workload Broker (TDWB), and critical path monitoring.

• Experience integrating file transfer solutions (e.g., SFTP/PGP/GPG, managed file transfer platforms) into batch workflows.

• Experience with SAP and other enterprise application integrations via TWS extended agents.

• Experience building dashboards/metrics and integrating with observability platforms (e.g., Grafana/Graphite).

• Experience defining platform standards, leading upgrades/migrations, and coordinating cross-team delivery (e.g., change windows, cutovers, rollback planning).

• Familiarity with cloud patterns and automation (e.g., infrastructure-as-code concepts, container/VM scheduling considerations) in support of workload modernization.

• Hands-on experience across ITSM processes (Incident, Problem, Change, Knowledge) in an enterprise environment.

• ServiceNow experience, including incident lifecycle management, documentation standards, and reporting.

• Working knowledge of ITIL concepts and IT service management best practices.

• Artificial Intelligence: ability to navigate AI applications, communicate effectively with them, and recognize when not to use them because the output does not meet your or the company’s expectations.

• Strong analytical and problem-solving skills to investigate issues and drive resolution.

• Ability to manage multiple tasks in a high-volume, high-urgency operations environment.

• Strong written and verbal communication skills, including confident facilitation on conference bridges.

• Able to write and review technical documentation and knowledge articles.

Skills & Tools

• Workload Automation: IBM TWS/IWS/IWA, TDWC/TDWB, dynamic scheduling, JSDL

• Operating Systems: Linux, UNIX (AIX/SunOS), Windows (agent support)

• Databases: DB2 (HADR), Oracle/Postgres (familiarity)

• Scripting: Shell, Python, Perl

• ITSM/Monitoring: ITIL processes; integrations with tools such as ServiceNow, AppDynamics, OBM, Grafana/Graphite

• Security: LDAP/SSO concepts, role-based access, audit/patch compliance
