Machine Learning Operations Engineer

Smart IT Frame LLC
Dallas, TX

Job Title: Machine Learning Operations Engineer

Location: Dallas, TX or Pittsburgh, PA





Skills:

  • Apache Kafka
  • Hadoop Hive
  • Pandas
  • Python

Position Description:

  • We are seeking an experienced MLOps Engineer with strong expertise in Python and big data technologies to join our team. This role focuses on operational excellence, including optimizing feature engineering pipelines and maintaining machine learning models in production environments. The desired candidate will work closely with platform and data science teams to ensure scalable, reliable, and high-performance ML workflows using existing frameworks.
  • This position will be performed onsite five days a week from any of our client sites in Dallas, TX/Strongsville, OH/Pittsburgh, PA.

Future Duties and Responsibilities:

  • Optimize and maintain large-scale feature engineering pipelines using PySpark, Pandas, and PyArrow on Hadoop-based infrastructure.
  • Refactor and modularize ML codebases to enhance reusability, maintainability, and performance.
  • Collaborate with platform teams on compute capacity planning, resource allocation, and system upgrades.
  • Integrate with existing model serving frameworks to support testing, deployment, and rollback processes.
  • Monitor and troubleshoot production ML pipelines, ensuring high reliability, low latency, and cost efficiency.
  • Contribute to internal ML platforms by sharing insights, proposing improvements, and documenting best practices.
  • Build near real-time ML pipelines using Kafka and Spark Streaming.
  • Work with the AWS and SageMaker MLOps ecosystem.

Required Qualifications to be Successful in this role:

  • 6+ years of experience in software engineering, data engineering, or MLOps roles.
  • Strong programming expertise in Python, with hands-on experience in Pandas, PySpark, and PyArrow.
  • Deep understanding of the Hadoop ecosystem, distributed computing, and performance tuning.
  • Experience with CI/CD pipelines and best practices in ML environments.
  • Hands-on experience with monitoring tools for ML pipeline health and performance.
  • Strong collaboration skills with experience working in cross-functional teams (platform, data science, engineering).
  • Experience contributing to or building internal MLOps frameworks/platforms.
  • Familiarity with SLURM clusters or other distributed job schedulers.
  • Exposure to Kafka, Spark Streaming, or other real-time data processing technologies.
  • Understanding of ML lifecycle management, including versioning, deployment, and drift detection.