MLOps Engineer

ALIS Software LLC
New York, NY


Role: MLOps Engineer

Location: New York, NY

Hybrid: 3 Days Onsite

Duration: Long term


Roles & Responsibilities

  • Design, deploy, and operate end‑to‑end production ML pipelines across Dev, QA, and Prod environments.
  • Set up and manage AWS SageMaker pipelines, endpoints, and monitoring for large‑scale inference workloads, including embedding generation, named entity recognition, reranking, and video processing.
  • Own GPU and CPU infrastructure selection, scaling, and optimization, including instance benchmarking, autoscaling behavior, and load testing.
  • Deploy, monitor, and operate inference services that support hundreds of thousands of queries per day across text, image, and video pipelines.
  • Establish standardized ML deployment patterns at AP, including:
      • Containerization and orchestration strategies
      • Environment isolation (Dev / QA / Prod)
      • Versioned promotion, rollback, and recovery mechanisms
  • Implement monitoring, alerting, drift detection, and evaluation metrics for production ML systems, tracking latency, error rates, throughput, and model/data drift.
  • Enable A/B testing and controlled rollout strategies for ML models in partnership with engineering and product teams.
  • Partner closely with ML Engineers, Data Scientists, DevOps, and Platform teams to:
      • Operationalize new models and pipeline improvements
      • Promote systems across environments safely
      • Ensure deployments meet reliability, scale, and cost targets
  • Manage high-throughput I/O and data movement for large collections of media assets (text, images, video), avoiding CPU, network, and storage bottlenecks.
  • Reduce operational risk by enforcing reproducibility, observability, security, and cost control across production ML systems.
  • This role owns:
      • Deployment, scaling, and runtime operation of ML systems
      • ML infrastructure configuration and orchestration
      • Monitoring, alerting, A/B testing infrastructure, and drift detection
      • Reliability, cost control, and production governance
  • This role does NOT own (these are owned by ML Engineers and Data Science):
      • Designing model architecture
      • Feature engineering or data science outputs
      • Model accuracy or inference logic

Required Skills & Experience

  • Hands‑on experience deploying and operating ML inference systems in production.
  • Experience with AWS SageMaker, including pipelines, endpoints, monitoring, and multi‑environment deployments.
  • Expertise deploying ML models using PyTorch and TensorFlow from an operational and serving perspective.
  • Proven experience with model deployment and orchestration, including containerized inference and autoscaling.
  • Experience selecting, evaluating, and optimizing compute resources (GPU/CPU) for production ML workloads.
  • Experience setting up monitoring, evaluation metrics, and A/B testing frameworks for ML systems in production.
  • Ability to collaborate effectively with ML Engineers, Data Scientists, and platform teams in a shared ownership model.

Strongly Preferred

  • Experience running ML workloads over large‑scale text, image, and video datasets.
  • Operational experience supporting ML systems involving Transformer‑based NLP models (e.g., BERT‑family models), computer vision models, and ranking and reranking systems.
  • Familiarity with operating systems that use common ML model types such as convolutional and feed‑forward neural networks, ranking algorithms, and Approximate Nearest Neighbor methods (HNSW).

