Computer Vision Engineer

Stem IT
Ashburn, VA

We are seeking a mission-driven Senior Computer Vision Engineer to support the design and implementation of a high-impact YOLO-based model. In this role, you will join a multidisciplinary team building a system to detect, assess, and track drugs and weapons being trafficked into the United States. This position combines technical innovation with national security, enabling data-driven insights that will keep our country safer.


This is a full-time, long-term initiative. The role is hybrid, based in Ashburn, VA.


Key Responsibilities:

  • Model Training & Tuning: Develop and fine-tune YOLO architectures for maximum accuracy (mAP) and efficiency.
  • Video Tracking: Implement robust single-camera tracking using algorithms like ByteTrack, BoT-SORT, or DeepSORT.
  • Multi-Camera Re-ID: Architect systems for Multi-Target Multi-Camera Tracking (MTMCT) to maintain persistent object IDs across non-overlapping camera views.
  • Condition Optimization: Use advanced data augmentation and domain adaptation techniques to ensure models perform reliably across diverse weather and lighting conditions.
  • Performance Engineering: Optimize model inference using TensorRT, OpenVINO, or ONNX Runtime to achieve real-time performance on target hardware.
  • Pipeline Development: Build and maintain automated MLOps pipelines for data labeling, model validation, and deployment.


Required Skills & Qualifications:

  • Clearance: Must be able to obtain a TS/SCI clearance (client will sponsor).
  • Computer Vision Mastery: 3+ years of experience with object detection, segmentation, and tracking.
  • YOLO Expertise: Deep familiarity with the Ultralytics ecosystem and recent YOLO iterations (YOLOv8, YOLOv10, YOLO11, or 2026 research models).
  • Programming: High proficiency in Python and C++, with experience in PyTorch or TensorFlow.
  • Libraries: Hands-on experience with OpenCV, filter-based tracking (Kalman filters), and Re-ID feature embedding libraries.
  • Data Engineering: Proficiency in building custom datasets, handling class imbalance, and implementing synthetic data generation.
  • Hardware: Experience deploying on NVIDIA Jetson/Orin, GPUs, or mobile edge devices.


Preferred Qualifications:

  • Degree in Computer Science or a related field.
  • Experience with spatio-temporal constraints for camera-to-camera hand-off logic.
  • Knowledge of GStreamer or NVIDIA DeepStream for high-throughput video pipelines.