Company hidden
Updated 7 days ago

Senior ML Platform / ML Infrastructure Engineer II (AI)

Work format
hybrid
Employment type
full-time
Grade
senior
English
B2
Country
Canada
Vacancy from Hirify Global, a list of international tech companies

Job description

TL;DR

Senior ML Platform Engineer (AI): Design, build, and operate production-grade ML/data platforms with an emphasis on serving, reliability, and developer experience. The role focuses on owning real-time and batch inference on SageMaker, implementing ultra-low-latency serving patterns with Redis/Valkey, and establishing model lifecycle governance.

Location: Hybrid model in Canada (Toronto or Montreal), with 2 days/week in-office attendance required.

Company

hirify.global is the #1 loyalty app for mobile gamers, rewarding users for discovering and playing new mobile games.

What you will do

  • Design, build, and operate standardized training-to-serving pipelines with Airflow that deploy to SageMaker endpoints.
  • Manage real-time and batch inference on SageMaker, including multi-model endpoints, autoscaling, and deployment strategies.
  • Implement ultra-low-latency serving patterns with Redis/Valkey for feature caching and online retrieval.
  • Provision and manage ML/data infrastructure using Terraform for AWS resources and observability stacks.
  • Establish and manage model lifecycle governance, including registries, approval workflows, and audit trails.
  • Implement end-to-end observability for ML workflows, covering data freshness, drift/quality, and performance SLOs.

Requirements

  • 5+ years of experience building and operating production-grade ML/data platforms.
  • Strong software engineering skills in Python, Go, or Java, building resilient services and APIs.
  • Deep experience with AWS SageMaker inference (endpoint configuration, containerization, autoscaling).
  • Expertise with Redis/Valkey as an online feature store in ML serving contexts.
  • Proven Terraform experience managing ML and data infrastructure end-to-end.
  • Experience with Airflow orchestration at scale, including dependency modeling, sensors, and integrations.

Nice to have

  • Familiarity with ML frameworks (scikit-learn, XGBoost, PyTorch, TensorFlow) from a platform-integration perspective.
  • Knowledge of GitOps patterns.

Culture & Benefits

  • Inviting and fun work environment with team lunches, game nights, and company-wide events.
  • Culture rooted in growth, supported by smart, dynamic, and enthusiastic people.
  • Data-driven approach to learning, improving, and adapting.
  • Environment encouraging idea sharing, pushing boundaries, and calculated risks.
