TL;DR
Senior ML Platform Engineer (AI): Designing, building, and operating production-grade ML/data platforms with an emphasis on serving, reliability, and developer experience. The role focuses on owning real-time and batch inference on SageMaker, implementing ultra-low-latency serving patterns with Redis/Valkey, and establishing model lifecycle governance.
Location: Hybrid model in Canada (Toronto or Montreal), with 2 days/week in-office attendance required.
Company
hirify.global is the #1 loyalty app for mobile gamers, rewarding users for discovering and playing new mobile games.
What you will do
- Design, build, and operate standardized training-to-serving pipelines in Airflow that deploy to SageMaker endpoints.
- Manage real-time and batch inference on SageMaker, including multi-model endpoints, autoscaling, and deployment strategies.
- Implement ultra-low-latency serving patterns with Redis/Valkey for feature caching and online retrieval.
- Provision and manage ML/data infrastructure using Terraform for AWS resources and observability stacks.
- Establish and manage model lifecycle governance, including registries, approval workflows, and audit trails.
- Implement end-to-end observability for ML workflows, covering data freshness, drift/quality, and performance SLOs.
Requirements
- 5+ years of experience building and operating production-grade ML/data platforms.
- Strong software engineering skills in Python, Go, or Java, building resilient services and APIs.
- Deep experience with AWS SageMaker inference (endpoint configuration, containerization, autoscaling).
- Expertise with Redis/Valkey used as an online feature store in ML serving contexts.
- Proven Terraform experience managing ML and data infrastructure end-to-end.
- Experience with Airflow orchestration at scale, including dependency modeling, sensors, and integrations.
Nice to have
- Familiarity with ML frameworks (scikit-learn, XGBoost, PyTorch, TensorFlow) from a platform-integration perspective.
- Knowledge of GitOps patterns.
Culture & Benefits
- Inviting and fun work environment with team lunches, game nights, and company-wide events.
- Culture rooted in growth, supported by smart, dynamic, and enthusiastic people.
- Data-driven approach to learning, improving, and adapting.
- Environment that encourages sharing ideas, pushing boundaries, and taking calculated risks.