TL;DR
SRE / MLOps Engineer (Ray.io): Maintain, deploy, and improve AI/ML platform services built on Ray.io, with an emphasis on DevOps practices and automation. The focus is on keeping distributed ML systems robust, scalable, and highly available.
Location: Remote
Company
hirify.global is a software development services company that helps businesses worldwide build successful software products.
What you will do
- Design, implement, and maintain CI/CD pipelines for AI/ML platform services.
- Manage and troubleshoot Kubernetes clusters, Docker containers, and cloud infrastructure.
- Ensure high availability (99.999%), system reliability, and security across platforms.
- Automate operational tasks, monitoring, and deployment workflows.
- Deploy and maintain Ray.io clusters, ensuring reliable workload scheduling and distributed job execution.
- Collaborate with developers to integrate distributed ML pipelines into automated CI/CD workflows.
Requirements
- Strong Python and C++ development experience (2–4 years).
- Hands-on experience with Ray.io: cluster deployment, workload management, distributed task scheduling.
- Familiarity with Ray ecosystem libraries and integration with ML tooling.
- Solid understanding of Kubernetes, Docker, Linux fundamentals, and DevOps practices.
- Fluent in English (spoken and written).