TL;DR
Machine Learning Engineer (AI): Lead the design and development of low-latency Algo inference services and scale robust real-time decisioning engines, with an emphasis on seamless model deployment, versioning, and A/B testing at runtime. Focus on continuously optimizing latency, throughput, and cost-efficiency in high-performance ML inference systems.
Location: Remote
Company
hirify.global is a Software Development Company specializing in various IT solutions.
What you will do
- Lead the design and development of low-latency Algo inference services handling billions of requests per day.
- Build and scale robust real-time decisioning engines, integrating ML models with business logic.
- Collaborate closely with Data Science to ensure reliable and seamless model deployment into production.
- Design systems for model versioning, shadowing, and A/B testing at runtime.
- Guarantee system scalability, high availability, and comprehensive observability.
- Continuously optimize latency, throughput, and cost-efficiency using modern tooling and techniques.
Requirements
- 5+ years in high-performance backend/ML inference systems.
- Expertise in Python, low-latency APIs, and real-time serving frameworks (FastAPI, Triton Inference Server, TorchServe, BentoML).
- Experience with scalable service architecture, message queues (Kafka, Pub/Sub), and async processing.
- Deep understanding of model deployment practices, online/offline feature parity, and real-time monitoring.
- Experience in cloud environments (AWS, GCP, or OCI) and container orchestration (Kubernetes).
- Experience working with in-memory and NoSQL databases (e.g., Aerospike, Redis, Bigtable).
- Experience with observability tools (e.g., Prometheus, Grafana, OpenTelemetry) for monitoring, alerting, and troubleshooting.
- B.Sc. or M.Sc. in Computer Science, Software Engineering, or a related field.
Culture & Benefits
- Remote work options available.
- Operate autonomously and collaborate effectively with cross-functional teams.