Company hidden
6 days ago

Senior AI Infrastructure Engineer (AI)

€66,500 - €104,500
Work format
remote (Europe only)
Employment type
full-time
Seniority
senior
English
B2
Country
Portugal/Germany
Vacancy from Hirify.Global, a list of international tech companies

Job description

TL;DR

Senior AI Infrastructure Engineer (AI): You will own the infrastructure that brings AI models to life in production, optimizing LLM inference, deploying real-time voice AI agents, and scaling GPU clusters. Focus on inference optimization, real-time video processing, model serving at scale, and GPU workload orchestration.

Location: Europe. Must be based in Portugal or Germany.

Salary: €66,500 - €104,500 a year

Company

hirify.global is shifting healthcare from human-first to AI-first through its AI Care platform, making world-class healthcare available anytime, anywhere.

What you will do

  • Design, build, and maintain the inference infrastructure that powers hirify.global's AI products, ensuring high throughput, low latency, and cost efficiency.
  • Own the end-to-end deployment pipeline for AI models - from real-time computer vision to large language models.
  • Architect and scale Kubernetes clusters for GPU-accelerated workloads, including autoscaling strategies and resource scheduling.
  • Build and operate the infrastructure behind hirify.global's real-time AI agents, including WebRTC cluster provisioning.
  • Drive inference scaling strategies and evaluate emerging AI infrastructure tools to keep hirify.global at the cutting edge.
  • Collaborate with ML Engineers, Data Scientists, and Product teams to translate model requirements into robust, production-ready infrastructure.
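
As an illustration of the kind of GPU workload orchestration this role covers, here is a minimal sketch of a Kubernetes Deployment manifest for a GPU-backed inference service, built as a plain Python dict. All names (image, labels, service name) are hypothetical placeholders, not part of the vacancy.

```python
# Sketch: a Kubernetes Deployment manifest for a GPU inference service.
# All names (image, labels) are hypothetical placeholders.

def gpu_inference_deployment(name: str, image: str,
                             gpus: int = 1, replicas: int = 2) -> dict:
    """Build a Deployment manifest that schedules onto GPU nodes."""
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": {"app": name}},
            "template": {
                "metadata": {"labels": {"app": name}},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": image,
                        # The NVIDIA device plugin exposes GPUs as the
                        # extended resource "nvidia.com/gpu".
                        "resources": {"limits": {"nvidia.com/gpu": gpus}},
                    }],
                    # Tolerate the taint commonly placed on GPU node pools.
                    "tolerations": [{
                        "key": "nvidia.com/gpu",
                        "operator": "Exists",
                        "effect": "NoSchedule",
                    }],
                },
            },
        },
    }

manifest = gpu_inference_deployment("llm-server",
                                    "example.registry/llm-server:latest")
```

In practice a manifest like this would be serialized to YAML and managed through the IaC/GitOps tooling the posting mentions; an autoscaler (e.g. on queue depth or GPU utilization) would adjust `replicas`.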

Requirements

  • 5+ years of experience in infrastructure engineering, with at least 2 years focused on AI/ML workloads in production environments.
  • Strong experience with Kubernetes for orchestrating GPU-accelerated workloads.
  • Hands-on experience with model serving and inference optimization frameworks for both real-time computer vision and large language model workloads.
  • Solid understanding of LLM inference optimization techniques.
  • Experience with Infrastructure as Code (Terraform or similar) and GitOps methodologies for managing complex, GPU-enabled environments.
  • Fluent in English (written and oral).
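
One concrete example of the LLM inference-optimization knowledge the requirements refer to is reasoning about KV-cache memory, which often dominates GPU memory at high batch sizes. A back-of-the-envelope sketch (the model dimensions below are those of a Llama-2-7B-shaped model, used purely as an example):

```python
def kv_cache_bytes_per_token(n_layers: int, n_kv_heads: int,
                             head_dim: int, bytes_per_value: int = 2) -> int:
    """KV-cache bytes one token occupies: keys + values across all layers."""
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_value

# Example: 32 layers, 32 KV heads, head_dim 128, fp16 (2 bytes per value).
per_token = kv_cache_bytes_per_token(32, 32, 128, 2)  # 524,288 B = 512 KiB
per_seq = per_token * 4096                            # full 4K context
print(per_token, per_seq / 2**30)  # → 524288 2.0 (GiB per sequence)
```

Numbers like these motivate techniques such as paged KV-cache management (as in vLLM), grouped-query attention, and KV-cache quantization, which trade cache size against throughput and quality.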

Nice to have

  • Experience with LLM serving engines such as vLLM, SGLang, or LLM-D.
  • Experience with NVIDIA Triton Inference Server and TensorRT for real-time computer vision workloads.
  • Experience with Istio or similar service mesh.
  • Experience provisioning infrastructure on AWS, Azure, or GCP.

Culture & Benefits

  • A stimulating, fast-paced environment with lots of room for creativity.
  • Career development and growth, with a competitive salary.
  • A flexible environment where you can control your hours (remotely) with unlimited vacation.
  • Remote or Hybrid work policy.
