Company hidden
1 day ago

AI Solution Architect (AI Infrastructure)

Work format
remote (Europe/Russia only)/hybrid
Employment type
full-time
Grade
senior
English
B2
Country
Serbia/Poland/Cyprus +1 more
Vacancy from Hirify Global, a list of international tech companies

Job description

TL;DR

AI Solution Architect (AI Infrastructure): Design and deploy large-scale GPU clusters, containerized training pipelines, and production inference systems, with an emphasis on automation, infrastructure as code, and orchestration. The role centers on architecting end-to-end GPU clusters and building scalable IaC modules for high-performance AI workloads.

Location: Hybrid or Remote (Poland, Serbia, Cyprus, Georgia)

Company

hirify.global is a global provider of infrastructure and software solutions for AI, cloud, network, and security, operating 210+ edge locations and 50+ cloud regions.

What you will do

  • Design end-to-end GPU cluster architectures (on-premises and cloud) using Ansible, Terraform, Kubernetes, and Slurm.
  • Lead technical deep-dives, conduct workshops, and present architectural solutions to stakeholders.
  • Build and maintain Infrastructure as Code (IaC) modules to automate provisioning and scaling of GPU resources.
  • Produce technical whitepapers, runbooks, and training materials for clients.
  • Partner with engineering and product teams to translate customer insights into product enhancements.

Requirements

  • 3+ years of DevOps experience with cloud or GPU-based AI infrastructure.
  • Proven track record of deploying multi-node, multi-GPU clusters at scale.
  • Hands-on expertise with Ansible, Terraform, Kubernetes (K8s), and Slurm.
  • Proficiency in Python or Go.
  • Solid understanding of ML ecosystems, models, and production deployment patterns.
  • Must be based in Poland, Serbia, Cyprus, or Georgia.

Nice to have

  • Experience deploying high-availability inference infrastructure for production AI workloads.
  • Ability to optimize distributed training pipelines using MLflow, PyTorch, TensorFlow, or JAX.
  • Familiarity with GitOps workflows, Docker, Helm charts, and CI/CD for ML.
  • Knowledge of Hugging Face Transformers and scikit-learn.

Culture & Benefits

  • Competitive compensation and flexible working hours.
  • Hybrid or remote work options depending on the role.
  • Possibility to work from anywhere in the world for up to 45 days per year.
  • Private medical insurance for employees and their families.
  • Extra paid vacation and sick leave days.
  • Language courses, team sports, and modern offices with snacks and drinks.
