AI Solution Architect (AI)
Job description
TL;DR
AI Solution Architect (GPU Infrastructure): Design and deploy large-scale GPU clusters and production inference systems, with an emphasis on automation, infrastructure as code, and orchestration. The role focuses on building scalable containerized training pipelines and ensuring seamless deployment across thousands of GPUs.
Location: Poland, Serbia, Lithuania, Cyprus, Georgia. Remote, hybrid, or office options available depending on the role.
Company
Global provider of infrastructure and software solutions for AI, cloud, network, and security, powering digital experiences with a vast network of edge locations and GPUs.
What you will do
- Design end-to-end GPU cluster architectures for on-premises and cloud environments using Ansible, Terraform, Kubernetes, and Slurm.
- Develop and maintain Infrastructure as Code (IaC) modules to automate the provisioning, scaling, and monitoring of GPU resources.
- Lead technical deep-dives and workshops, and present architectural solutions to stakeholders.
- Collaborate with engineering and product teams to integrate customer insights into product enhancements.
- Produce technical whitepapers, runbooks, and training materials, and host webinars.
Requirements
- 3+ years of experience in Cloud or GPU AI Infrastructure DevOps.
- Proven track record of deploying large-scale, multi-node, multi-GPU clusters.
- Proficiency in Python or Go.
- Hands-on experience with Ansible, Terraform, Kubernetes (K8s), and Slurm.
- Solid understanding of ML ecosystems, tooling, and production deployment patterns.
- Excellent verbal and written communication skills for translating complex technical concepts.
Nice to have
- Experience deploying high-availability inference infrastructure for production AI workloads.
- Skill in optimizing distributed training and inference pipelines with MLflow, PyTorch, TensorFlow, or JAX.
- Familiarity with GitOps workflows, Docker, Helm charts, and CI/CD for ML.
- Knowledge of Hugging Face Transformers and scikit-learn.
Culture & Benefits
- Flexible working hours and choice of remote, hybrid, or office work.
- Travel perk: work from anywhere in the world for up to 45 days per year.
- Private medical insurance for employees and their families.
- Additional paid vacation and sick leave days.
- Professional development support, including language classes.
- Modern office space with free snacks, drinks, and team sports activities.