Senior DevOps Engineer (AI)
Job description
TL;DR
Senior DevOps Engineer (AI): Ensuring the reliability and scalability of high-load production systems with an emphasis on Kubernetes orchestration, monitoring, and infrastructure automation. Focus on building robust SLIs/SLOs, optimizing cloud resources across GCP/AWS, and supporting complex on-prem deployments for generative AI products.
Company
The company builds award-winning AI products, including voice assistants and agentic architectures, with a focus on privacy and on-premises deployment.
What you will do
- Manage system reliability by defining SLIs/SLOs and eliminating performance bottlenecks.
- Design and maintain comprehensive monitoring, alerting, and Grafana dashboards.
- Conduct load testing and capacity planning to ensure system scalability.
- Lead incident investigations, on-call rotations, and postmortem processes.
- Develop and maintain Kubernetes-based infrastructure on GCP and AWS.
- Collaborate with developers to improve CI/CD pipelines and support on-prem customer deployments.
Requirements
- 5+ years of experience in SRE or DevOps roles managing high-load production systems.
- Deep practical expertise in Docker and Kubernetes in production environments.
- Strong proficiency in Prometheus, Alertmanager, and Grafana.
- Solid experience with Python for automation and tooling.
- Operational knowledge of GCP or AWS, Linux, and networking.
- Must be based in Europe for this fully remote role.
Nice to have
- Experience with GPU/ML serving technologies like Triton, vLLM, or Run:ai.
- Knowledge of real-time telephony (SIP, WebRTC) or streaming data (Kafka, ClickHouse).
- Deep experience with GitOps practices (ArgoCD) and secure environment deployments.
Culture & Benefits
- Fully remote work environment across Europe.
- High engineering standards with a focus on production-grade solutions.
- 21 days of vacation plus public holidays and 5 sick days.
- Private English lessons provided via Preply.
- Fast-paced startup environment with enterprise-level stability and revenue.