Infrastructure Engineer (AI)
Job description
TL;DR
Infrastructure Engineer (AI/SRE): Build and ship agentic automation and internal tools that solve complex infrastructure problems, with an emphasis on AI-powered workflows and platform reliability. The role focuses on designing agentic incident response, managing EKS clusters, and optimizing a global bare-metal CDN.
Location: Hybrid in London (2-3 days/week). Remote considered for strong candidates outside London.
Company
A global streaming service, production company, and film distributor dedicated to elevating great cinema.
What you will do
- Design and build agentic workflows for infrastructure operations, including incident response, observability, and support ticket triage.
- Build and maintain MCP servers and AI-powered automation that connects various infrastructure systems.
- Design, run, and evolve EKS clusters as the primary platform for all services.
- Operate and extend an in-house bare-metal CDN spanning multiple global locations and manage AWS infrastructure via Terraform and Chef.
- Build CI/CD pipelines (Jenkins, ArgoCD, Helm) and Kubernetes operators to automate platform workflows.
- Define and uphold SLO commitments to ensure system availability and reliability.
Requirements
- 5+ years of experience in infrastructure, platform engineering, or DevOps.
- Strong software engineering skills with a track record of building and shipping tools in the infrastructure space.
- Deep expertise in Kubernetes internals, Linux, and networking, including bare-metal server automation.
- Proficiency with Infrastructure as Code (Terraform) and distributed systems.
- Security-minded approach to integrating LLM APIs and agentic tools into infrastructure.
- Technical leadership experience in architecting solutions and mentoring other engineers.
Nice to have
- Experience building Kubernetes operators, controllers, or custom platform abstractions.
- Background in backend or full-stack development (Ruby, Python, or Go).
- Experience with Kafka, event-driven architectures, or data pipeline design.
- Experience building with LLM APIs, MCP, or agentic frameworks.
- Domain knowledge in video streaming.
Culture & Benefits
- Small team environment with high autonomy and direct impact.
- SRE-driven culture focused on eliminating toil through automation.
- Flexible hybrid work model with potential for remote arrangements for exceptional talent.
- Inclusive environment committed to diversity and equal opportunity employment.