Lead Systems Reliability Engineer (Linux & Distributed Systems)

Job type

fulltime

Grade

lead

English level

Country

Vacancy from Hirify RU Global, a list of companies with Eastern European roots

Job description

TL;DR

Lead Systems Reliability Engineer (Linux & Distributed Systems): Build and maintain a high-scale, data-driven advertising platform with an emphasis on performance engineering, hardware optimization, and distributed systems reliability. The role focuses on tuning Linux kernels, benchmarking next-generation hardware, and designing automation for stateful systems at massive scale.

Location: London

Company

The world’s leading independent platform for digital advertising, helping brands reach audiences across the open internet.

What you will do

Lead engineering teams to plan and manage global work streams, systems, and data structures across cloud and traditional datacenters.
Design and improve infrastructure automation tailored for stateful systems at scale.
Own the operations for Linux-based systems running Aerospike, Kafka, and MongoDB.
Review new use cases and serve as a technical point of contact within an on-call rotation.
Benchmark and analyze next-generation hardware offerings to optimize system throughput.

Requirements

Deep expertise in the Linux operating system.
Proven leadership experience and the ability to mentor other engineers.
Advanced troubleshooting skills using the scientific method to isolate CPU and IO bottlenecks.
Must be based in London.

Nice to have

Experience with physical on-prem hardware internals and operations.
Background in performance testing and tuning.
Knowledge of relational or NoSQL databases.
Experience with Ansible, PyInfra, Chef, Prometheus, or Kubernetes.
Proficiency in Python, Ruby, Rust, Bash, Golang, or C#.

Culture & Benefits

Opportunity to work with bleeding-edge hardware, including nodes with 300TB NVMe and 512 cores.
Direct collaboration with major vendors like AMD to run PoCs and optimize technology.
Inclusive and diverse global team environment that encourages curiosity and critical thinking.
Work on systems at massive scale, processing over 5 million queries per second (QPS) per node.
