Company hidden
Updated 23 hours ago

Lead Systems Reliability Engineer (Linux & Distributed Systems)

Job type
Full-time
Grade
Lead
English
B2
Country
UK
Vacancy from Hirify RU Global, a list of companies with Eastern European roots

Job description

TL;DR

Lead Systems Reliability Engineer (Linux & Distributed Systems): Build and maintain a high-scale, data-driven advertising platform with an emphasis on performance engineering, hardware optimization, and distributed systems reliability. The role focuses on tuning Linux kernels, benchmarking next-generation hardware, and designing automation for stateful systems at massive scale.

Location: London

Company

The world’s leading independent platform for digital advertising, helping brands reach audiences across the open internet.

What you will do

  • Lead engineering teams to plan and manage global work streams, systems, and data structures across cloud and traditional datacenters.
  • Design and improve infrastructure automation tailored for stateful systems at scale.
  • Own the operations for Linux-based systems running Aerospike, Kafka, and MongoDB.
  • Review new use cases and serve as a technical point of contact within an on-call rotation.
  • Benchmark and analyze next-generation hardware offerings to optimize system throughput.

Requirements

  • Deep expertise in the Linux operating system.
  • Proven leadership experience and the ability to mentor other engineers.
  • Advanced troubleshooting skills using the scientific method to isolate CPU and IO bottlenecks.
  • Must be based in London.

Nice to have

  • Experience with physical on-prem hardware internals and operations.
  • Background in performance testing and tuning.
  • Knowledge of relational or NoSQL databases.
  • Experience with Ansible, PyInfra, Chef, Prometheus, or Kubernetes.
  • Proficiency in Python, Ruby, Rust, Bash, Golang, or C#.

Culture & Benefits

  • Opportunity to work with bleeding-edge hardware, including nodes with 300TB NVMe and 512 cores.
  • Direct collaboration with major vendors like AMD to run PoCs and optimize technology.
  • Inclusive and diverse global team environment that encourages curiosity and critical thinking.
  • Work on systems at massive scale, processing over 5 million queries per second (QPS) per node.
