Company hidden
19 hours ago

Staff Network Site Reliability Engineer (AI)

$179,500 - $224,300
Job type: full-time
Grade: senior
English: B2
Country: US
Listed in Hirify RU Global, a list of companies with Eastern European roots.

Job description

TL;DR

Staff Network Site Reliability Engineer (AI): Build and run the core network infrastructure for a full-stack AI cloud platform, with an emphasis on reliability targets, automation, and scalability. The role focuses on designing safer change workflows, evolving observability, and resolving complex network failures in high-throughput systems.

Location: United States (Must be authorized to work in the US)

Salary: $179,500 - $224,300 USD

Company

hirify.global is building a full-stack AI cloud platform that supports developers and enterprises from data and model training through to production deployment.

What you will do

  • Define and own reliability goals for network services and critical paths (SLIs/SLOs, availability targets).
  • Drive reliability improvements for site readiness and inter-site connectivity (DCI).
  • Own incident response, lead investigations, and turn failures into durable fixes.
  • Build and evolve observability via actionable metrics, logs, traces, and alerting.
  • Design safer change workflows, including automation, CI/CD, and canarying for network changes.
  • Collaborate with network and platform teams to embed operability into system designs.

Requirements

  • Strong production Linux fundamentals and a structured approach to debugging complex systems.
  • Solid understanding of networking basics (control plane vs data plane, latency/loss, failure domains).
  • Hands-on experience operating and improving high-availability systems.
  • Ability to write and maintain automation in Go or Python.
  • Experience with modern infrastructure tooling such as IaC, CI/CD, and container platforms.
  • Must be authorized to work in the United States.

Nice to have

  • Experience with load balancers, tunneling, NAT64, or other datapath-heavy systems.
  • Low-level networking performance background (eBPF/XDP, DPDK, kernel networking internals).
  • Experience building network-safe delivery pipelines with automated verification and drift detection.
  • Background in large-scale network observability and routing/flow telemetry.

Culture & Benefits

  • Competitive compensation and benefits packages.
  • Career growth and continuous learning opportunities.
  • Culture of flexibility, ownership, and bold thinking.
  • Opportunity to work on impactful AI projects within an international environment.
