Company hidden
2 days ago

Network Engineer, Capacity And Efficiency (AI)

$320,000 - $405,000
Work format
hybrid
Employment type
full-time
Grade
middle/senior
English
B2
Country
US

Job description

TL;DR

Network Engineer, Capacity And Efficiency (AI): Owns the cost, utilization, and attribution story for non-accelerator infrastructure, with an emphasis on the network, compute, and storage backbone. Focus on building the network observability stack, hunting for efficiency, and driving cost attribution.

Location: San Francisco, CA or New York City, NY. Currently, we expect all staff to be in one of our offices at least 25% of the time.

Salary: $320,000 - $405,000 USD

Company

hirify.global’s mission is to create reliable, interpretable, and steerable AI systems.

What you will do

  • Build the network observability stack by designing and deploying telemetry pipelines.
  • Analyze inter-region traffic patterns, identify hot links and stranded capacity, and quantify the dollar impact.
  • Design and operate traffic classification, marking, and shaping across the backbone.
  • Tie network spend back to the teams and workloads that generate it.
  • Partner across the company to influence teams to achieve outcomes.
  • Extend our intent-based network configuration systems and write the tooling that turns your efficiency findings into safe, reviewable, and impactful changes.

Requirements

  • Have 5+ years operating large-scale production networks.
  • Be genuinely fluent across the stack: BGP, ECMP, VXLAN/EVPN or equivalent overlays, QoS, and L1/optical basics.
  • Know at least one major CSP’s networking model deeply (AWS or GCP) and understand how their overlays interact with physical underlays.
  • Have built or operated network telemetry at scale.
  • Be comfortable writing Python or Go to build tooling, telemetry pipelines, infrastructure-as-code, and config management and automation for network devices, all of which you’ll ship to production.
  • Think quantitatively by default and communicate crisply.

Nice to have

  • SRE experience for large-scale network infrastructure.
  • Background on a cloud provider's networking team or a cloud networking product team.
  • Familiarity with AI/ML infrastructure traffic patterns.
  • Experience with HPC fabrics like InfiniBand, RoCE v2, lossless Ethernet, or custom high-radix topologies, and an understanding of how job placement, congestion management, and adaptive routing interact at scale.
  • Background in traffic engineering for large backbones and the operational judgment to know when TE is worth the complexity.
  • Hands-on time with multi-cloud connectivity.
  • Experience building cost/chargeback systems for shared infrastructure, or FinOps exposure in a large cloud environment.

Culture & Benefits

  • Competitive compensation and benefits.
  • Optional equity donation matching.
  • Generous vacation and parental leave.
  • Flexible working hours.
  • Lovely office space in which to collaborate with colleagues.

Hiring process

  • We encourage you to apply even if you do not believe you meet every single qualification.
  • We think AI systems like the ones we're building have enormous social and ethical implications.
  • We think this makes representation even more important, and we strive to include a range of diverse perspectives on our team.
