Company hidden
2 days ago

Network Reliability Engineer (AI)

$210,000 – $240,000
Work format
On-site
Employment type
Full-time
Grade
Senior
English
B2
Country
US
Vacancy from Hirify Global, a list of international tech companies

Job description

TL;DR

Network Reliability Engineer (AI): Design and operate the global network and reliability layer for a high-performance private supercomputer, with an emphasis on distributed compute, ML workloads, and real-time analytics. The role focuses on building scalable network architecture, automating infrastructure, and keeping mission-critical systems reliable.

Location: Must be based in San Francisco, California (On-site)

Salary: $210,000 – $240,000

Company

hirify.global is a pioneering Causal AI platform helping Fortune 100 enterprises prove business outcomes using trusted, causal evidence.

What you will do

  • Architect and operate scalable, secure network architecture for large-scale machine learning workloads.
  • Own network device configuration management end to end to ensure consistency and reliability.
  • Improve system and network performance through automation, observability, and proactive capacity planning.
  • Implement and manage complex network protocols including BGP, VPNs, and external peering.
  • Build and maintain comprehensive monitoring, alerting, and incident response systems.
  • Partner across engineering and data science to drive a culture of performance and reliability.

Requirements

  • 8+ years in network or infrastructure engineering, with 5+ years in datacenter operations.
  • Extensive hands-on experience with network devices (firewalls, switches, load balancers) and protocols like BGP, QoS, MPLS, and IPsec.
  • Experience designing and operating modern datacenter network fabrics (spine-leaf, EVPN/VXLAN, ECMP).
  • Proficiency in network automation and IaC tooling (Ansible, Terraform, Nornir) and IPAM/DCIM platforms.
  • Strong operational experience with Linux-based production infrastructure and Kubernetes networking.
  • Solid scripting skills in Python or Bash for debugging and automation.

Nice to have

  • Experience with NVIDIA networking technologies (Cumulus Linux, InfiniBand, Spectrum-X, BlueField DPUs).
  • Familiarity with data-intensive platforms like Spark, Airflow, or Kafka.
  • Experience with storage networking protocols and file systems such as NFS, Lustre, or iSCSI.
  • Background in high-compliance or SOC 2 environments.

Culture & Benefits

  • Work on cutting-edge infrastructure including one of the world's fastest private supercomputers.
  • High-impact role with ownership over architecture decisions for Fortune 100-scale systems.
  • Generous equity program to ensure meaningful ownership.
  • Transparent compensation philosophy based on real-time market data.
  • Collaborative environment with top-tier engineering talent.
