Company hidden

3 months ago

Tech Lead (Network Observability)

$180,000 - $260,000

Work format

onsite

Employment type

fulltime

Grade

lead

English

Country

Listing from Hirify Global, a list of international tech companies

Job description

TL;DR

Tech Lead (Network Observability): Leading the architecture and development of a high-performance network monitoring platform for RDMA, RoCE, and InfiniBand networks, with an emphasis on cross-stack observability and telemetry pipelines. Focus on optimizing low-latency data collection, designing scalable backend services, and troubleshooting complex network layers for AI GPU clusters.

Location: Onsite in Palo Alto, California

Salary: $180,000 - $260,000

Company

hirify.global is pioneering software-driven AI fabrics to increase GPU cluster utilization through cross-stack observability and performance acceleration.

What you will do

Lead the architecture, design, and development of scalable network monitoring platforms for RDMA, RoCE, InfiniBand, and TCP/IP infrastructure.
Build backend telemetry services, observability dashboards, alerts, diagnostics, and anomaly detection workflows.
Troubleshoot complex production issues across application, OS, server, and network layers.
Establish engineering standards, drive automation, and define technical roadmaps with cross-functional teams.
Mentor engineers on distributed systems and high-performance networking best practices.

Requirements

Degree in Computer Science, Computer Engineering, Electrical Engineering, or a related field.
Strong programming experience in C++, Go, Python, or Rust.
Proven experience leading engineering teams or complex infrastructure projects.
Hands-on experience with RDMA, RoCE, InfiniBand, and Linux networking (TCP/IP, routing, congestion control).
Experience with monitoring and visualization tools such as Prometheus, Grafana, Datadog, or OpenTelemetry.
Must be able to work onsite in Palo Alto, California.

Nice to have

Experience supporting AI/ML, HPC, or GPU cluster infrastructure workloads.
Knowledge of eBPF, XDP, DPDK, or Linux kernel networking tools (tcpdump, Wireshark, ethtool).
Experience with Kubernetes, cloud infrastructure (AWS, GCP, Azure), and infrastructure automation.
Experience designing time-series data systems and high-cardinality telemetry platforms.

Culture & Benefits

Competitive compensation and eligibility for the company's equity program.
Catered lunch.
Friendly and inclusive workplace culture.
Comprehensive benefits package.


