Posted 4 hours ago

Staff Engineer, Network Observability

$207,000–$275,000
Job type
Full-time
Grade
Lead
English
B2
Country
US
Job from Hirify Global, a list of international tech companies

Job description

TL;DR

Staff Engineer, Network Observability: Define and evolve the technical direction for network observability, building resilient telemetry systems with an emphasis on scalable collectors, persistence, and alerting across logs, metrics, events, and flows. The role focuses on leading cross-team standardization, making high-leverage architectural tradeoffs for reliability and incident response, and mentoring engineers while serving as a senior escalation point.

Location: Livingston, NJ / New York, NY / Sunnyvale, CA / Bellevue, WA

Salary: $207,000–$275,000 (base)

Company

CoreWeave provides cloud infrastructure and tools for building and scaling AI.

What you will do

  • Set technical direction for network observability across multiple teams, aligning platform, data models, and telemetry strategy with long-term goals.
  • Design and evolve scalable observability solutions using collectors (e.g., gNMI, SNMP, Prometheus scraping, OpenTelemetry), persistence (e.g., Loki, ClickHouse), and visualization/alerting (e.g., Grafana, Alertmanager) with a focus on reliability and future scale.
  • Standardize observability patterns and improve signal quality across logs, metrics, events, flows, and related diagnostics.
  • Lead high-leverage technical tradeoffs with engineering leadership to improve resilience, scalability, and operator efficiency.
  • Serve as a go-to expert for critical observability challenges and coordinate during incidents.
  • Mentor engineers through technical reviews and design guidance; participate in RFCs and architectural decisions; join rotating on-call as a senior escalation point.

Requirements

  • Deep expertise building flexible network observability solutions across collectors, distribution, processing, persistence, alerting, analytics, and visualization.
  • Experience as a Network Engineer, SRE, Software Engineer, or Systems Engineer in large-scale environments, with a track record operating observability or infrastructure platforms for multiple teams.
  • Proven ability to lead through ambiguity and make sound architectural and operational tradeoffs balancing near-term needs and long-term maintainability.
  • Strong systems thinking and practical experience designing resilient, scalable solutions that improve visibility and incident response.
  • Proficiency with Python, Go, and Bash; familiarity with configuration management and templating (e.g., Ansible, Jinja2).
  • Hands-on Linux and IP networking knowledge, including routing/switching and network troubleshooting; experience with networking platforms such as SONiC, HPE Junos, NVIDIA Cumulus Linux, Nokia SR OS, or SR Linux.

Nice to have

  • Experience applying machine learning techniques/tools to proactively detect performance or security anomalies in network traffic.
  • Experience with OpenTelemetry, Jaeger, Zipkin, or similar end-to-end tracing tooling.
  • Experience shaping technical roadmaps and leading platform investments that improved reliability or scalability across multiple teams.
  • Network certifications such as CCNA, CCNP, or similar.

Culture & Benefits

  • Medical, dental, and vision insurance fully paid by CoreWeave; company-paid life insurance; disability coverage.
  • 401(k) with generous employer match; Flexible PTO; tuition reimbursement; ESPP participation.
  • Health Savings Account and Flexible Spending Account; mental wellness benefits via Spring Health.
  • Paid parental leave and family-forming support; childcare support with Kinside.
  • Flexible, casual work environment with a culture focused on innovative disruption.

Hiring process

  • Interviews and technical evaluation focused on observability/networking expertise and architectural judgment.
  • Discussion of role fit, experience alignment, and collaboration/mentorship approach.

Be careful: if an employer asks you to log in to their system using iCloud/Google, send a code or password, or run code or software, do not do it; these are scammers. Report the listing or contact support.