Назад
Company hidden
4 дня назад

Senior/Staff Site Reliability Engineer (AI)

180 000 - 250 000$
Формат работы
onsite
Тип работы
fulltime
Грейд
senior/lead
Английский
b2
Страна
US
Релокация
US
Вакансия из списка Hirify.GlobalВакансия из Hirify Global, списка международных tech-компаний
Для мэтча и отклика нужен Plus

Мэтч & Сопровод

Для мэтча с этой вакансией нужен Plus

Описание вакансии

Текст:
/

TL;DR

Senior/Staff Site Reliability Engineer (AI): Managing and operating Kubernetes infrastructure at scale with an accent on AI-driven automation, networking, and deployment reliability. Focus on building CI/CD pipelines, optimizing cluster performance, and driving system-wide reliability through infrastructure-as-code and advanced monitoring.

Location: Onsite in San Francisco, CA. Visa sponsorship and relocation to San Francisco provided.

Salary: $180,000–$250,000 plus equity.

Company

An innovative technology company building AI-driven infrastructure and development tools.

What you will do

  • Own and operate Kubernetes infrastructure, including lifecycle management, upgrades, and networking.
  • Build and maintain robust CI/CD pipelines and deployment infrastructure.
  • Leverage AI to automate production issue resolution and enhance software reliability.
  • Define SLOs and develop incident response processes.
  • Manage load balancing, service mesh configurations, and anomaly detection systems.

Requirements

  • 5+ years of experience in managing critical production systems and software development workflows.
  • Strong production experience with Kubernetes at scale and infrastructure-as-code (Terraform, Ansible).
  • Deep knowledge of Linux networking, container networking (CNI, VXLAN, BGP), and DNS.
  • Proficiency in Python and either Go or Bash for automation.
  • Experience with logging, monitoring, and alerting (Prometheus, Grafana, Datadog).
  • Excellent communication skills and ability to drive technical decisions across teams.

Nice to have

  • Experience with GPU and AI/ML workload management.
  • Knowledge of kernel-based monitoring like eBPF and XDP.
  • Experience with security tooling (hirify.globalco, SIEM) and distributed storage systems like Ceph.

Culture & Benefits

  • Comprehensive health, dental, and vision insurance coverage.
  • Equity package as part of compensation.
  • Opportunities for professional learning and growth.
  • Regular team events and offsites to build culture.
  • Visa sponsorship and relocation assistance provided.

Будьте осторожны: если работодатель просит войти в их систему, используя iCloud/Google, прислать код/пароль, запустить код/ПО, не делайте этого - это мошенники. Обязательно жмите "Пожаловаться" или пишите в поддержку. Подробнее в гайде →