Company hidden
Updated 22 days ago

Senior Software Engineer II (AI)

$165,000–$242,000
Work format: hybrid
Employment type: full-time
Grade: senior
English: B2
Country: US

Job description

TL;DR

Senior Software Engineer II (AI): Build and optimize a Kubernetes-native inference platform, with an emphasis on improving latency, throughput, and reliability. The role centers on designing scalable distributed systems, implementing advanced optimizations, and meeting strict P99 SLAs at scale.

Location: Hybrid in Sunnyvale, CA or Bellevue, WA; remote work is considered for candidates located more than 30 miles from an office.

Salary: $165,000–$242,000

Company

hirify.global is a publicly traded AI cloud platform company delivering superior infrastructure performance and technical expertise to accelerate AI innovation.

What you will do

  • Lead design reviews and drive architecture across multiple services and teams.
  • Define and own SLIs/SLOs and improve reliability through post-incident actions.
  • Implement advanced optimizations such as micro-batch schedulers and speculative decoding.
  • Strengthen incident response, including capacity planning and autoscaling policies.
  • Mentor junior engineers and elevate coding and testing standards.
  • Own key areas like request routing, adaptive scheduling, and GPU resource isolation.

Requirements

  • Must be a U.S. person or eligible to access export-controlled information per U.S. Government regulations.
  • 5–8 years of experience building distributed systems or cloud services.
  • Strong coding skills in Python or Go; familiarity with networked systems and performance.
  • Hands-on experience with Kubernetes at production scale, CI/CD, and observability tools.
  • Practical knowledge of inference internals, including batching, caching, and mixed precision.
  • Proven track record of improving tail latency and service reliability.

Nice to have

  • Contributions to inference frameworks like vLLM, Triton, or TorchServe.
  • Experience with CUDA kernels, NCCL/SHARP, RDMA/NUMA, or GPU interconnect topologies.
  • Experience leading multi-team initiatives or customer partnerships on critical launches.

Culture & Benefits

  • Comprehensive medical, dental, vision, and life insurance, fully paid by the employer.
  • Flexible PTO, paid parental leave, and family-forming support.
  • 401(k) with employer match and employee stock purchase program.
  • Flexible, hybrid-first work environment, with remote options for candidates located far from an office.
  • Casual work environment with catered lunches and a culture focused on innovation.
