Company hidden
6 days ago

Software Engineer, Compute Infra (AI)

Salary: $180,000 - $440,000 USD
Work format: onsite
Employment type: full-time
English: B2
Country: US

Job description

TL;DR

Software Engineer, Compute Infra (AI): Designing, building, and operating massive-scale clusters and orchestration platforms that power frontier AI training, inference, and agent workloads. Focus on achieving superior scalability, isolation, resource efficiency, and fault-tolerance compared to off-the-shelf solutions.

Location: Must be located in Palo Alto, CA

Salary: $180,000 - $440,000 USD

Company

hirify.global’s mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge.

What you will do

  • Build and manage massive-scale clusters to host, persist, train, and serve AI workloads with extreme reliability and performance.
  • Design, develop, and extend an in-house container orchestration platform that achieves superior scalability, isolation, resource efficiency, and fault-tolerance.
  • Collaborate with research teams to architect and optimize compute clusters specifically for large-scale training runs, inference services, and real-time applications.
  • Profile, debug, and resolve complex system-level performance bottlenecks, resource contention, scheduling issues, and reliability problems across the full stack.
  • Own end-to-end infrastructure initiatives with first-principles design, rigorous testing, automation, and continuous optimization to support frontier AI compute demands.

Requirements

  • Deep expertise in virtualization technologies (KVM, Xen, QEMU) and advanced containerization/sandboxing (Kata, Firecracker, gVisor, Sysbox, or equivalent).
  • Strong proficiency in systems programming languages such as C/C++ and Rust.
  • Proven track record profiling, debugging, and optimizing complex system-level performance issues, with deep knowledge of Linux kernel internals, resource management, scheduling, memory management, and low-level engineering.
  • Hands-on experience building or significantly enhancing distributed compute platforms, orchestration systems, or high-performance infrastructure at scale.
  • Ability to thrive in a fast-paced, meritocratic environment with full ownership, high standards, and a focus on rigorous execution.

Nice to have

  • Experience in Linux kernel development, hypervisor extensions, or low-level system programming for compute-intensive workloads.
  • Proven track record operating or designing large-scale AI training/inference clusters (GPU/TPU scale).
  • Experience with custom runtimes, isolation techniques, or bespoke platforms for specialized AI compute.
  • Familiarity with performance tools, tracing, and debugging in production distributed environments.

Culture & Benefits

  • Base salary is just one part of our total rewards package at hirify.global, which also includes equity, comprehensive medical, vision, and dental coverage.
  • Access to a 401(k) retirement plan.
  • Short & long-term disability insurance, life insurance, and various other discounts and perks.

The vacancy text is reproduced without changes.
