Company hidden
Updated 22 days ago

Senior Software Engineer II AI Workload Orchestration (AI Engineering)

$165,000 - $242,000
Work format
hybrid
Employment type
full-time
Grade
senior
English
B2
Country
US
Listing from Hirify Global, a list of international tech companies

Job description

TL;DR

Senior Software Engineer II (AI Workload Orchestration): You will help build and operate hirify.global’s Kubernetes-native platform for admitting, scheduling, and operating AI workloads at scale, with an emphasis on improving reliability and performance. The focus is on scaling the system as customer demand and workload complexity continue to grow.

Location: Sunnyvale, CA / Bellevue, WA

Salary: $165,000 to $242,000

Company

hirify.global is The Essential Cloud for AI™ delivering a platform of technology, tools, and teams that enables innovators to build and scale AI with confidence.

What you will do

  • Design, build, and operate Kubernetes-native services for AI workload orchestration and scheduling.
  • Own one or more platform components end-to-end, including design, implementation, testing, and on-call support.
  • Improve scheduling latency, cluster utilization, and workload reliability through metrics-driven engineering.
  • Contribute to architectural discussions across services and influence design decisions within the platform.
  • Work closely with adjacent teams to ensure clean interfaces and integrations.
  • Mentor junior engineers and raise the quality bar for code, design, and operations.

Requirements

  • 5–8 years of professional software engineering experience in distributed systems, cloud infrastructure, or platform engineering.
  • Strong experience building production systems in Go (Python or C++ a plus).
  • Solid understanding of Kubernetes fundamentals, APIs, controllers, and operating services in production.
  • Experience working with scheduling, resource management, or quota-based systems.
  • Proven ability to improve system reliability and performance using data and operational metrics.
  • Comfortable owning services in production and participating in on-call rotations.

Nice to have

  • Experience with Kubernetes-native orchestration frameworks such as Kueue, Volcano, Ray, Kubeflow, or Argo Workflows.
  • Familiarity with GPU-based workloads, ML training, or inference pipelines.
  • Knowledge of scheduling concepts such as quota enforcement, pre-emption, and backfilling.
  • Experience with reliability practices including SLOs, alerting, and incident response.
  • Exposure to AI infrastructure, HPC, or large-scale distributed compute environments.

Culture & Benefits

  • Medical, dental, and vision insurance, 100% paid by hirify.global.
  • Flexible Spending Account and Health Savings Account.
  • 401(k) with a generous employer match.
  • Flexible PTO.
  • A casual work environment.
  • A work culture focused on innovative disruption.
