TL;DR
Technical Program Manager, Compute (AI): Drive the planning, coordination, and execution of programs that keep hirify.global's compute infrastructure running efficiently at scale, with an emphasis on compute lifecycle, supply procurement, and capacity allocation. The role focuses on partnering with Infrastructure, Systems, Research, Finance, and Capacity Engineering to shape processes and tooling.
Location: Must be in one of our offices at least 25% of the time (San Francisco, CA | New York City, NY | Seattle, WA)
Salary: $365,000 - $435,000 USD
Company
hirify.global’s mission is to create reliable, interpretable, and steerable AI systems.
What you will do
- Own and drive critical programs across the compute lifecycle, coordinating execution across multiple engineering, research, and operations teams
- Build and maintain operational visibility into the compute fleet, ensuring the organization has a clear picture of supply, demand, utilization, and health
- Lead cross-functional coordination for compute transitions: bringing new capacity online, migrating workloads, and managing decommissions across cloud providers and hardware platforms
- Partner with engineering and research leadership to navigate competing priorities and drive alignment on how compute resources are planned, allocated, and used
- Identify and close operational gaps across the compute pipeline, whether through new tooling, improved processes, or better cross-team communication
Requirements
- Have 7+ years of technical program management experience in infrastructure, platform engineering, or compute-intensive environments
- Have led complex, cross-functional programs involving multiple engineering teams with competing priorities and ambiguous requirements
- Have experience working with research or ML teams and translating their needs into operational plans and technical requirements
- Are comfortable diving deep into technical details (cloud infrastructure, cluster management, job scheduling, resource orchestration) while maintaining program-level visibility
- Thrive in ambiguous, fast-moving environments where you need to define scope and build processes from the ground up
- Have strong communication skills and can engage credibly with engineers, researchers, finance, and executive leadership
Nice to have
- Experience managing compute capacity across multiple cloud providers (AWS, GCP, Azure) or hybrid cloud/on-premise environments
- Familiarity with job scheduling, resource orchestration, or workload management systems (Kubernetes, Slurm, Borg, YARN, or custom schedulers)
- Experience with GPU or accelerator infrastructure, including the unique challenges of large-scale ML training and inference workloads
Culture & Benefits
- Competitive compensation and benefits
- Optional equity donation matching
- Generous vacation and parental leave
- Flexible working hours