TL;DR
Senior Staff Technical Program Manager (Reliability): Lead strategy, execution, and continuous improvement of critical reliability initiatives across multi-cloud infrastructure and product engineering teams, with an emphasis on large-scale distributed systems, cloud infrastructure, and operational excellence. Focus on driving multi-quarter programs, managing risk, and strengthening reliability culture across the organization.
Location: Onsite in Bellevue, Mountain View, or San Francisco, United States
Salary: $191,400–$252,720 USD
Company
hirify.global is a leading data and AI company providing a unified data intelligence platform used by over 10,000 organizations worldwide, including many Fortune 500 companies.
What you will do
- Lead long-term reliability strategy and multi-quarter roadmaps partnering with senior engineering leadership.
- Drive end-to-end execution of critical reliability programs including planning, risk management, and delivery.
- Partner with engineering teams to influence technical direction and improve scalability, fault tolerance, and operational tooling.
- Elevate reliability culture by driving adoption of best practices such as error budgets, incident reviews, and design-for-resilience patterns.
- Define and implement program governance, metrics, and documentation to scale reliability efforts.
Requirements
- Must have 10+ years of experience managing large-scale technical programs in cloud infrastructure, distributed systems, or SRE.
- Experience with multiple hyperscale cloud providers (AWS, Azure, GCP) and multi-region architectures.
- Proven success leading reliability programs focused on availability, failover, and operational excellence.
- Strong understanding of infrastructure, distributed systems, and SRE practices; engineering or SRE background preferred.
- Ability to manage complex cross-organizational dependencies and multi-quarter timelines.
- Ability to work onsite at one of the United States locations listed above.
Nice to have
- Background in distributed systems engineering, platform infrastructure, or cloud services.
- Experience with large-scale compute fleets, container orchestration, and autoscaling.
- Familiarity with reliability methodologies like SLOs, chaos engineering, and incident management frameworks.
- Expertise with Jira or equivalent program tracking tools.
- Advanced degree in Computer Science, Engineering, or related field.
Culture & Benefits
- Comprehensive benefits and perks tailored to employee needs.
- Commitment to diversity and inclusion in hiring and workplace culture.
- Opportunities to work with cutting-edge data and AI infrastructure technologies.