TL;DR
Principal Engineer, Operational Excellence & Resilience (Cybersecurity): Building and optimizing technology resilience functions, driving strategy and execution across infrastructure, applications, and products, with an emphasis on technical resilience architecture, disaster recovery leadership, and chaos engineering. Focus on defining comprehensive resilience strategies, ensuring service reliability, and validating continuous improvement in cloud-native cybersecurity environments.
Location: Remote (USA)
Salary: $145,000 - $220,000 per year
Company
hirify.global is a global leader in cybersecurity, dedicated to stopping breaches with its advanced AI-native platform for modern organizations.
What you will do
- Facilitate cross-organizational coordination for technology resilience initiatives, serving as a central point for alignment.
- Own and maintain enterprise-wide technology resilience standards and governance across all domains.
- Drive comprehensive technical resilience architecture, including infrastructure redundancy and fault tolerance.
- Lead enterprise technical recovery strategy development and implementation, focusing on RTO/RPO.
- Partner with product teams to define and implement product resilience standards, including feature flagging and scalability frameworks.
- Provide technical oversight and aggregation of technology resilience risks, establishing and monitoring KPIs.
Requirements
- 10+ years of direct experience in technology resilience, disaster recovery, site reliability engineering, or related technical disciplines in enterprise-scale cloud-native environments.
- Deep understanding of infrastructure redundancy patterns, application resilience design, chaos engineering principles, and enterprise disaster recovery across hybrid cloud architectures.
- Proven experience with feature management systems, progressive deployment strategies, multi-tenant architecture resilience, and scalability engineering practices.
- Proven ability to drive strategic initiatives across large technology organizations, influencing senior stakeholders and leading complex, cross-functional programs.
- Experience establishing and monitoring resilience KPIs, including system uptime, MTTR, RTO/RPO targets, and deployment success metrics.
- Bachelor's degree in Computer Science, Information Systems, Engineering, Risk/Resilience, or equivalent practical experience.
- Ability to provide leadership support during crisis events, including nights and weekends when required.
Nice to have
- Experience leading technology resilience functions in high-growth, cloud-native technology companies.
- Advanced knowledge of chaos engineering tools and practices (Chaos Monkey, Litmus, Gremlin, etc.).
- Experience with modern resilience patterns including circuit breakers, bulkheads, and progressive delivery.
- Background spanning infrastructure operations, site reliability engineering, and product engineering.
- Experience with observability and monitoring platforms supporting resilience objectives.
Culture & Benefits
- Market leader in compensation and equity awards.
- Comprehensive physical and mental wellness programs.
- Competitive vacation and holidays, plus paid parental and adoption leave.
- Professional development opportunities for all employees.
- Employee Networks, geographic neighborhood groups, and volunteer opportunities.
- Vibrant office culture with world-class amenities.
- Great Place to Work Certified across the globe.