TL;DR
Staff Production Engineer (Security, AI): Provide technical leadership and strategic direction for the reliability, scalability, and security of a cloud platform and its security systems, with an emphasis on designing scalable security infrastructure, security automation, and incident response. The role focuses on influencing architecture, driving long-term technical strategy for operational excellence, and mentoring senior engineers.
Location: This role is based in our offices in Livingston, NJ; New York, NY; Sunnyvale, CA; or Bellevue, WA. Remote work may be considered for candidates located more than 30 miles from an office, based on specialized skill sets. Applicants must be U.S. persons (U.S. citizen, national, lawful permanent resident, refugee, or asylee) or eligible to obtain required export authorization.
Salary: $188,000–$275,000
Company
hirify.global is the AI Hyperscaler™, delivering a cloud platform of cutting-edge services powering the next wave of AI, and became a publicly traded company (Nasdaq: CRWV) in March 2025.
What you will do
- Lead the design, implementation, and evolution of scalable, highly available security infrastructure using Kubernetes and cloud-native technologies.
- Define and drive long-term technical strategy for reliability, security automation, and operational excellence.
- Build and standardize automation, monitoring, and alerting systems to proactively identify and mitigate systemic risks.
- Partner with engineering teams to improve system performance, reduce latency, and increase service uptime.
- Serve as a technical escalation point during major incidents, leading complex incident response and root cause analysis.
- Mentor senior engineers and set best practices for reliability engineering, security operations, and infrastructure management.
Requirements
- 8+ years of experience in site reliability engineering, DevOps, security engineering, or security operations.
- Deep expertise with Kubernetes, container orchestration, and cloud-native architectures at scale.
- Advanced proficiency in automation and systems programming using languages such as Python, Go, or Bash.
- Proven experience operating and securing large-scale distributed systems in high-availability environments.
- Demonstrated ability to lead complex technical initiatives and influence teams beyond direct ownership.
- Must be a U.S. person (U.S. citizen, national, lawful permanent resident, refugee, or asylee) or eligible to obtain required export authorization.
Nice to have
- Strong familiarity with observability platforms such as Prometheus, Grafana, or Datadog.
- Experience designing and operating security-critical infrastructure in major cloud environments (AWS, Azure, GCP).
- Track record of driving reliability and security improvements across multi-team or company-wide initiatives.
Culture & Benefits
- Medical, dental, and vision insurance with 100% company-paid premiums.
- Company-paid Life Insurance and voluntary supplemental life insurance.
- Flexible Spending Account and Health Savings Account.
- 401(k) with a generous employer match and Employee Stock Purchase Program (ESPP).
- Mental Wellness Benefits through Spring Health and Family-Forming support by Carrot.
- Paid Parental Leave and flexible, full-service childcare support with Kinside.
- Flexible PTO and a casual work environment.
- Catered lunch each day in our office and data center locations.