Staff Site Reliability Engineer

$174,000 - $226,000

Work format

hybrid

Employment type

full-time

Grade

senior

English

Country

Job description

Text:

TL;DR

Staff Site Reliability Engineer (SRE): Improve and protect the reliability, performance, and operability of production systems while evolving an AWS-based infrastructure, with an emphasis on modern SRE practices, including SLIs/SLOs, error budgets, and reliability reviews. The role focuses on leading multi-sprint, multi-engineer reliability or performance initiatives and building tooling and automation using LLMs/AI tools.

Location: Ability to work a hybrid schedule – Tuesday–Thursday in-office. We welcome applicants from across the U.S. where we are registered to do business and able to support employment. Currently, this excludes the following states: Alabama, Alaska, Connecticut, Hawaii, Kentucky, Mississippi, Nebraska, New Mexico, North Dakota, Rhode Island, South Dakota, West Virginia, and Wyoming.

Salary: $174,000 - $226,000 base salary range + annual bonus

Company

hirify.global is building a software platform that empowers today’s commercial contractors, transforming the multi-billion dollar commercial contracting industry.

What you will do

Own reliability domains end-to-end, including strategy, roadmap, and execution.
Drive modern SRE practices across services, including SLIs/SLOs, error budgets, and reliability reviews.
Lead multi-sprint, multi-engineer reliability or performance initiatives, coordinating work across teams.
Design and maintain end-to-end observability (metrics, logs, traces, dashboards, and alerts).
Partner with product and engineering teams to design reliable services and influence system design.
Contribute code to services, tooling, and automation, and use LLMs/AI tools to accelerate delivery.

Requirements

8+ years of experience operating complex, user-facing SaaS systems in production and working on reliability-focused initiatives.
Proven experience leading multi-sprint, multi-engineer projects to successful completion with clear business impact.
Thorough understanding of modern SRE practices, such as defining and implementing SLIs/SLOs and error budgets.
Strong software engineering skills in at least one modern language (Python or Node.js/TypeScript).
Strong observability skills, including designing metrics, logging, and tracing for multi-service systems.
Experience working with AWS in production and collaborating within Infrastructure as Code workflows.

Culture & Benefits

Generous equity grant.
Flexible PTO and hybrid work schedules.
Work from home stipend.
Hubs in Los Angeles, San Francisco, Toronto, and Raleigh with hybrid work schedules and lunch provided for in-office days.
Fast-paced, collaborative, and dynamic work environment.
Opportunities for growth and career advancement.