This job posting has been archived.
Job description
TL;DR
Senior Software Engineer (Reliability): Build and maintain incident response processes and reliability tooling for a high-scale financial platform, with an emphasis on observability, incident leadership, and operational excellence. The role focuses on designing failure mitigation strategies, improving MTTD/MTTR metrics, and driving post-incident governance across hundreds of services.
Location: Must be based in New York, NY (Hybrid: 3 days/week in-office)
Salary: $196,000–$230,000 USD
Company
A leading financial technology company on a mission to democratize finance for all.
What you will do
- Drive long-term reliability and observability strategy across infrastructure.
- Lead incident mitigation efforts, coordinating service owners and facilitating time-sensitive decisions.
- Develop and maintain incident management processes to minimize customer impact.
- Define and maintain global dashboards and alerts tied to critical user journeys and business metrics.
- Evolve incident response tooling and drive post-incident governance and learning.
- Design next-generation failure mitigation strategies and build frameworks for improved observability.
Requirements
- 5+ years of software engineering experience with significant production operations background.
- 2+ years focused on reliability engineering, infrastructure, or distributed systems.
- Hands-on experience in incident leadership roles (e.g., Incident Commander, primary on-call).
- Deep knowledge of observability frameworks (e.g., OpenTelemetry, Prometheus, Grafana) and fault-tolerant architecture.
- Experience with multi-region/multi-cluster architectures and capacity planning.
- Must be able to work from the New York office at least 3 days per week.
Culture & Benefits
- Performance-driven compensation including bonus programs and equity ownership.
- 100% paid health insurance for employees and 90% for dependents.
- 401(k) matching and employer-paid life/disability insurance.
- Lifestyle wallet for wellness and learning expenses.
- Generous time off including holidays, sick time, and parental leave.
- Exceptional office experience with catered meals and collaborative workspaces.