Posted 3 hours ago

Manager, Software Engineering (Resilience Engineering)

178,000 - 228,000 CAD
Work format
Remote (Canada/United States only)
Employment type
Full-time
Grade
Lead
English
B2
Country
US/Canada
Listing from Hirify RU Global, a list of companies with Eastern European roots

Job description

TL;DR

Manager, Software Engineering (Resilience Engineering) (Infrastructure): Lead the Resilience Engineering team to ensure system reliability through proactive validation and chaos engineering, with an emphasis on production load testing and fault injection. The role focuses on designing platforms for safe production experimentation and establishing reliability guardrails.

Location: Remote (Must be based in Canada)

Salary: $178,000 - $228,000 CAD per year

Company

Affirm is reinventing credit to make it more honest and friendly, offering consumers flexible buy now, pay later options without hidden fees.

What you will do

  • Define and drive the vision for resilience engineering, establishing production load testing and chaos engineering as core practices.
  • Lead and mentor a team of engineers building platforms and tooling for safe production experimentation.
  • Own the design and evolution of platforms for controlled production load testing and fault injection with strong safeguards.
  • Build systems providing end-to-end observability, traceability, and auditability for all resilience experiments.
  • Partner with infrastructure, product, and security leadership to embed resilience validation into the software development lifecycle.
  • Evangelize a culture of proactive failure testing across the engineering organization.

Requirements

  • Proven experience leading engineering teams in reliability, infrastructure, or distributed systems.
  • Hands-on experience with production load testing, chaos engineering, or large-scale system validation (e.g., Gremlin, Harness).
  • Strong understanding of failure modes in distributed systems, including latency and cascading outages.
  • Experience building systems with strong safety guarantees (isolation, rate limiting, guardrails).
  • Proficiency with cloud-native environments (AWS, Kubernetes) and observability tooling.
  • Strong programming background in Python, Kotlin, Java, or similar.
  • Must be based in Canada.

Culture & Benefits

  • 100% subsidized medical, dental, and vision coverage for employees and their dependents.
  • Generous flexible spending wallets for technology, food, lifestyle, and family-forming expenses.
  • Competitive vacation and holiday schedules that give employees time to rest and recharge.
  • Employee Stock Purchase Plan (ESPP) allowing the purchase of company shares at a discount.
  • Remote-first culture with high flexibility for employees within their country of employment.
