Company hidden
23 hours ago

Intermediate Site Reliability Engineer (AIOps)

115,000 - 128,000 CAD
Work format
hybrid
Employment type
full-time
Grade
middle
English
B2
Country
US/Canada
Job from Hirify Global, a list of international tech companies

Job description

TL;DR

Intermediate Site Reliability Engineer (AIOps): Design and implement AI-powered solutions for observability, incident response, and cloud-native platform optimization, with an emphasis on ML-based anomaly detection, generative AI automation, and predictive reliability. The role focuses on building self-healing systems, time-series forecasting, and fault-tolerant infrastructure using AI agents and orchestration.

Location: Hybrid in Toronto, Ontario; must reside within commuting distance of the Mississauga or Salt Lake City office. In-office attendance is required, including weekly, bi-weekly, and monthly team events.

Salary: CAD $115,000–$128,000 + bonus + benefits

Company

Leading health tech SaaS platform serving 30,000+ long-term and post-acute care providers with AI-accelerated innovation.

What you will do

  • Build ML-based anomaly detection, pattern recognition, and smart telemetry enhancement for AI-driven observability.
  • Develop event-driven workflows, self-healing systems, and automate incident response using generative AI and agent orchestration.
  • Implement time-series forecasting, predictive modeling, AI-powered autoscaling, and cost-aware resource allocation.
  • Engineer scalable, fault-tolerant cloud-native systems and participate in on-call rotations for critical incident response.
  • Integrate APIs for data exchange and run AIOps workshops to enable teams with AI maturity models and responsible AI practices.

Requirements

  • 5+ years in software engineering with SRE principles and AI/ML in production environments.
  • Strong debugging, problem-solving, and system design skills; passion for automation and operational excellence.
  • Languages: Python, Java, Bash, Terraform.
  • Platforms: Azure, Kubernetes, Docker.
  • Tools: Datadog, Prometheus, AppDynamics, ELK, GitHub Actions, Jenkins, ArgoCD, Spinnaker.
  • Databases: SQL Server, PostgreSQL, MySQL.
  • ML/AI: MCP framework, AI agents, vector stores, LangChain, RAG.

Nice to have

  • Experience with AIOps platforms.
  • Contributions to open-source or AI communities.
  • Familiarity with Responsible AI frameworks.
  • Participation in AI hackathons or conferences.

Culture & Benefits

  • Benefits from Day 1 including retirement plan matching, flexible PTO, wellness programs.
  • Parental & caregiver leaves, fertility & adoption support, continuous development program.
  • Employee Assistance Program, allyship and inclusion communities, employee recognition.
  • Flexibility, growth opportunities, and meaningful work in a founder-led, privately held company.
