Software Engineer (Site Reliability Engineering)

Work format

onsite

Employment type

fulltime

Grade

principal

English

Country

Vacancy from Hirify RU Global, a list of companies with Eastern European roots

Job description

TL;DR

Software Engineer (Site Reliability Engineering): Building and optimizing high-scale cloud services with an emphasis on platform uptime, performance, and health data visualization. The role focuses on turning monitoring strategy into active, high-fidelity signals for real-time alerting and incident response, and on integrating reliability testing into the software development lifecycle.

Location: Onsite in San Francisco, Seattle, Palo Alto, or Bellevue, USA

Company

hirify.global is a technology organization managing high-level frameworks to measure platform uptime and performance, bridging reporting and individual engineering teams.

What you will do

Provide input into long-range platform requirements and operational guidelines, making health data actionable for service owners.
Analyze and understand service telemetry, driving continuous improvement of health signals.
Partner with internal engineering teams to integrate global availability standards into monitoring pipelines and automated alerting flows.
Identify and reduce onboarding friction by using automated test suites to streamline reliability signals.
Serve as a technical subject matter expert for centralized infrastructure services (logging, monitoring, and data platforms).
Lead the integration of failure signals into standard engineering workflows, so that failures generate automated work items and prompt proactive investigations.

Requirements

A related technical degree.
5+ years of proven experience in production environments (software engineer, systems engineer, service owner, or lead developer).
Fluency in Java or a similar object-oriented language (Python, C++, etc.).
Deep understanding of telemetry systems and experience building or managing production monitoring and alerting frameworks.
Experience using Linux environments and the ability to navigate complex, distributed system architectures.
Familiarity with core web technologies: HTTP, JSON, REST, and XML.

Nice to have

Previous experience in a Service Owner or Technical Lead role within a high-scale, multi-tenant cloud environment.
Strong background in Site Reliability Engineering (SRE) principles and industry-standard availability best practices.
Experience with automated testing tools and practices (e.g., Selenium, integration testing, or chaos engineering).
Log parsing and data analysis experience using platforms such as Splunk or ELK.
Experience with SQL and relational databases (PostgreSQL, Oracle, etc.).

Culture & Benefits

Be part of the Availability Standards team, influencing platform uptime and performance.
Follow a consultative engineering approach, partnering with service owners.
Advocate for the customer and influence the product roadmap by ensuring world-class availability.
Work within a team focused on maintaining world-class availability.

