Posted 1 day ago

Site Reliability Engineer

$112,000 - $156,000
Work format: remote (Global)
Employment type: full-time
English: B2


Job description

Site Reliability Engineer
#remote
Company: Wormholefoundation
Salary: $112k - $156k estimated
🔹What you'll be doing:
-Act as first responder and incident commander during production incidents
-Lead incident triage, root cause analysis, and retrospective documentation
-Build detailed incident timelines and preventative runbooks
-Respond to incidents related to: performance issues, CCQ failures or degraded throughput, observability pipeline outages, and core Wormhole products
-Deliver remediation recommendations and implement approved fixes
-Improve reliability and uptime across all Wormhole services
-Strengthen observability, monitoring, and alerting systems
-Harden infrastructure for security and operational resiliency
-Enhance deployment workflows and reduce operational friction
-Lead incident response, analysis, and continuous improvement
-Support operational tooling used by engineering, DevOps, and validator partners

Who you are:
-Relevant tertiary qualifications in computer science or a closely related field (bachelor's/master's) and/or at least five years of relevant work experience
-Established experience as incident commander working with multiple stakeholders in a global team
-Familiarity with metrics and log analysis tools (e.g., Grafana), incident response tools (e.g., PagerDuty), GitHub administration and related tools
-Deep understanding of reliability engineering, observability, and incident response for distributed systems
-Ability to write and debug code in any of the following: Go, Rust, Java
-Strong experience operating Grafana, Datadog, or Splunk and/or Kubernetes in production environments
-Experience securing distributed systems and public-facing infrastructure
-Ability to operate independently, document clearly, and lead during incidents
-Solid understanding of cloud computing environments (AWS and GCP preferred) and willingness to keep up to date with their changing offerings
-Excellent and proactive written and verbal communication
-Ideal candidate will be based in the ET or GMT time zone or be able to work those hours
Contact:

