Company hidden
7 days ago

Senior Site Reliability Engineer (AI)

Work format
hybrid
Employment type
full-time
Grade
senior
English
B2
Country
Germany

Job description

TL;DR

Senior Site Reliability Engineer (AI): Design, maintain, and optimize complex distributed systems with an emphasis on Kubernetes administration, monitoring solutions, and system reliability. Focus on building scalable infrastructure, incident response, and continuous improvement in a hybrid work environment.

Location: Hybrid in Munich, Germany

Company

hirify.global is a global communications platform powered by Language AI, focused on breaking down language barriers with human-sounding translations and intelligent writing suggestions.

What you will do

  • Design, maintain, and optimize complex distributed systems ensuring high availability and performance.
  • Manage and troubleshoot Kubernetes environments as the team’s administrator.
  • Develop and implement monitoring solutions to ensure system reliability and meet SLOs.
  • Participate in on-call rotations, respond to incidents, and contribute to post-mortem analyses.
  • Promote a culture of continuous improvement by proactively identifying and solving problems.

Requirements

  • Location: Hybrid work with on-site presence in Munich, Germany.
  • Proven experience with complex distributed systems and low-level interactions.
  • Hands-on expertise in Kubernetes administration and abstractions.
  • Professional experience in Python or Go, plus strong software engineering and system design skills.
  • Experience building monitoring solutions for operational excellence.
  • Excellent communication skills for technical and non-technical stakeholders.

Nice to have

  • Experience with AWS and Terraform for infrastructure management.
  • Familiarity with Prometheus, Grafana, Loki, OpenTelemetry, PagerDuty, ArgoCD, FastAPI, Redis, RabbitMQ, and Postgres.
  • Previous on-call experience handling incidents effectively.

Culture & Benefits

  • Diverse international team with over 90 nationalities and global presence.
  • Open communication and regular feedback culture.
  • Hybrid work schedule with flexible hours and team synchronization.
  • Regular in-person team events and monthly hack days.
  • 30 days annual leave plus mental health resources.
  • Virtual shares that foster an ownership mindset, with rewards linked to company growth.
  • Competitive benefits tailored to employee location.
