Senior Observability Engineer (SaaS)
Job description
TL;DR
Senior Observability Engineer (SaaS): Design, build, and operate a central observability platform based on the Grafana stack on Kubernetes, with an emphasis on platform reliability, automation, and AI-assisted insights. The role focuses on defining telemetry standards, scaling multi-tenant infrastructure, and enabling cross-functional teams through self-service capabilities.
Location: Hybrid options; remote work is available only within the EU. Offices in Munich (Germany) and Timisoara (Romania).
Company
The company is a market leader in B2B workforce management software, listed on the SDAX and TecDAX, and focuses on optimizing the balance between profitability and people.
What you will do
- Design, scale, and operate the central observability stack (Grafana, Loki, Tempo, Prometheus/Mimir) on Kubernetes across multiple cloud providers.
- Analyze stakeholder requirements to develop the observability roadmap and translate them into clear priorities and user stories.
- Automate routine operations, provisioning, and capacity checks to reduce manual overhead and increase reliability.
- Define and evolve global standards for metrics, logs, traces, and alerting patterns to enable self-service for product teams.
- Implement and operate AI observability components and evaluate AI-assisted diagnostics for incident response.
- Ensure security, compliance, and cost-effectiveness through RBAC, tenant isolation, and cardinality guardrails.
Requirements
- Proven background as an Observability, SRE, or Platform Engineer in cloud-native environments.
- Deep expertise with the Grafana stack (Prometheus, Mimir, Loki, Tempo) in production at scale.
- Strong proficiency with Kubernetes, microservices, and modern instrumentation including OpenTelemetry.
- Experience operating a multi-tenant observability platform within a SaaS context.
- Ability to align technical roadmaps with business goals and communicate effectively with engineering stakeholders.
- Must be based within the EU for remote arrangements.
Nice to have
- Exposure to Langfuse or ClickHouse.
- Experience with major cloud hyperscalers and their native observability services.
Culture & Benefits
- Competitive rewards including profit-sharing and an employee stock program.
- Flexible work culture with hybrid options and 30 days of vacation.
- Structured onboarding and continuous leadership development via the Academy.
- Health and wellbeing support, including corporate wellness programs and Wellhub membership.
- Engaging environment featuring seasonal company events, team retreats, and an in-house barista.