Company hidden
Posted 2 hours ago

Senior SRE Engineer (Observability Focus)

Work format: hybrid
Employment type: full-time
Grade: senior
English: B2
Country: Poland/Cyprus/Bulgaria

Job description

TL;DR

Senior SRE Engineer (Observability Focus): Own end-to-end observability by designing and operating the telemetry stack that provides production visibility across hybrid AWS and on-prem environments, with an emphasis on metrics, logs, and traces at scale. Focus on building VictoriaMetrics/OpenSearch/OpenTelemetry pipelines, running Kafka-based telemetry transport, and delivering Grafana dashboards and alerting that engineers actually use.

Location: Warsaw (Poland) / Sofia (Bulgaria) / Limassol (Cyprus) / Remote

Company

hirify.global is a trading platform expanding globally with a focus on cutting-edge technology and client experience.

What you will do

  • Own the full observability stack: VictoriaMetrics (metrics), OpenSearch (logs), and OpenTelemetry (traces) from pipeline design to day-2 operations.
  • Architect and operate VictoriaMetrics clusters, including vmagent scraping, remote write, vmalert rules, and cardinality control.
  • Operate OpenSearch clusters with ISM, hot-warm-cold architecture, shard tuning, and ingest pipelines via Data Prepper.
  • Build and maintain OpenTelemetry Collector pipelines (receivers/processors/exporters) and instrument services across Java, Python, and JS/TS.
  • Run Kafka as the telemetry transport layer (topic design, partition strategy, consumer lag monitoring, throughput tuning).
  • Build Grafana dashboards/alerting and improve sampling strategies, batching, and context propagation; contribute to incident response and post-mortems.

Requirements

  • 6+ years in DevOps/SRE/platform engineering, including 2+ years focused on observability tooling at production scale.
  • Hands-on VictoriaMetrics (or Prometheus) expertise: MetricsQL/PromQL, exporters, service discovery, remote write, downsampling, and retention management.
  • Solid OpenSearch/Elasticsearch skills: cluster operations, Query DSL, ISM policies, and ingest pipeline design.
  • Production experience with OpenTelemetry: Collector configuration, OTLP, context propagation, and instrumentation across multiple languages.
  • Strong Kafka skills: producer/consumer patterns, consumer group management, Kafka Connect, Schema Registry, and JMX monitoring (Strimzi on Kubernetes is a plus).
  • Working knowledge of Kubernetes (operators, Helm), Argo CD/GitOps, and Terraform/Ansible; scripting in Bash or Python for automation.

Culture & Benefits

  • Hybrid work model (#LI-Hybrid) with additional workation days to work remotely from anywhere (restrictions apply).
  • Generous time off with an annual leave policy and extra paid volunteer days.
  • Comprehensive health and pension benefits, including location-specific perks.
  • Employee referral program.
