Senior Software Engineer (Observability)
Job description
TL;DR
Senior Software Engineer (Observability): Design, build, and maintain the Grafana-based observability platform that provides deep visibility into system health, performance, and behavior, with an emphasis on metrics, logs, traces, and SLO/SLI frameworks. Focus on scalable pipelines for high-cardinality data, OpenTelemetry instrumentation, incident detection, and continuous improvement of monitoring capabilities.
Remote (100% distributed global team)
Company
Pioneering the Agentic Data Plane (ADP), a new category in AI infrastructure for connecting AI agents with enterprise data and systems, built on a multi-modal data streaming engine.
What you will do
- Design, build, and maintain the observability platform using the Grafana stack (Grafana, Mimir, Loki, Tempo, Alloy/Agent)
- Develop and optimize dashboards, alerts, and SLO/SLI frameworks for actionable insights
- Build scalable metrics, logging, and distributed tracing pipelines for cloud and on-premises environments
- Instrument services with OpenTelemetry for standards-based telemetry
- Partner with platform teams to improve incident detection, root-cause analysis, and MTTR
- Evaluate new observability tools and contribute to internal tooling and automation
- Participate in on-call rotation
Requirements
- 5+ years in software engineering focused on observability, monitoring, or infrastructure
- Deep experience with Grafana stack (Grafana, Mimir/Prometheus, Loki, Tempo) in production
- Strong understanding of metrics, logging, and distributed tracing at scale
- Experience with OpenTelemetry for instrumentation
- Proficiency in a systems-level language (Go preferred) and scripting (Python, Bash)
- Experience with Kubernetes in public clouds (AWS, GCP, Azure)
- Comfortable working in a 100% distributed team and collaborating via GitHub
- Understanding of time-series databases, log aggregation, and query languages (PromQL, LogQL)
Nice to have
- Strong Go understanding
- SaaS platform observability at scale
- eBPF-based observability or profiling (Pyroscope, Parca)
- Infrastructure-as-code (Terraform, Pulumi), GitOps
- Streaming platforms (Kafka, )
- Multi-tenant observability platforms
- Open-source contributions (Grafana, Prometheus, OpenTelemetry)
Culture & Benefits
- Fast-moving, diverse, people-first organization with global teams
- Culture based on trust, transparency, communication, and kindness