TL;DR
Observability Engineer (AI): Own and evolve the observability and monitoring platform for AI infrastructure, with an emphasis on designing and maintaining high-quality metrics pipelines, dashboards, and actionable alerts. The role focuses on establishing observability standards across services, partnering with engineering teams on instrumentation, and supporting incident response.
Location: Onsite in Las Vegas, Nevada, USA
Company
hirify.global Cloud builds seamless, secure, reliable, and resilient AI infrastructure at scale, empowering builders and supporting AI innovation.
What you will do
- Own and evolve the observability and monitoring platform, with Grafana and Prometheus at its core.
- Design, build, and maintain high-quality metrics pipelines.
- Create clear, actionable Grafana dashboards and define meaningful, low-noise alerts.
- Establish and enforce observability standards across services (metrics, logs, traces).
- Partner with engineering teams to instrument applications correctly.
- Support incident response by helping teams understand issues quickly.
Requirements
- Strong hands-on experience with Grafana and Prometheus.
- Deep understanding of metrics-based observability.
- Experience designing monitoring and alerting systems at scale.
- Strong knowledge of alerting best practices (burn rates, SLO-based alerts).
- Experience working with distributed systems and cloud or Kubernetes environments.
- Ability to reason about system behavior using telemetry.
Nice to have
- Experience with OpenTelemetry.
- Familiarity with logs and traces (Loki, Tempo, Jaeger).
- Kubernetes observability experience.
- Infrastructure-as-Code experience (Terraform, Helm).
Culture & Benefits
- Mission-driven company with competitive salary and stock options.
- 100% paid Medical, Dental, and Vision insurance.
- Flexible Spending Account and 401(k).
- Flexible PTO, Paid Holidays, and Parental Leave.
- Mental Health Benefits through Spring Health.
- Opportunity to build the future of AI infrastructure at exascale.