Staff AI Engineer (Observability)
Мэтч & Сопровод
Для мэтча с этой вакансией нужен Plus
Описание вакансии
TL;DR
Staff AI Engineer (Observability): Building and delivering high-performance AI-driven features for observability platforms with an accent on incident triage, automated analysis, and LLM-powered workflows. Focus on prototyping, scaling agentic components, and integrating AI solutions into complex infrastructure and developer workflows.
Location: Must be based in the USA (USA time zones only)
Salary: $174,986 - $220,000
Company
is the company behind the open observability cloud, providing flexible, scalable tools for monitoring and incident response used by over 35 million users.
What you will do
- Develop high-performance AI features to help users detect, triage, and resolve incidents using observability data.
- Prototype, test, and validate LLM- and agent-powered workflows for incident lifecycle management.
- Collaborate with product managers, designers, and data analysts to integrate agentic components into developer tools.
- Take full ownership of AI solutions, ensuring they are scalable, maintainable, and aligned with user needs.
- Utilize AI tools effectively to enhance both product functionality and internal development workflows.
Requirements
- Must be based in the USA and available during USA time zones.
- Strong experience with LLMs, prompt engineering, and building GenAI-powered applications.
- Proven track record of delivering production-grade software used by real users.
- Experience working in cloud-native environments (AWS, GCP, or Azure).
- Proficiency in using observability tools to troubleshoot system behavior.
Nice to have
- Experience building agent frameworks or multi-agent workflows.
- Familiarity with infrastructure tooling like Kubernetes, Docker, or Terraform.
- Experience with model fine-tuning techniques.
- Background in building observability tooling.
Culture & Benefits
- 100% remote-first company with a global collaborative culture.
- Equity provided via Restricted Stock Units (RSUs) for all team members.
- Global annual leave policy of 30 days per annum, including reserved shutdown days.
- Innovation-driven environment with high autonomy and transparent communication.
- Access to frontier AI models and a company-funded budget for AI-assisted development tools.
Будьте осторожны: если работодатель просит войти в их систему, используя iCloud/Google, прислать код/пароль, запустить код/ПО, не делайте этого - это мошенники. Обязательно жмите "Пожаловаться" или пишите в поддержку. Подробнее в гайде →