Senior Scalability Engineer (Observability)
Описание вакансии
TL;DR
Senior Scalability Engineer (Observability): Designing and building an organization-wide observability platform and internal productivity tools, with an emphasis on the LGTM stack, high-performance log indexing, and SQL analytics. The role focuses on architecting custom observability products in Rust, Python, and React to improve system visibility and developer efficiency.
Location: Remote (must be based in the US; salary range based on New York)
Salary: $160,000 - $220,000 USD
Company
Enterprise health technology company providing pharmacy benefit management (PBM) and health benefit management solutions for employers and health plans.
What you will do
- Architect and maintain the LGTM stack (Loki, Grafana, Tempo, Mimir/Prometheus) as the primary observability platform across all engineering teams.
- Develop production-grade internal tools using React/TypeScript frontends with Python and Rust backends to improve debugging and monitoring.
- Build high-performance log indexing systems in Rust to enable sub-second search across billions of log lines.
- Implement SQL analytics solutions leveraging AWS Athena, DuckDB, or ClickHouse for deep investigations and trend analysis.
- Design intelligent alerting systems and dashboards to reduce noise and accelerate incident response.
- Define organization-wide observability standards and mentor engineers on instrumentation and logging best practices.
Requirements
- 10+ years of software or infrastructure engineering experience with demonstrated technical leadership.
- Proficiency in React/TypeScript for frontend development and Python (Flask/SQLAlchemy) for backend services.
- Deep production experience with the LGTM stack and AWS CloudWatch Logs/Metrics.
- Experience with SQL-based log analytics (AWS Athena, DuckDB) and search engines like OpenSearch or Elasticsearch.
- Proven track record of handling high-volume structured and unstructured data at scale.
- Must be based in the United States.
Nice to have
- Production experience with Rust for building high-performance data processing or search systems.
- Expertise in Infrastructure as Code using Terraform.
- Deep knowledge of OpenTelemetry, Jaeger, or Zipkin.
- Prior experience in Pharmacy Benefits Management (PBM) or healthcare technology.
Culture & Benefits
- Fully remote work arrangement.
- Opportunity to build foundational infrastructure for a rapidly growing healthcare platform.
- Collaborative Agile/Scrum engineering environment.
- Strong commitment to diversity, equity, and inclusive workplace culture.