Performance Engineer (AI)
Описание вакансии
TL;DR
Performance Engineer (AI): Build and maintain automated performance testing frameworks for LLM inference, microservices, and infrastructure, with an emphasis on VoIP quality characterization and system latency. The role focuses on designing high-scale load testing infrastructure, tuning database performance, and establishing SLIs/SLOs to ensure system reliability in healthcare.
Location: Palo Alto
Company
A generative AI company specializing in safety-focused LLMs designed for autonomous clinical conversations in healthcare.
What you will do
- Design and maintain automated performance testing frameworks for LLM inference, REST/gRPC microservices, and infrastructure (PostgreSQL, Redis, message queues).
- Integrate performance suites into CI/CD to gate deployments against latency and throughput regressions.
- Measure and track VoIP quality metrics (MOS, jitter, packet loss, echo) and build synthetic call load testing infrastructure.
- Define SLIs/SLOs and develop real-time visibility dashboards using Grafana.
- Partner with ML, Speech, Backend, and Infra teams to translate performance findings into prioritized engineering work.
- Contribute to incident reviews and author runbooks to help other engineers instrument their services.
Requirements
- 10+ years in performance engineering with a strong software development and SRE background.
- Proven experience building automated performance test harnesses using tools like Locust, k6, Gatling, or JMeter.
- Deep expertise in PostgreSQL performance tuning and Redis optimization.
- Strong grasp of distributed systems fundamentals, including queueing theory, tail latency, and backpressure.
- Fluency with observability tools such as Prometheus, Grafana, and CloudWatch.
- Working knowledge of SIP and RTP/RTCP, plus experience with VoIP testing tools like SIPp.
Nice to have
- Experience benchmarking ML inference servers (vLLM, TensorRT-LLM, Triton).
- Kubernetes workload profiling and resource right-sizing.
- Chaos engineering experience using Toxiproxy, Gremlin, or Chaos Monkey.
- Background in healthcare tech or high-reliability, latency-sensitive real-time communications.
Culture & Benefits
- Opportunity to work on category-creating technology that transforms patient outcomes at a global scale.
- Collaboration with a world-class team of AI pioneers and researchers from Stanford, Google, Meta, and NVIDIA.
- Strong financial backing from leading investors including a16z, CapitalG, and General Catalyst.
- High-impact role with ownership over performance across the entire technical stack.