Company hidden
Posted 3 days ago

Staff + Sr. Software Engineer (Cloud Inference)

$320,000 - $485,000
Work format
Hybrid
Employment type
Full-time
Level
Senior/Lead
English
B2
Country
US
Vacancy from Hirify.Global, a list of international tech companies

Job description

TL;DR

Staff + Sr. Software Engineer (Cloud Inference): Scaling and optimizing Claude's inference server and load balancer across major cloud providers, with an emphasis on validation pipelines and release reliability. Focus on designing high-performance CI/CD infrastructure, reducing merge-to-production cycle time, and resolving cross-platform performance bottlenecks.

Location: Hybrid; must be based in San Francisco, CA or Seattle, WA (expected in office at least 25% of the time)

Salary: $320,000 - $485,000 USD

Company

A public benefit corporation dedicated to creating reliable, interpretable, and steerable AI systems.

What you will do

  • Bring up inference for new model architectures and ship them to cloud platforms in sync with first-party releases.
  • Integrate new inference features, such as structured sampling and prompt caching, into production on CSPs.
  • Design, build, and own CI/CD infrastructure for the inference server and load balancer across cloud platforms.
  • Identify and resolve config drift and cross-platform bugs at the source.
  • Optimize validation processes to reduce merge-to-production cycle time and compute costs.
  • Analyze observability data to identify and remediate performance bottlenecks and cost anomalies.

Requirements

  • Significant experience in high-performance, large-scale distributed systems serving millions of users.
  • Track record of building automation or test infrastructure that measurably improves release velocity.
  • Experience operating services on AWS, GCP, or Azure, with exposure to Kubernetes and Infrastructure as Code.
  • Ability to collaborate cross-functionally and take end-to-end ownership of complex problems.
  • Bachelor's degree or equivalent combination of education and professional experience.

Nice to have

  • Experience with LLM inference optimization, batching, and caching strategies.
  • Understanding of multi-region deployments, request routing, and global traffic management.
  • Proficiency in Python or Rust.

Culture & Benefits

  • Competitive compensation with optional equity donation matching.
  • Generous vacation and parental leave.
  • Flexible working hours and collaborative office environments.
  • Visa sponsorship available for qualified candidates.

Be careful: if an employer asks you to log into their system via iCloud/Google, send a code or password, or run code/software, do not do it; these are scammers. Be sure to click "Report" or contact support. More details in the guide →