Company hidden
Posted 3 days ago

Staff + Sr. Software Engineer (Cloud Inference)

$320,000 - $485,000
Work format
Hybrid
Employment type
Full-time
Level
Senior/Lead
English
B2
Country
US
Vacancy from Hirify.Global, a list of international tech companies

Job description

TL;DR

Staff + Sr. Software Engineer (Cloud Inference): Scaling and optimizing Claude's inference server and load balancer across major cloud providers, with an emphasis on validation pipelines and release reliability. Focus on designing high-performance CI/CD infrastructure, reducing merge-to-production cycle time, and resolving cross-platform performance bottlenecks.

Location: Hybrid; must be based in San Francisco, CA or Seattle, WA (expected in office at least 25% of the time)

Salary: $320,000 - $485,000 USD

Company

A public benefit corporation dedicated to creating reliable, interpretable, and steerable AI systems.

What you will do

  • Bring up inference for new model architectures and ship them to cloud platforms in sync with first-party releases.
  • Integrate new inference features, such as structured sampling and prompt caching, into production on CSPs.
  • Design, build, and own CI/CD infrastructure for the inference server and load balancer across cloud platforms.
  • Identify and resolve config drift and cross-platform bugs at the source.
  • Optimize validation processes to reduce merge-to-production cycle time and compute costs.
  • Analyze observability data to identify and remediate performance bottlenecks and cost anomalies.

Requirements

  • Significant experience in high-performance, large-scale distributed systems serving millions of users.
  • Track record of building automation or test infrastructure that measurably improves release velocity.
  • Experience operating services on AWS, GCP, or Azure, with exposure to Kubernetes and Infrastructure as Code.
  • Ability to collaborate cross-functionally and take end-to-end ownership of complex problems.
  • Bachelor's degree or equivalent combination of education and professional experience.

Nice to have

  • Experience with LLM inference optimization, batching, and caching strategies.
  • Understanding of multi-region deployments, request routing, and global traffic management.
  • Proficiency in Python or Rust.

Culture & Benefits

  • Competitive compensation with optional equity donation matching.
  • Generous vacation and parental leave.
  • Flexible working hours and collaborative office environments.
  • Visa sponsorship available for qualified candidates.

Be careful: if an employer asks you to log into their system via iCloud/Google, send a code or password, or run code/software, do not do it; these are scammers. Be sure to click "Report" or contact support. More details in the guide →