Staff Software Engineer (AI Inference)
Job description
TL;DR
Staff Software Engineer (AI Inference): Building and optimizing high-performance inference systems for large-scale AI models, with an emphasis on compute efficiency, intelligent request routing, and fleet-wide orchestration. Focus on solving complex distributed systems challenges across diverse AI accelerators and cloud platforms to serve millions of users and enable breakthrough research.
Location: London, UK. This role operates under a location-based hybrid policy, requiring staff to be in one of the offices at least 25% of the time. Visa sponsorship is available, with reasonable efforts made to secure a visa if an offer is extended.
Salary: £325,000 – £390,000 GBP
Company
Anthropic is a public benefit corporation with a mission to create reliable, interpretable, and steerable AI systems that are safe and beneficial for users and society.
What you will do
- Identify and address key infrastructure blockers for serving Claude to millions of users globally.
- Design intelligent routing algorithms to optimize request distribution across thousands of accelerators.
- Autoscale the compute fleet to dynamically match supply with demand for production, research, and experimental workloads.
- Build production-grade deployment pipelines for releasing new AI models.
- Integrate new AI accelerator platforms to maintain a hardware-agnostic competitive advantage.
- Analyze observability data to fine-tune performance based on real-world production workloads.
Requirements
- Significant software engineering experience, particularly with distributed systems.
- Familiarity with performance optimization, large-scale service orchestration, and intelligent request routing.
- Experience implementing and deploying machine learning systems at scale.
- Proficiency in Python or Rust.
- At least a Bachelor's degree in a related field or equivalent experience.
Nice to have
- Familiarity with LLM inference optimization, batching strategies, and multi-accelerator deployments.
- Experience with load balancing or traffic management systems.
- Knowledge of Kubernetes and cloud infrastructure (AWS, GCP).
Culture & Benefits
- Competitive compensation and benefits with optional equity donation matching.
- Generous vacation and parental leave.
- Flexible working hours.
- Collaborative environment focused on high-impact AI research.
- Emphasis on advancing long-term goals of steerable, trustworthy AI.
- Regular research discussions to ensure pursuit of high-impact work.