Principal Software Engineer (AI)
Job description
TL;DR
Principal Software Engineer (AI): Design and optimize a high-performance inference engine for agentic infrastructure and LLM serving systems, with an emphasis on GPU utilization, memory management, and scalable orchestration. The role focuses on building the GenAI inference stack, integrating new model architectures, and optimizing latency and throughput for large-scale workloads.
Location: Boston, MA; Seattle, WA; or San Francisco, CA. Must be available to attend in-person company trainings and meetings.
Company
delivers an AI platform that enables organizations to develop, deliver, and govern predictive and generative AI at scale while minimizing business risk.
What you will do
- Design, develop, and optimize the inference engine for agentic infrastructure API and LLM serving systems.
- Optimize for latency, throughput, and memory efficiency across GPUs and other hardware accelerators.
- Collaborate with partners like NVIDIA to implement new model architectures, including sparsity and mixture-of-experts.
- Build and maintain instrumentation, profiling, and tracing tooling to identify and resolve system bottlenecks.
- Develop scalable routing, batching, scheduling, and dynamic loading mechanisms for inference workloads.
- Orchestrate federated distributed inference infrastructure across nodes to balance load and handle communication overhead.
Requirements
- 10+ years of engineering experience, with 5+ years in infrastructure, platform, or backend systems.
- Deep expertise in Kubernetes internals, including networking, scheduling, and controller patterns.
- Strong proficiency in Python or Go for building production-quality, observable systems.
- Experience operating across multiple cloud providers (AWS, GCP, Azure) or hybrid environments.
- Strong experience with Helm, container orchestration, and CI/CD automation.
- Proficiency with IaC tools (Terraform, Pulumi) and GitOps workflows.
Nice to have
- Familiarity with Cilium, Kyverno, KEDA, Gateway API, or OPA.
- Experience building and running multi-tenant SaaS platforms.
- Exposure to on-prem delivery models or regulated environments.
- Experience with GPU infrastructure for training and inference.
- Success in driving infrastructure transformation or decomposing legacy systems.
Culture & Benefits
- Comprehensive Medical, Dental, and Vision Insurance.
- Flexible Time Off Program and Paid Holidays.
- Paid Parental Leave.
- Global Employee Assistance Program (EAP).
- Culture based on high standards, rigor, and a commitment to "being better than yesterday".