Company hidden
3 days ago

Principal Software Engineer (AI)

Work format
hybrid
Employment type
fulltime
Grade
senior
English
b2
Country
US

Job description

TL;DR

Principal Software Engineer (AI): Design and optimize a high-performance inference engine for agentic infrastructure and LLM serving systems, with an emphasis on GPU utilization, memory management, and scalable orchestration. The role focuses on building the GenAI inference stack, integrating new model architectures, and optimizing latency and throughput for large-scale workloads.

Location: Boston, MA; Seattle, WA; or San Francisco, CA. Must be available to attend in-person company trainings and meetings.

Company

hirify.global delivers an AI platform that enables organizations to develop, deliver, and govern predictive and generative AI at scale while minimizing business risk.

What you will do

  • Design, develop, and optimize the inference engine for agentic infrastructure API and LLM serving systems.
  • Optimize for latency, throughput, and memory efficiency across GPUs and other hardware accelerators.
  • Collaborate with partners like NVIDIA to implement new model architectures, including sparsity and mixture-of-experts.
  • Build and maintain instrumentation, profiling, and tracing tooling to identify and resolve system bottlenecks.
  • Develop scalable routing, batching, scheduling, and dynamic loading mechanisms for inference workloads.
  • Orchestrate federated distributed inference infrastructure across nodes to balance load and handle communication overhead.

Requirements

  • 10+ years of engineering experience, with 5+ years in infrastructure, platform, or backend systems.
  • Deep expertise in Kubernetes internals, including networking, scheduling, and controller patterns.
  • Strong proficiency in Python or Go for building production-quality, observable systems.
  • Experience operating across multiple cloud providers (AWS, GCP, Azure) or hybrid environments.
  • Strong experience with Helm, container orchestration, and CI/CD automation.
  • Proficiency with IaC tools (Terraform, Pulumi) and GitOps workflows.

Nice to have

  • Familiarity with Cilium, Kyverno, KEDA, Gateway API, or OPA.
  • Experience building and running multi-tenant SaaS platforms.
  • Exposure to on-prem delivery models or regulated environments.
  • Experience with GPU infrastructure for training and inference.
  • A track record of driving infrastructure transformation or decomposing legacy systems.

Culture & Benefits

  • Comprehensive Medical, Dental, and Vision Insurance.
  • Flexible Time Off Program and Paid Holidays.
  • Paid Parental Leave.
  • Global Employee Assistance Program (EAP).
  • Culture based on high standards, rigor, and a commitment to "being better than yesterday".
