Staff Software Engineer (AI Systems & Runtimes)

$184,000 - $230,000

Work format

hybrid

Employment type

full-time

Grade

lead

English

Country

Vacancy from Hirify Global, a list of international tech companies

Job description

Text:

TL;DR

Staff Software Engineer (AI Systems & Runtimes): Lead the architecture and delivery of a cloud-native AI platform that enables data and AI workloads to run anywhere, with an emphasis on Kubernetes-native orchestration, scalable inference serving, and secure multi-tenant GPU scheduling. Focus on building AI “nervous system” components (CRDs/Operators, AI gateways, RAG pipelines) and bridging AI research with production-grade runtimes so product teams can consume AI capabilities seamlessly.

Location: US-California-San Jose

Salary: $184,000 - $230,000 (annual base)

Company

hirify.global builds enterprise data and AI platforms powered by open-source innovation.

What you will do

Design and implement scalable enterprise AI application services that wrap AI capabilities for production use.
Lead Kubernetes-native deployment of inference servers (e.g., vLLM, Triton) using KServe, KubeRay, or Knative for serverless-style scaling.
Build internal tooling, SDKs, and “AI Gateways” to simplify integration of foundation models into product features.
Architect RAG pipelines and prompt management services integrating with vector databases and enterprise data sources.
Collaborate with UI/UX and Product Management to ensure the AI platform is usable for internal developers.
Ensure AI workloads are secure, compliant, multi-tenant, and optimized for GPU resource scheduling within Kubernetes.

Requirements

6+ years of software engineering experience (or equivalent), including 2+ years focused on AI/ML systems.
Expert proficiency in Python and strong competence in a systems language such as Go, Rust, or C++.
Deep understanding of LLM deployment challenges and runtimes (e.g., vLLM, ONNX, TorchServe, Triton), plus familiarity with quantization techniques (AWQ, GPTQ).
Experience building complex workflows with tools like LangChain or LlamaIndex and deploying on Docker/Kubernetes.
Ability to drive technical alignment across teams while filtering hype from practical engineering solutions.
Not eligible for immigration sponsorship.

Culture & Benefits

Generous PTO and support for work-life balance with “Unplugged Days”.
Flexible WFH policy and hybrid work setup.
Mental and physical wellness programs.
Phone and internet reimbursement and access to continued career development.
Comprehensive benefits and competitive compensation; corporate incentive plan for non-sales roles.

Hiring process

Interviews focused on technical depth in AI systems, Kubernetes-native orchestration, and production serving.
Evaluation of cross-functional collaboration and ability to drive architecture decisions.
