Company hidden
Posted 8 hours ago

Staff Software Engineer (AI Systems & Runtimes)

$184,000 - $230,000
Work format
hybrid
Employment type
full-time
Grade
lead
English
B2
Country
US
Job from Hirify Global, a list of international tech companies

Job description

TL;DR

Staff Software Engineer (AI Systems & Runtimes): Lead the architecture and delivery of a cloud-native AI platform that lets data and AI workloads run anywhere, with an emphasis on Kubernetes-native orchestration, scalable inference serving, and secure multi-tenant GPU scheduling. The role focuses on building AI “nervous system” components (CRDs/Operators, AI gateways, RAG pipelines) and on bridging AI research with production-grade runtimes so that product teams can consume AI capabilities seamlessly.

Location: US-California-San Jose

Salary: $184,000 - $230,000 (annual base)

Company

hirify.global builds enterprise data and AI platforms powered by open-source innovation.

What you will do

  • Design and implement scalable enterprise AI application services that wrap AI capabilities for production use.
  • Lead Kubernetes-native deployment of inference servers (e.g., vLLM, Triton) using KServe, KubeRay, or Knative for serverless-style scaling.
  • Build internal tooling, SDKs, and “AI Gateways” to simplify integration of foundation models into product features.
  • Architect RAG pipelines and prompt management services integrating with vector databases and enterprise data sources.
  • Collaborate with UI/UX and Product Management to ensure the AI platform is usable for internal developers.
  • Ensure AI workloads are secure, compliant, multi-tenant, and optimized for GPU resource scheduling within Kubernetes.

Requirements

  • 6+ years of software engineering experience (or equivalent), including 2+ years focused on AI/ML systems.
  • Expert proficiency in Python and strong competence in a systems language such as Go, Rust, or C++.
  • Deep understanding of LLM deployment challenges and runtimes (e.g., vLLM, ONNX, TorchServe, Triton), plus familiarity with quantization techniques (AWQ, GPTQ).
  • Experience building complex workflows with tools like LangChain or LlamaIndex and deploying on Docker/Kubernetes.
  • Ability to drive technical alignment across teams while filtering hype from practical engineering solutions.
  • This role is not eligible for immigration sponsorship.

Culture & Benefits

  • Generous PTO and support for work-life balance with “Unplugged Days”.
  • Flexible WFH policy and hybrid work setup.
  • Mental and physical wellness programs.
  • Phone and internet reimbursement and access to continued career development.
  • Comprehensive benefits and competitive compensation; corporate incentive plan for non-sales roles.

Hiring process

  • Interviews focused on technical depth in AI systems, Kubernetes-native orchestration, and production serving.
  • Evaluation of cross-functional collaboration and ability to drive architecture decisions.
