Company hidden
1 day ago

Senior Software Engineer (AI)

$119,800 – $234,700
Work format
onsite
Employment type
fulltime
Grade
senior
English
b2
Country
US
Vacancy from Hirify Global, a list of international tech companies

Job description

TL;DR

Senior Software Engineer (AI): Design and optimize high-performance serving systems and GPU inference frameworks to drive measurable latency improvements and cost efficiency across hirify.global’s ad ecosystem. The work spans CUDA kernel tuning and NUMA-aware threading through large-scale distributed orchestration and model deployment for deep learning and LLM workloads.

Location: Mountain View, United States

Salary: USD $119,800 – $234,700 per year (U.S.) or USD $158,400 – $258,000 per year (San Francisco Bay area and New York City metropolitan area)

Company

hirify.global is an equal opportunity employer.

What you will do

  • Design and lead the development of large-scale, distributed online serving systems.
  • Architect and optimize end-to-end inference infrastructure.
  • Profile and optimize performance across the full stack.
  • Own live-site reliability.
  • Collaborate and mentor across teams.

Requirements

  • Bachelor’s Degree in Computer Science or related technical field AND 4+ years technical engineering experience developing high-performance, distributed systems in C++.
  • Deep expertise in GPU inference frameworks such as NVIDIA Triton Inference Server, CUDA, and TensorRT.
  • Strong understanding of model-serving trade-offs.
  • Proven ability to profile and optimize GPU and system workloads.
  • Expertise in low-level system and OS internals.

Nice to have

  • Master’s Degree in Computer Science or related technical field AND 6+ years technical engineering experience developing high-performance, distributed systems in C++.
  • Hands-on experience with real-time data streaming systems (Kafka, Flink, Spark Streaming), feature-store integration, and multi-region deployment for low-latency, globally distributed services.
  • Familiarity with LLM inference optimization.
  • Demonstrated success operating large-scale systems with SLA-based capacity forecasting, autoscaling, and performance telemetry.

Culture & Benefits

  • Equal opportunity employer.
