Company hidden
5 days ago

Principal Software Engineer (AI)

$296,400
Work format
onsite
Employment type
fulltime
Grade
principal
English
B2
Country
US
Vacancy from Hirify Global, a list of international tech companies

Job description

TL;DR

Principal Software Engineer (AI): Designing and optimizing large-scale, distributed, GPU-accelerated ad-serving and inference platforms, with an emphasis on high performance, low latency, and scalability. The role focuses on building advanced GPU inference frameworks, deep learning model serving, and system-level performance engineering for global ad infrastructure.

Location: Onsite in Redmond, United States

Company

hirify.global is advancing core ad-serving infrastructure powering Bing Search, MSN, Microsoft Start, and Edge shopping experiences at massive global scale.

What you will do

  • Design and lead the development of distributed GPU/CPU inference pipelines that process millions of ad requests per second with ultra-low latency and high reliability.
  • Architect and optimize end-to-end inference infrastructure including model serving, batching, caching, scheduling, and resource orchestration.
  • Profile and optimize performance from CUDA kernels to OS-level scheduling to improve latency and cost efficiency.
  • Own live-site reliability with telemetry, alerting, and fault-tolerance mechanisms for globally distributed systems.
  • Collaborate across teams to drive architecture reviews, enforce engineering excellence, and mentor engineers in performance engineering.

Requirements

  • Must be located onsite in Redmond, United States.
  • Bachelor’s degree in Computer Science or a related field with 8+ years of experience in high-performance distributed systems using C++ (or equivalent experience).
  • Deep expertise in GPU inference frameworks (NVIDIA Triton, CUDA, TensorRT) and custom CUDA kernel development.
  • Strong understanding of model-serving trade-offs, real-time bidding, and large-scale ad ranking systems.
  • Experience with real-time data streaming systems (Kafka, Flink, Spark Streaming) and multi-region deployment.
  • PhD or advanced degree preferred; demonstrated leadership and mentorship in large-scale system performance engineering.

Nice to have

  • Experience with LLM inference optimization and hybrid CPU-GPU orchestration.
  • Industry experience in advertising or search engine backend systems.

Culture & Benefits

  • Competitive salary range from $163,000 to $331,200 depending on location.
  • Opportunity to work on one of the world’s most advanced, mission-critical online serving platforms.
  • Collaborate with world-class engineers and technical leaders.
