Principal Software Engineer (Distributed Systems, AI)
Job description
TL;DR
Principal Software Engineer (Distributed Systems, AI): Design and build a unified inference platform for Ads, ensuring scalability, reliability, and efficiency, with an emphasis on GPU inference and acceleration technologies. Focus on optimizing model inference via batching, quantization, scheduling, memory management, and runtime optimization.
Location: Suzhou, China. Starting January 26, 2026, AI (MAI) employees who live within a 25-mile commute of a non-U.S., country-specific location are expected to work from the office at least four days per week.
Company
The company’s mission is to empower every person and every organization on the planet to achieve more.
What you will do
- Design and build a unified inference platform for Ads, ensuring scalability, reliability, and efficiency.
- Optimize model inference via batching, quantization, scheduling, memory management, runtime optimization, and other performance improvements.
- Develop, optimize, and maintain performance‑critical components for high‑throughput, low‑latency production inference, including GPU‑accelerated paths when applicable.
- Collaborate with algorithm/model teams to co‑design serving‑aware model architectures and optimizations.
- Profile and improve end‑to‑end system performance: concurrency, memory footprint, throughput, and latency.
- Provide senior technical leadership across teams; elevate engineering best practices and influence long‑term technical strategy.
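One of the responsibilities above, high-throughput serving, typically relies on dynamic batching: grouping concurrent requests so the model runs on a batch rather than one request at a time. The sketch below is purely illustrative and not part of any specific platform described here; the names (`MAX_BATCH`, `MAX_WAIT_MS`, `batch_requests`) are assumptions for the example.

```python
import queue
import time

MAX_BATCH = 8      # upper bound on batch size
MAX_WAIT_MS = 5    # max time to wait for more requests before flushing

def batch_requests(request_queue, deadline_ms=MAX_WAIT_MS, max_batch=MAX_BATCH):
    """Collect up to max_batch requests, waiting at most deadline_ms
    after the first request arrives. This trades a small amount of
    latency for much higher throughput on the accelerator."""
    batch = [request_queue.get()]                 # block for the first request
    deadline = time.monotonic() + deadline_ms / 1000.0
    while len(batch) < max_batch:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        try:
            batch.append(request_queue.get(timeout=remaining))
        except queue.Empty:
            break
    return batch
```

In a real serving system the deadline and batch-size knobs are tuned against latency SLOs, and production runtimes (e.g. Triton's dynamic batcher) implement far more sophisticated variants of this loop.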
Requirements
- Bachelor’s Degree in Computer Science or a related technical field AND 6+ years of technical engineering experience coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python; OR equivalent experience.
- 6+ years’ experience building high‑performance, large‑scale distributed systems or ML infrastructure.
- Experience building and optimizing performance‑critical production systems.
- Experience working in Ads, Search, Recommendation systems, or other large‑scale online serving systems.
Nice to have
- Master’s Degree in Computer Science or a related technical field AND 8+ years of technical engineering experience coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python; OR Bachelor’s Degree in the same fields AND 12+ years of such experience; OR equivalent experience.
- Experience with GPU inference runtimes such as TensorRT, ONNX Runtime, Triton, TRT‑LLM, or vLLM.
- Expertise in CUDA kernel development and GPU performance engineering.
- Familiarity with LLM / Transformer inference optimizations, including: sharding, tensor / KV‑cache parallelism, paged attention, continuous batching, quantization (FP8 / AWQ), and hybrid CPU–GPU orchestration.
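Quantization, one of the optimizations listed above, reduces memory footprint and bandwidth by storing weights in a low-precision integer format. A minimal sketch of per-tensor symmetric INT8 quantization follows; the function names are hypothetical and the example is deliberately simplified (real FP8/AWQ schemes use per-channel or group-wise scales and calibration).

```python
def quantize_int8(weights):
    """Map float weights to the int8 range [-127, 127] with a single
    symmetric scale factor (per-tensor quantization)."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float weights from the int8 values."""
    return [x * scale for x in q]
```

The round-trip error of each weight is bounded by the scale factor, which is why quantization preserves accuracy well when the weight distribution is not dominated by outliers.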
Culture & Benefits
- Employees come together with a growth mindset, innovate to empower others, and collaborate to realize shared goals.
- Build on values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.