Company hidden
Updated 10 days ago

Senior Software Engineer (AI)

Work format: onsite
Employment type: full-time
Grade: senior
English: B2
Country: China
Listed in Hirify Global, a list of international tech companies

Job description

TL;DR

Senior Software Engineer (GPU Inference Optimization): Build and optimize high-performance software for large-scale GPU inference of language models, with an emphasis on low-level performance tuning and integration with novel AI hardware. The role focuses on designing robust software in C/C++ and Python, identifying bottlenecks, and implementing kernel-level improvements for AI models in search advertising.

Location: Onsite in Beijing, China

Company

hirify.global focuses on building an online advertising ecosystem and intelligent systems using web-scale data to drive user satisfaction and advertiser ROI.

What you will do

  • Design, develop, and maintain high-performance software for GPU inference of language models.
  • Optimize model inference and training pipelines for speed, throughput, and memory efficiency.
  • Collaborate with platform teams to integrate and tune solutions on emerging accelerator stacks.
  • Profile workloads, identify bottlenecks, and implement kernel-level and system-level performance improvements.
  • Partner with stakeholders to translate requirements into scalable performance features.
  • Validate performance, stability, and correctness through benchmarking and testing.

Requirements

  • 4+ years of technical engineering experience with coding in C, C++, Python, CUDA, or ROCm.
  • 3+ years of practical experience optimizing GPU performance for applications.
  • Practical experience writing new GPU kernels.
  • Strong cross-team collaboration skills and a desire to work within a team of researchers and developers.
  • Bachelor’s Degree in Computer Science or related technical field.

Nice to have

  • Master’s Degree in Computer Science or related technical field with 2+ years of experience.
  • Experience in low-level performance analysis using GPU profiling tools such as NVIDIA Visual Profiler or Nsight Compute.
  • Familiarity with inference optimization frameworks such as TensorRT-LLM, SGLang, or vLLM.
  • Exposure to deep neural network inference and experience with PyTorch, TensorFlow, or ONNX Runtime.

Culture & Benefits

  • Committed to cultivating an inclusive work environment.
  • Values of respect, integrity, and accountability to create a culture of inclusion.
  • Growth mindset, innovation, and collaboration to empower others.
  • Microsoft is an equal opportunity employer.
  • Assistance with religious accommodations and/or reasonable accommodation due to a disability.

Be careful: if an employer asks you to log in to their system using iCloud/Google, send a code or password, or run code or software, do not do it. These are scammers. Be sure to click "Report" or contact support.

The job posting text is taken from the source without changes.
