Company hidden
7 days ago

Release Engineer (AI)

Work format
onsite
Job type
fulltime
English
B2
Country
South Korea

Job description

TL;DR

Release Engineer (AI/DevOps): Design and manage CI/CD pipelines and hardware-in-the-loop test infrastructure for NPU-based AI accelerators, with an emphasis on version compatibility and build automation. The role focuses on automating end-to-end validation, tracking performance regressions, and shortening the developer feedback loop for the compiler and ML frameworks.

Location: Onsite in Seongnam, South Korea

Company

The company (name hidden in this listing) is a high-performance AI chip company developing NPUs for next-generation AI workloads.

What you will do

  • Own integration CI/CD and manage version-compatibility across PyTorch, vLLM, and RBLN Compiler.
  • Build hardware-in-the-loop (HIL) test infrastructure using real NPU boards to automate device allocation and recovery.
  • Automate end-to-end validation for numerical correctness and output accuracy across compile, deploy, and inference stages.
  • Benchmark core metrics such as latency, throughput, and memory usage, and track and alert on performance regressions.
  • Design and manage multi-version build matrices, wheel packaging, and artifact registries.
  • Improve developer experience through faster CI, parallelization, and better failure reporting.

Requirements

  • Hands-on experience designing and operating CI/CD systems (Buildkite, GitHub Actions, Jenkins, etc.).
  • Strong proficiency in Python, shell scripting, and Linux.
  • Experience building container-based build/test environments using Docker.
  • Experience operating self-hosted runners or custom build infrastructure.
  • Familiarity with build systems such as CMake or Bazel.
  • Working understanding of ML inference and PyTorch workflows.

Nice to have

  • Experience with HIL test infrastructure on accelerators (GPU, NPU, TPU).
  • Understanding of vLLM or LLM serving stacks (inference engines, KV cache, batching).
  • CI experience on compiler or toolchain projects involving large builds and test matrices.
  • Knowledge of PyTorch backend extensions or HuggingFace optimum-style integration.
  • Experience with Kubernetes-based runner orchestration or large-scale device-pool management.

Hiring process

  • Document review followed by online and on-site interviews, including a technical assignment.
  • Culture-fit interview and final compensation negotiation.
