Company hidden
14 hours ago

Engineer, Supercomputing & Distributed Systems (AI)

Work format
onsite
Employment type
fulltime
Grade
middle/senior
English
b2
Country
US

Job description

TL;DR

Engineer, Supercomputing & Distributed Systems (AI): Build and operate the infrastructure for Krea's research and inference, including distributed training, Kubernetes GPU clusters, and petabyte-scale data pipelines, with an emphasis on custom distributed datastores and job orchestration systems. The role focuses on scaling workloads and research across clusters in multiple datacenters and on building fault-tolerance systems for large-scale pretraining.

Location: On-site in San Francisco

Company

hirify.global is building next-generation AI creative tools, dedicated to making AI intuitive and controllable for creatives.

What you will do

  • Design multi-stage pipelines that turn petabytes of raw data into clean, annotated datasets.
  • Manage distributed training and inference on 1000+ GPU Kubernetes clusters.
  • Profile and optimize dataloaders streaming thousands of images per second.
  • Customize and train models to filter billions of images.
  • Build fault-tolerance systems for large-scale pretraining.

Requirements

  • Experience with Python, PyArrow, DuckDB, SQL, PyTorch, Pandas, NumPy.
  • Experience with Kubernetes.
  • Fundamental knowledge of containerization, operating systems, file systems, and networking.
  • Intuition for distributed systems and a strong mental model of how systems interact and behave under different conditions.
