Company hidden
Posted 3 months ago

Data Ingestion Engineer (AI)

Work format: on-site
Employment type: full-time
English: B2
Country: UK/US

Job description

TL;DR

Data Ingestion Engineer (AI): Build and operate large-scale ingestion systems that transform web data into high-quality training corpora for frontier AI models, with an emphasis on distributed systems, data extraction, and pipeline scalability. The role focuses on experimenting with crawling strategies, optimizing dataset delivery, and closing the feedback loop between data collection and model performance.

Location: On-site in San Francisco, London, or New York

Company

hirify.global is an AI startup dedicated to building open-weight foundational models by leveraging talent from top research institutions.

What you will do

  • Build and operate large-scale data ingestion systems, including web crawling, extraction, and dataset versioning
  • Develop specialized crawlers to acquire high-priority data sources for training
  • Analyze ingested data to identify quality gaps, redundancy, and performance bottlenecks
  • Collaborate with researchers to evaluate how extraction methods impact model capabilities
  • Scale ingestion pipelines to handle multi-TB to PB-scale data efficiently
  • Debug production issues and maintain robust, observable ingestion infrastructure

Requirements

  • Experience building web crawling or large-scale data acquisition systems using Ray, Beam, or Spark
  • Familiarity with LLM training processes and an intuition for high-quality data
  • Ability to work with PB-scale datasets and ensure system observability and maintainability
  • Strong experimental mindset to iterate on system improvements based on data performance
  • Excellent communication skills to articulate system behavior and architectural tradeoffs
  • Must be able to work on-site in San Francisco, London, or New York

Culture & Benefits

  • Competitive salary and equity packages
  • Comprehensive medical, dental, and vision insurance
  • Fully paid parental leave and family planning support
  • Lunch and dinner provided daily
  • Relocation support for eligible candidates
  • Regular team off-sites and celebrations
