Company hidden
7 hours ago

Data Engineer (Machine Learning)

$170,000 - $240,000
Work format
onsite
Employment type
fulltime
Grade
senior
English
b2
Country
US
Job from Hirify Global, a list of international tech companies

Job description

TL;DR

Data Engineer (Machine Learning): Building and maintaining production data pipelines for multimodal AI models, with an emphasis on dataset versioning, data quality frameworks, and infrastructure for model evaluation. Focus on designing scalable ETL/ELT systems for voice and sensor data to accelerate the ML development lifecycle.

Location: San Francisco, USA

Compensation: $170K - $240K

Company

hirify.global is an AI company designing a new kind of computer focused on integrating lifelike voice agents into daily life.

What you will do

  • Design and build production data pipelines for conversational, voice, and multimodal data used in model training.
  • Collaborate with ML engineers to define data requirements and deliver high-quality datasets for experiments.
  • Implement infrastructure for dataset versioning, lineage tracking, and reproducibility.
  • Develop data quality frameworks including schema validation, drift detection, and coverage monitoring.
  • Optimize large-scale data processing for cost and performance on cloud infrastructure.
  • Build internal tooling to enable ML researchers to discover and request data independently.

Requirements

  • 5+ years in data engineering with specific experience supporting ML or AI teams.
  • Strong proficiency in SQL and Python.
  • Experience building scalable ETL/ELT pipelines using orchestration tools like Airflow, Dagster, or Prefect.
  • Hands-on experience with ML data workflows, including labeling and versioning.
  • Comfort working with unstructured and semi-structured data (audio, text, JSON logs).
  • Must be based in San Francisco.

Nice to have

  • Experience with vector databases, embedding storage, or feature stores.
  • Knowledge of telemetry and sensor data from hardware or embedded systems.
  • Proficiency with distributed compute frameworks like Ray or Spark.
  • Experience with Kubernetes (GKE or EKS).
  • Familiarity with data privacy frameworks for conversational and voice data.

Culture & Benefits

  • 401(k) with up to 3.5% employer match.
  • 100% employer-paid health, vision, and dental benefits for employees and dependents.
  • Unlimited PTO and sick time.
  • Medical FSA with employer matching up to $1,650/year.
  • Competitive stock options to share in the company's success.
