Company hidden
Posted 4 days ago

Junior Data ML Engineer (AI)

1 000
Work format: hybrid
Employment type: full-time
Grade: junior
English: B2
Country: Netherlands, Switzerland

Job description

TL;DR

Junior Data ML Engineer (AI): Designing and implementing data‑sourcing, synthetic‑generation, and curation pipelines for training multimodal large language models, with an emphasis on managing massive scale, ensuring consistent quality, and tightly controlling data relevance and integrity. The focus is on building high‑throughput data pipelines that ingest multimodal data at petabyte scale, generate large volumes of synthetic data, and filter and rate content by topic, quality, and policy compliance.

Location: Zurich or Amsterdam, with the expectation of spending half of your time at the office.

Salary: not specified; the posting lists a EUR 1000 learning and development budget

Company

hirify.global.ai is building a next-generation agentic clinical AI assistant that helps clinicians reason across patient data, guidelines, and diagnostics.

What you will do

  • Design and implement data‑sourcing, synthetic‑generation, and curation pipelines.
  • Build high‑throughput data pipelines that ingest multimodal data at petabyte scale.
  • Generate large volumes of synthetic data.
  • Filter and rate content by topic, quality, and policy compliance.
  • Work closely with ML researchers and help steer the development of our state‑of‑the‑art foundation models.

Requirements

  • Strong programming skills in Python and familiarity with distributed frameworks such as Ray or Spark.
  • Experience contributing to ML research and associated data challenges, such as data cleaning, transformation, and validation.
  • Exposure to synthetic‑data generation workflows or interest in working with LLM‑related data pipelines.
  • Understanding of lakehouse paradigms (Delta, Iceberg) and columnar formats (Parquet, ORC).
  • Experience with core data‑processing primitives (hashing, deduplication, chunking, etc.) and the associated scalability and performance trade‑offs.
  • Strong communication skills and the ability to present experimental results and technical concepts clearly and concisely.

Nice to have

  • Experience with workflow orchestration tools such as Dagster or similar.
  • Exposure to data‑quality and validation frameworks and to monitoring and observability tooling.
  • Strong grasp of machine‑learning fundamentals (model architectures, training paradigms, evaluation metrics) to collaborate deeply with researchers and guide data‑driven choices.

Culture & Benefits

  • An attractive and competitive salary, a good pension plan, and 25 vacation days per year.
  • Great offsites and team events to strengthen the team and celebrate successes together.
  • A EUR 1000 learning and development budget to help you grow.
  • Autonomy to do your work the way that works best for you, whether you have a kid or prefer early mornings.
  • An annual commuting subsidy.

Hiring process

  • Screening call: A short conversation to align on your motivation, career goals, and initial fit for the role.
  • Codility test: An online coding assessment focused on core programming skills, problem-solving ability, and fundamental data structures and algorithms.
  • Onsite technical interview: An in-depth discussion of your problem-solving approach through a technical challenge, case study, or role-specific scenario, plus conversations with team members to assess collaboration dynamics, team fit, and day-to-day fit.
