Company hidden
Updated 5 hours ago

Senior Bioinformatics Data Engineer (Pharma)

Work format
remote (USA only)/hybrid
Employment type
full-time
Grade
senior
English
B2
Country
US
Vacancy from Hirify Global, a list of international tech companies

Job description

TL;DR

Senior Bioinformatics Data Engineer (AWS/Python): Building and maintaining Dagster-orchestrated ingestion pipelines for genomics and clinical data, with an emphasis on dbt transformations and lakehouse architecture. Focus on automating AI-native engineering workflows, optimizing Redshift performance, and ensuring clinical data reproducibility.

Location: United States. Remote work is supported, but candidates within commuting distance of offices are encouraged to work on a hybrid basis.

Company

hirify.global is a consulting firm providing regulatory, clinical, and R&D technology solutions to biotech, medical device, and pharmaceutical organizations.

What you will do

  • Build and maintain Dagster-orchestrated ingestion pipelines for genomics vendors, including IO managers and Iceberg writers.
  • Develop and harden dbt Silver-to-Gold transformations, including real-data test coverage and macro consolidation.
  • Implement clinical data ingestion paths (SDTM and ADaM), reconciliation logic, and subject-dimension routing.
  • Deliver platform infrastructure using FastAPI endpoints, CI/CD pipelines, and Redshift performance tuning.
  • Extract transformation rules from legacy R and PySpark code to reconcile against new platform implementations.
  • Automate repetitive processes into workflows and guardrails to ensure high reproducibility standards.

Requirements

  • 5+ years of professional experience in data engineering with shipped production pipelines on AWS (S3, ECS/Fargate, Redshift).
  • AI-native engineering practice: demonstrated experience building systems around AI coding agents (e.g., Claude Code, Cursor).
  • Strong proficiency in Python, SQL, dbt, and workflow orchestration tools (Dagster, Airflow, or Prefect).
  • Solid understanding of lakehouse architecture patterns and schema design for complex multi-modal datasets.
  • Bachelor's or Master's degree in Computer Science, Data Engineering, Bioinformatics, or a related field.
  • Ability to handle PHI-adjacent clinical data under contractor policies (background check, VPN access).

Nice to have

  • Direct experience with Apache Iceberg, AWS Glue Catalog, or lakehouse table formats.
  • Comfort reading genomic data (VAF, HGVS, VCFs, CNV/fusion semantics).
  • Familiarity with clinical data standards including SDTM, ADaM, and CDISC.
  • Background in pharma, clinical research, or life sciences.
  • Proficiency in R for interoperability with bioinformatics teams.
  • Experience with Docker/ECS and infrastructure-as-code (CloudFormation).

Culture & Benefits

  • Commitment to diversity, equity, and inclusion, providing a safe space for all employees to succeed.
  • Support for remote working with optional hybrid collaboration for those near office locations.
  • Personal review of all applications by the recruitment team without the use of AI screening tools.
  • Guaranteed outcome communication for every applicant.
