Bioinformatics Data Engineer (RNA Resources)
Job description
TL;DR
Bioinformatics Data Engineer (RNA Resources): Build and maintain data pipelines for Rfam and RNAcentral, with an emphasis on efficient data processing, storage, and retrieval at scale. Focus on modernising curation workflows, implementing human-in-the-loop, AI-assisted agentic curation, and developing LLM pipelines for literature summarisation and curation.
Location: Hinxton, Cambridgeshire
Salary: £3,303 per month (after tax), excluding pension and insurance contributions
Company
develops and operates large biological data resources used by the global research community.
What you will do
- Run, maintain, and optimise production data pipelines for Rfam and RNAcentral, improving performance and scalability.
- Analyse existing data curation and data production pipelines to identify improvement, optimisation, and scaling opportunities.
- Modernise and containerise Rfam curation pipelines, including human-in-the-loop, AI-assisted agentic curation.
- Develop and scale LLM pipelines for RNAcentral literature summarisation and curation.
- Build scalable workflows for ncRNA annotation in genomes and support data releases.
- Document pipelines and processes; present and gather feedback through consortium and scientific advisory activities.
Requirements
- Master’s level (or equivalent) in a computational, biological, or related scientific discipline.
- Proficiency in Python and other relevant languages for bioinformatics tool development.
- Experience with relational databases (PostgreSQL, MySQL) and strong SQL knowledge, including performance tuning and query optimisation.
- Proven experience building and maintaining production bioinformatics pipelines using workflow management systems such as Nextflow or Snakemake.
- Experience building applications with LLMs and other AI technologies.
- Comfortable with Git/GitHub, Unix, and Bash; strong communication skills.
Nice to have
- Knowledge of RNA biology and/or practical experience with Rfam, Infernal, R-scape, and secondary structure prediction tools.
- Experience with high-performance computing environments (e.g., Slurm) and data migration planning (downtime, consistency verification, rollback).
- Experience with Docker/Singularity, Kubernetes, and cloud infrastructure platforms (e.g., OpenStack).
- Experience with AI workflow libraries such as LangChain and LangGraph.
- Experience with the Rust programming language.
Culture & Benefits
- Hybrid working: work 2 days per week from the office in Hinxton (currently Monday and Tuesday), with flexibility to come on site more often.
- Private medical insurance for you and your immediate family.
- Generous time off: 30 days annual leave plus public holidays.
- Relocation package available (including installation grant if required).
- Campus benefits: free shuttle bus, on-site library, subsidised gym and cafeteria, and sports/social clubs.
- Benefits for non-UK residents, including visa exemption and travel/education-related support.
Hiring process
- Remote introductory interviews start in early July, followed by remote panel interviews in mid-July.
- Submit a CV and a tailored cover letter; applications without both documents are not considered.