Company hidden
2 hours ago

Web Crawling Engineer

Work format
hybrid
Employment type
full-time
Level
middle/senior
English
B2
Country
France, UK, Spain, Netherlands, Italy, Germany, Belgium
A vacancy from Hirify.Global, a list of international tech companies

Job description


TL;DR

Web Crawling Engineer (Backend Dev): Developing and maintaining scalable distributed web crawlers and data extraction systems using Go and related technologies, with an emphasis on large-scale data processing, distributed job queues, and web-scraping optimization. The focus is on designing efficient algorithms, ensuring data quality, and collaborating across teams to integrate diverse web data sources.

Location: Primarily based in European offices (Paris, London) with hybrid work; remote possible from France, UK, Germany, Belgium, Netherlands, Spain, and Italy; mandatory visits to Paris HQ for onboarding and monthly collaboration

Company

hirify.global develops high-performance, open-source AI models and solutions aimed at democratizing AI for enterprise and daily use.

What you will do

  • Develop and maintain web crawlers using Go and headless browsing tools for large-scale data extraction
  • Collaborate with teams to scrape and integrate data from APIs and web pages
  • Create efficient parsing patterns using tokenizers, regex, XPaths, and CSS selectors
  • Design and manage distributed job queues with Redis, Aerospike, and Kubernetes
  • Monitor and ensure data quality and integrity throughout crawling and indexing
  • Continuously optimize web crawling infrastructure for efficiency and adaptability
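As a rough illustration of the parsing work described above, here is a minimal stdlib Go sketch that extracts link targets from raw HTML with a regex. The function name and sample markup are hypothetical; the posting does not specify the actual codebase or tooling, and a production crawler would more likely use a proper HTML tokenizer or CSS-selector library.

```go
package main

import (
	"fmt"
	"regexp"
)

// extractLinks pulls href values out of raw HTML using a regex.
// Adequate for a quick sketch; real crawlers should prefer an
// HTML tokenizer, since regexes break on unquoted or malformed attributes.
func extractLinks(html string) []string {
	re := regexp.MustCompile(`href="([^"]+)"`)
	var links []string
	for _, m := range re.FindAllStringSubmatch(html, -1) {
		links = append(links, m[1]) // m[1] is the captured href value
	}
	return links
}

func main() {
	page := `<a href="https://example.com/a">A</a> <a href="https://example.com/b">B</a>`
	fmt.Println(extractLinks(page))
	// prints: [https://example.com/a https://example.com/b]
}
```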

Requirements

  • Location: Must reside in or be open to relocating to Paris or London; remote candidates only from specified European countries
  • Proficiency in Go, Rust, or Zig for building scalable web crawlers
  • Strong understanding of web protocols (TCP, UDP, TLS, HTTP) and web technologies (HTML, CSS, JavaScript)
  • Experience with cloud platforms (AWS, GCP), containerization (Docker), and orchestration (Kubernetes)
  • Knowledge of distributed systems and big data processing
  • English proficiency at least B2 level

Nice to have

  • Experience with web archiving projects and open-source archiving tools
  • Applying machine learning to improve crawling efficiency or accuracy
  • Experience with low-level network programming or userspace TCP/IP stacks

Culture & Benefits

  • Competitive salary and equity
  • Health insurance and private pension plan
  • Transportation, sport, and meal allowances
  • Generous parental leave policy
  • Visa sponsorship

Hiring process

  • Introduction call (35 min)
  • Hiring Manager interview (30 min)
  • Live-coding interview (45 min)
  • System design interview (45 min)
  • Optional deep dive interview (60 min)
  • Culture-fit discussion (30 min)
  • Reference checks
