Назад
Company hidden
17 дней назад

Web Scraping Specialist (AI)

75 000 - 150 000$
Формат работы
remote (только USA)
Тип работы
fulltime
Английский
b2
Вакансия из списка Hirify.GlobalВакансия из Hirify Global, списка международных tech-компаний
Для мэтча и отклика нужен Plus

Мэтч & Сопровод

Для мэтча с этой вакансией нужен Plus

Описание вакансии

Текст:
/

TL;DR

Web Scraping Specialist (AI): Building infrastructure that delivers massive amounts of web data for training advanced AI models with an accent on high-performance code, data retrieval from complex sources, and scalable pipelines. Focus on optimizing scraping processes, ensuring data quality, and managing storage for billions of data points including videos, transcripts, and audio.

Location: Remote with a 6 hour overlap with EST

Compensation: $75K - $150K

Company

Specialized technical team operating a massive distributed crawler for ingesting, segmenting, and annotating web data at scale.

What you will do

  • Write, test, and refine high-performance code to extract data from various online sources.
  • Manage complex data retrieval tasks including pagination and dynamic AJAX content.
  • Clean and format extracted data to meet rigorous quality standards.
  • Store and manage scraped data in databases, optimizing for speed and integrity.
  • Monitor scraping processes and infrastructure to ensure continuous stable data flow.

Requirements

  • Demonstrated ability to extract data from complex websites with minimal supervision and a portfolio of projects.
  • Advanced skills in Python or JavaScript with BeautifulSoup, Scrapy, or Selenium.
  • Strong knowledge of asynchronous programming, multithreading, and distributed scraping architectures.
  • In-depth knowledge of HTML, CSS, JavaScript, and DOM.
  • Experience with NoSQL databases like MongoDB or Cassandra for efficient storage.
  • Experience deploying large-scale scraping jobs on AWS, Google Cloud, or Azure.

Nice to have

  • Ability to apply machine learning for data cleaning, categorization, or analysis.
  • Active participation in relevant open-source projects.

Culture & Benefits

  • Impactful work at the forefront of AI development and knowledge graph creation.
  • High-output culture prioritizing low ego, technical autonomy, and rapid execution.
  • Remote flexibility with comprehensive benefits and equity package.

Будьте осторожны: если работодатель просит войти в их систему, используя iCloud/Google, прислать код/пароль, запустить код/ПО, не делайте этого - это мошенники. Обязательно жмите "Пожаловаться" или пишите в поддержку. Подробнее в гайде →