Web Scraping Specialist (AI)
Job description
TL;DR
Web Scraping Specialist (AI): Building infrastructure that delivers massive amounts of web data for training advanced AI models, with an emphasis on high-performance code, data retrieval from complex sources, and scalable pipelines. Focus on optimizing scraping processes, ensuring data quality, and managing storage for billions of data points, including videos, transcripts, and audio.
Location: Remote with a 6-hour overlap with EST
Compensation: $75K - $150K
Company
Specialized technical team operating a massive distributed crawler for ingesting, segmenting, and annotating web data at scale.
What you will do
- Write, test, and refine high-performance code to extract data from various online sources.
- Manage complex data retrieval tasks including pagination and dynamic AJAX content.
- Clean and format extracted data to meet rigorous quality standards.
- Store and manage scraped data in databases, optimizing for speed and integrity.
- Monitor scraping processes and infrastructure to ensure a continuous, stable data flow.
Requirements
- Demonstrated ability to extract data from complex websites with minimal supervision, backed by a portfolio of projects.
- Advanced skills in Python or JavaScript with BeautifulSoup, Scrapy, or Selenium.
- Strong knowledge of asynchronous programming, multithreading, and distributed scraping architectures.
- In-depth knowledge of HTML, CSS, JavaScript, and DOM.
- Experience with NoSQL databases like MongoDB or Cassandra for efficient storage.
- Experience deploying large-scale scraping jobs on AWS, Google Cloud, or Azure.
Nice to have
- Ability to apply machine learning for data cleaning, categorization, or analysis.
- Active participation in relevant open-source projects.
Culture & Benefits
- Impactful work at the forefront of AI development and knowledge graph creation.
- High-output culture prioritizing low ego, technical autonomy, and rapid execution.
- Remote flexibility with comprehensive benefits and equity package.