Data Engineer (AI)
Job description
TL;DR
Data Engineer (AI): Develop systems, processes, and production code for acquiring, preparing, evaluating, and delivering data for AI model training, with an emphasis on scalable pipelines, statistical analysis, and the impact of data on neural networks. The role focuses on investigating data issues, building production-grade tooling, fusing multi-source datasets, and creating metrics to improve model performance at scale.
Location: Palo Alto, CA
Salary: $240,000 - $280,000 USD
Company
Small, highly motivated team building AI systems to understand the universe and advance scientific discovery, with a flat structure emphasizing hands-on contribution, initiative, and strong communication.
What you will do
- Analyze data performance and impact across the model training lifecycle, investigating anomalies and data issues affecting model outcomes.
- Design, build, and improve data cleaning, transformation, and quality-control processes for high-quality training data.
- Research and develop advanced methods to enhance data quality and effectiveness in AI model development.
- Build and maintain production-grade data pipelines, tooling, and systems for ingesting, processing, validating, and delivering data at scale.
- Partner with acquisition, ML, and software teams to identify data needs and high-impact acquisition opportunities.
- Develop metrics, evaluation frameworks, and monitoring to assess data quality's influence on model behavior.
Requirements
- Bachelor’s degree in computer science, data science, physics, mathematics, or another STEM discipline.
- 1+ years of data/software engineering experience (internships count).
- Experience implementing or analyzing language models or neural networks.
Nice to have
- Professional experience in analytics, data science, ML, or data engineering.
- Building/operating production data pipelines for neural networks or large-scale ML.
- Strong experience with Python and the ML/data tooling ecosystem.
- Parquet/columnar storage, Kubernetes, distributed environments.
- Predictive models, ML pipelines (clustering, forecasting, anomaly detection).
- Very large-scale datasets (TB-PB scale), strong statistical intuition.
- Ability to handle dynamic environments and ambiguous problems.
Culture & Benefits
- Flat organizational structure with leadership based on initiative and excellence.
- Emphasis on engineering excellence, curiosity, work ethic, prioritization, and communication.
- Comprehensive medical, vision, dental coverage; 401(k), short/long-term disability, life insurance.
- Various discounts and perks.