Technical Lead Manager, ML Platform (AI)
Job description
TL;DR
Technical Lead Manager, ML Platform (AI/ML Infrastructure): Lead the development and scaling of core infrastructure for machine learning and self-hosted LLM applications, with an emphasis on low-latency model serving, streaming feature ingestion, and distributed training. The focus is on building systems that keep ML dependable and fast at scale, including high-throughput GPU inference and real-time feature pipelines.
Location: Must live within commuting distance of New York, Seattle, Los Angeles, or San Francisco hubs
Salary: $255,000 – $345,000 per year + equity
Company
The company is the largest livestream shopping platform in North America and Europe, enabling users to buy, sell, and discover items across hundreds of categories.
What you will do
- Own the infrastructure powering AI and ML models for growth, recommendations, trust and safety, and fraud detection.
- Design and scale inference infrastructure for large models with low latency and high throughput.
- Oversee real-time feature pipelines to ensure single-second feedback from behavioral signals and high training fidelity.
- Lead the development of distributed training and inference pipelines leveraging GPUs and data parallelism.
- Build abstractions, APIs, and developer tools to empower scientists to iterate faster on near-realtime features.
- Provide technical leadership, perform architectural reviews, and contribute to the codebase.
Requirements
- Must be based in the US and live within commuting distance of the NY, Seattle, LA, or SF hubs.
- 1+ years of experience as a Technical Lead Manager (TLM) developing production ML systems at consumer scale.
- 5+ years of software engineering experience building production systems for high loads.
- Professional experience with Python.
- Experience with operational databases such as PostgreSQL, DynamoDB, Elasticsearch, and Redis.
- Proficiency with ML tools like MLflow, LitServe, TorchServe, or Triton.
Nice to have
- Familiarity with AWS services: SageMaker, Lambda, Kinesis, S3, EC2, EKS/ECS.
- Experience with Apache Kafka and Flink.
- Knowledge of monitoring tools like Datadog and Grafana.
Culture & Benefits
- Comprehensive health insurance (Medical, Dental, Vision).
- 401k offering with employer match up to 4% for US employees.
- Home office setup allowance and monthly stipends for cell phone and internet.
- Generous holiday policy and 16 weeks of paid parental leave.
- Wellness and childcare annual allowances.
- Monthly budget to use the platform as a buyer and seller.