Fullstack Software Engineer (Reinforcement Learning, AI)
Job description
TL;DR
Fullstack Software Engineer (Reinforcement Learning): Build the platforms, tools, and interfaces that power RL environment creation, data collection, and training observability, with an emphasis on scalable systems for high-quality AI training data. You will own product surfaces end-to-end, from backend services and APIs to the web UIs used by researchers and data labelers.
Location: San Francisco, CA or New York City, NY (hybrid policy: at least 25% of time in the office)
Salary: $300,000 - $405,000 USD
Company
The company's mission is to create reliable, interpretable, and steerable AI systems like Claude.
What you will do
- Build and extend web platforms for RL environment creation, management, and quality review, including configuration, versioning, and validation workflows
- Develop vendor-facing interfaces and tooling for external partners to create and iterate on training environments
- Design platforms for scalable human data collection with labeling workflows, quality assurance, and feedback mechanisms
- Create evaluation dashboards and observability UIs for real-time insights into environment quality and training health
- Build backend services and APIs connecting environment tools, data systems, and RL training infrastructure
- Expand code data generation pipelines producing diverse programming tasks with robust reward signals
Requirements
- Strong software engineering fundamentals and fullstack experience from database to frontend
- Proficient in Python and modern web stack (React, TypeScript or similar)
- Track record of shipping systems solving hard problems with high impact
- High agency, clear communication, and ability to thrive in a fast-moving environment
- Location: Must spend at least 25% of time in the San Francisco or New York office
- Bachelor’s degree or equivalent in relevant field
Nice to have
- Experience building data collection, labeling, or annotation platforms at scale
- Multi-tenant platforms with role-based access and vendor management
- Cloud infrastructure (GCP/AWS), Docker, CI/CD, async Python, LLM workflows
- Working with external vendors on technical integrations
Culture & Benefits
- Competitive compensation, equity, generous vacation and parental leave
- Flexible working hours and visa sponsorship (supported on a best-effort basis)
- Collaborative team focused on high-impact AI research
- Location-based hybrid policy with office collaboration