Forward Deployed Engineer, RL Environments (AI)
Job description
TL;DR
Forward Deployed Engineer, RL Environments (AI): Own the design, development, and operationalization of reinforcement learning environments, with an emphasis on building sandboxed, reproducible execution environments that AI agents interact with during training and evaluation. The role focuses on writing production-quality infrastructure code, integrating with open-source RL tooling, and collaborating with the data operations team.
Location: Join our dedicated tech hubs in San Francisco or Wrocław, Poland. Hybrid model with 2 days per week in office, combining collaboration and flexibility.
Salary: $140,000-$200,000 USD
Company
The company is building the critical infrastructure that powers breakthrough AI models at leading research labs and enterprises.
What you will do
- Design, build, and maintain sandboxed RL environments for agentic AI training.
- Develop reproducible, containerized execution environments that support deterministic task rollouts and reward signal collection.
- Integrate with and extend open-source agentic tooling and custom CLI/API harnesses to enable multi-step agent interaction.
- Build instrumentation and observability layers so training runs and human annotation sessions produce clean, auditable data.
- Collaborate with data operations to design task curricula and evaluation protocols that stress-test model capabilities across environment types.
- Own environment deployment and reliability.
Requirements
- 2+ years of professional software engineering experience, with strong fundamentals in Python and at least one systems-level language (Go, Rust, C++).
- Demonstrated experience with containerization and sandboxing (Docker, Podman, Firecracker, or similar) in production or near-production contexts.
- Familiarity with RL concepts: MDPs, reward shaping, episode structure, observation/action spaces.
- Experience building or maintaining developer tooling, CLI tools, or infrastructure automation.
- Comfort working with browser automation frameworks or terminal interaction tooling.
- Strong debugging instincts.
Nice to have
- Direct experience building or contributing to RL environments (Gymnasium/Gym, PettingZoo, or custom environment implementations).
- Experience with agentic AI evaluation frameworks (SWE-bench, WebArena, OSWorld, TerminalBench, or similar).
- Familiarity with GCP or AWS infrastructure (Compute Engine, ECS/EKS, Cloud Build).
- Prior work at an AI data company, ML platform company, or AI research lab.
- Contributions to open-source projects in the RL, agents, or dev-tools space.
Culture & Benefits
- Fast-paced and high-intensity environment, perfect for ambitious individuals who thrive on ownership and quick decision-making.
- Career advancement opportunities directly tied to your impact.
- Be part of building the foundation for humanity's most transformative technology.