Posted 6 days ago

RL Environments Engineer

$15,600 - $27,700
Work format: remote
Job type: project
English: C1

Job description

RL Environments Engineer

Company: Preference Model
Employment type: #contract
Location: #remote
Salary: 15,600 USD to 27,700 USD

Detailed job description:

We’re hiring RL Environments Engineers to design and build MLE/SWE environments that deliver high-quality, diverse tasks with minimal supervision. You will target a specific language model, meet a defined difficulty distribution, and deliver about one task every 8-10 hours. This is a remote contractor role; ≥4 hours of overlap with PST and advanced English (C1/C2) are required.

Responsibilities
- Design and build MLE/SWE environments and diverse tasks
- Target a specified language model and satisfy the required difficulty distribution
- Deliver ~1 task per 3–5 hours once onboarded
- Edit tasks within 24 hours based on customer feedback
- Onboard quickly and start delivering on day one with minimal supervision

Requirements
- Strong Python (engineering-quality, not notebook-only)
- Hands-on LLM/GenAI work in production: you’ve shipped and operated real systems (not “wrapped an API and called it AI”)
- Experience designing environments/tasks for RL and/or evaluations
- Strong product/engineering ownership: comfortable building, fixing, and scaling end-to-end pipelines
- Docker + production mindset (debugging, reliability, iteration speed)
- ≥4 hours PST overlap and advanced English (C1/C2) for specs, reviews, and feedback
- Ability to meet throughput expectations and respond quickly to feedback

Nice to have
- Experience in high-stakes or regulated domains (e.g., healthcare, finance, fraud/risk, safety-critical systems)
- ML systems experience: CI/CD, monitoring, evaluation harnesses, MLOps, scalable pipelines
- Systems depth: C++/Rust/Scala/Java, performance/infra optimization, distributed systems
- Exposure to RL / bandits / agentic systems (not required, but a strong signal)

What we offer
- Potential path to FTE and relocation to the Bay Area if performance and mutual fit align

Additional information
A take-home assignment is required as the main part of the evaluation; details will be provided upon application. Time spent on it will be compensated if you are offered a job. Not a fit if: you’re primarily a prompt engineer without strong ML/engineering foundations; you’re a research-only / academic-only profile with lit

Contacts
- Telegram:

Tech stack: #python #docker #c++ #rust #scala #java

The vacancy text is reproduced from the original posting.
