Research Engineer (Agentic AI)
Job description
TL;DR
Research Engineer (Agentic AI): Build and optimize task configurations and environments for evaluation datasets on the CUA framework, with an emphasis on safety red-teaming, business tasks, and long-horizon agentic tasks. The role focuses on delivering custom evaluation pipelines and improving the evaluation harness.
Location: Fully remote-friendly (Asia and North America). Preference for candidates who can align with Pacific Time (UTC-7/8) or China/Singapore Time (UTC+8). Offices available in San Francisco and Singapore.
Company
(YC W25) develops an agentic RL platform and evaluation frameworks for frontier AI agents, serving frontier labs and Fortune 500 companies.
What you will do
- Build environments for CUA evaluation datasets, including safety red-teaming and long-horizon agentic tasks.
- Deliver custom CUA datasets and evaluation pipelines based on client requirements.
- Contribute to the development and improvement of the evaluation harness.
Requirements
- Proficiency in Python, Docker, and Linux environments.
- Experience with React for frontend development.
- Production-level software development experience.
- Ability to align with Pacific Time (UTC-7/8) or China/Singapore Time (UTC+8).
Nice to have
- Startup experience in early-stage technology companies.
- Hands-on experience with LLM evaluation frameworks (e.g., EleutherAI's LM Evaluation Harness, Inspect).
- Experience building custom evaluation pipelines or high-quality research datasets.
- Background in competitive programming or multimodal AI evaluation.
- Understanding of AI safety and alignment considerations.
Culture & Benefits
- Unlimited access to API credits for leading providers (OpenAI, Anthropic, Gemini, Cursor, etc.).
- Visa and relocation support to the USA or Singapore for strong full-time candidates.
- Work alongside international Olympiad medallists and published AI researchers.
- Flexible arrangements, including full-time, part-time, and internship roles, for exceptional candidates.
Hiring process
- Initial screening call.
- Five-hour take-home technical assignment.
- Paid, week-long work trial prior to the final offer.