TL;DR
Research Engineer, Embodied Generalist Agent (AI): Develop general-purpose agents capable of perceiving, reasoning, planning, and executing real-time actions in complex environments, with an emphasis on multimodal large language models (LLMs), vision-language-action (VLA) models, and reinforcement learning (RL). The focus is on bridging the gap between high-level long-horizon planning and low-level high-frequency motor control, creating agents that adaptively master tasks in virtual testbeds.
Location: Tokyo, Japan
Company
hirify.global is a team of scientists, engineers, and machine learning experts advancing the state of the art in artificial intelligence for widespread public benefit and scientific discovery.
What you will do
- Develop and optimize state-of-the-art agent architectures that seamlessly integrate multimodal perception, reasoning, and precise real-time execution.
- Build and scale training recipes utilizing supervised fine-tuning, reinforcement learning, imitation learning, and/or in-context learning.
- Design advanced systems that enable agents to reason over long horizons and effectively utilize memory to solve complex, extended tasks.
- Research and implement capabilities that allow agents to adapt to new environments and learn from experience at test time.
- Establish rigorous benchmarks within virtual environments to measure progress in general agent capabilities and embodied intelligence in unseen environments.
Requirements
- Bachelor's, Master's, or Ph.D. in Computer Science, Artificial Intelligence, or a related field.
- Experience with relevant ML frameworks such as JAX, TensorFlow, or PyTorch.
- Strong programming skills in Python and experience with large-scale data pipelines.
- Solid understanding of LLM internals, e.g., typical training pipelines, computational characteristics of training/inference, mechanisms for multimodal extension.
- Knowledge of deep reinforcement learning (RL), LLM reasoning, imitation learning, memory-based architectures, vision-language models (VLMs), and/or vision-language-action (VLA) models.
- Proven track record of designing, implementing, and maintaining robust technical assets (such as libraries, frameworks, or models) used by a large number of technical stakeholders; experience with OSS contributions is a plus.
Nice to have
- A minimum of 5 years of relevant professional experience.
- Experience building agents for 3D virtual environments, simulators, or video games.
- Strong track record in competitions in machine learning, data science, or AI in games.
- Strong publication record at top-tier conferences (NeurIPS, ICLR, ICML, CVPR, etc.).
Culture & Benefits
- Committed to equal employment opportunities regardless of sex, race, religion or belief, ethnic or national origin, disability, age, citizenship, marital, domestic or civil partnership status, sexual orientation, gender identity, pregnancy or related condition, or any other basis protected by applicable law.
- Value diversity of experience, knowledge, backgrounds and perspectives and harness these qualities to create extraordinary impact.