TL;DR
AI Engineer (Node.js / Next.js / TypeScript): Shape AI infrastructure and deliver production-ready LLM experiences, with a focus on model performance, reliability, and cost. The role centers on advanced prompt systems, structured outputs, and complex LLM workflows, using observability and debugging tools to continuously improve model quality and operational efficiency.
Location: Applicants from any country are welcome, as long as they are located within approximately ±4 hours of CET.
Company
Ruby Labs is a leading tech company that creates and operates innovative consumer products across the health, education, and entertainment industries.
What you will do
- Design complex, dynamic prompt templates with conditional logic to maximize generation quality and reasoning.
- Implement various response schemes (JSON mode, function calling, Zod/JSON schemas) to ensure AI outputs are predictable and ready for seamless integration into application logic.
- Build robust evaluation pipelines, using Langfuse to collect feedback and score response quality in real time.
- Debug complex LLM chains using Langfuse traces to identify bottlenecks and optimize for cost, latency, and context window usage.
- Run systematic experiments across different models via OpenRouter and analyze the results using quantitative metrics.
- Re-evaluate model performance as new architectures emerge and perform fine-tuning when necessary to meet specific domain requirements.
Requirements
- Deep knowledge of Node.js & Next.js to build reliable services and handle complex LLM-generated data.
- Proven experience in building prompts where content is highly dependent on input variables and context injection.
- Experience working with OpenRouter's unified API, managing rate limits, and selecting the most cost-effective models for specific tasks.
- Understanding of LLM observability principles: setting up tracing, creating test datasets, and integrating scoring systems using Langfuse (or similar).
- Experience with frameworks like RAGAS or building custom “LLM-as-a-judge” systems for evaluation.
- Ability to transform raw generation logs into actionable business metrics and technical insights.
Nice to have
- Practical experience in fine-tuning models for specific domain tasks or JSON compliance.
- Understanding of how to build and optimize Retrieval-Augmented Generation (RAG) systems, including indexing, retrieval, and re-ranking.
- Basic knowledge of Python for working with data science scripts or AI evaluation libraries.
Culture & Benefits
- Enjoy the freedom to work from anywhere, anytime, promoting a healthy work-life balance.
- Enjoy unlimited paid time off to recharge and prioritize your well-being, without counting days.
- Celebrate and relax on national holidays with paid time off to unwind and recharge.
- Experience seamless productivity with top-notch Apple MacBooks provided to all employees who need them.
- Unlock the benefits of flexibility, autonomy, and entrepreneurial opportunities with a flexible independent contractor agreement.
Hiring process
- Recruiter Screening (40 minutes)
- Technical Interview (60 minutes)
- Final Interview (30 minutes)