Company hidden
9 hours ago

Principal ML Solutions Architect (AI)

$208,000 - $261,000
Work format
remote (USA only)
Employment type
full-time
Grade
senior
English
C1
Country
US
Listing from Hirify RU Global, a list of companies with Eastern European roots

Job description

TL;DR

Principal ML Solutions Architect (AI): Design and implement optimized inference and fine-tuning workflows for a serverless LLM platform, with an emphasis on performance, quality, and scalability. The role centers on optimizing LLM inference at the framework and hardware level, leading fine-tuning efforts, and shaping the platform roadmap.

Location: Remote (United States)

Salary: $208,000 - $261,000 USD

Company

hirify.global is building a full-stack AI cloud platform that supports developers and enterprises from data and model training through to production deployment.

What you will do

  • Own the most complex, high-stakes customer engagements from architecture through production across multiple modalities.
  • Optimize LLM inference at the framework and hardware level and codify best practices into reusable playbooks.
  • Lead supervised and reinforcement fine-tuning (SFT/RLHF) efforts to maximize model quality.
  • Design and implement production-ready LLM solutions using Token Factory's inference services.
  • Partner with product, engineering, and research teams to prototype features and influence the platform roadmap.
  • Mentor Senior and mid-level Solutions Architects to raise the overall technical bar of the team.

Requirements

  • 8+ years of experience in ML/AI systems, with at least 4 years focused on LLMs and generative AI.
  • Expert knowledge of model architectures, fine-tuning approaches, and inference internals.
  • Deep hands-on command of inference optimization, including quantization, KV-cache management, and batching.
  • Experience running LLMs in production at scale using frameworks like vLLM, SGLang, or TensorRT-LLM.
  • Strong Python programming skills and ability to explain technical concepts to diverse audiences.
  • Must be authorized to work in the United States.

Nice to have

  • Contributions to major OSS inference/ML projects (vLLM, SGLang, TensorRT-LLM).
  • Published research or technical writing in the LLM/serving space.
  • Experience with multimodal AI models (vision-language, speech).
  • Proficiency with DevOps tooling such as Docker, Kubernetes, and infrastructure-as-code.

Culture & Benefits

  • 100% company-paid medical, dental, and vision coverage for employees and families.
  • 401(k) plan with up to 4% company match and immediate vesting.
  • Generous parental leave: 20 weeks for primary and 12 weeks for secondary caregivers.
  • Remote work reimbursement of up to $85/month for mobile and internet.
  • Company-paid short-term, long-term, and life insurance.
