Principal ML Solutions Architect (AI)
Job description
TL;DR
Principal ML Solutions Architect (AI): Design and implement optimized inference and fine-tuning workflows for a serverless LLM platform, with an emphasis on performance, quality, and scalability. The role focuses on optimizing LLM inference at the framework and hardware level, leading fine-tuning efforts, and shaping the platform roadmap.
Location: Remote (United States)
Salary: $208,000 - $261,000 USD
Company
The company is building a full-stack AI cloud platform that supports developers and enterprises from data and model training through to production deployment.
What you will do
- Own the most complex, high-stakes customer engagements from architecture through production across multiple modalities.
- Optimize LLM inference at the framework and hardware level and codify best practices into reusable playbooks.
- Lead supervised and reinforcement fine-tuning (SFT/RLHF) efforts to maximize model quality.
- Design and implement production-ready LLM solutions using Token Factory's inference services.
- Partner with product, engineering, and research teams to prototype features and influence the platform roadmap.
- Mentor Senior and mid-level Solutions Architects to raise the overall technical bar of the team.
Requirements
- 8+ years of experience in ML/AI systems, with at least 4 years focused on LLMs and generative AI.
- Expert knowledge of model architectures, fine-tuning approaches, and inference internals.
- Deep hands-on command of inference optimization, including quantization, KV-cache management, and batching.
- Experience running LLMs in production at scale using frameworks like vLLM, SGLang, or TensorRT-LLM.
- Strong Python programming skills and ability to explain technical concepts to diverse audiences.
- Must be authorized to work in the United States.
Nice to have
- Contributions to major OSS inference/ML projects (vLLM, SGLang, TensorRT-LLM).
- Published research or technical writing in the LLM/serving space.
- Experience with multimodal AI models (vision-language, speech).
- Proficiency with DevOps tooling such as Docker, Kubernetes, and infrastructure-as-code.
Culture & Benefits
- 100% company-paid medical, dental, and vision coverage for employees and families.
- 401(k) plan with up to 4% company match and immediate vesting.
- Generous parental leave: 20 weeks for primary and 12 weeks for secondary caregivers.
- Remote work reimbursement of up to $85/month for mobile and internet.
- Company-paid short-term and long-term disability insurance, and life insurance.