Posted 15 hours ago
LLMOps Engineer (AI)
Job description
TL;DR
LLMOps Engineer (AI): Deploy, manage, and optimize large language models in production, with an emphasis on scalable GPU infrastructure and security hardening. The role focuses on architecting auto-scaling inference clusters, implementing LLM observability, and optimizing resource management across multi-cloud environments.
Location: Work From Home (WFH)
Company
specializes in building scalable, secure, and high-performance GenAI systems across cloud platforms.
What you will do
- Deploy and optimize open-source large language models on cloud-based GPU instances.
- Implement rigorous security protocols for model weight storage, endpoint encryption, and prompt injection mitigation.
- Architect auto-scaling inference clusters on Azure and GCP to handle high-concurrency requests with low latency.
- Build custom observability stacks to track hallucination rates and latency for self-hosted deployments.
- Manage complex GPU scheduling and cost-optimization strategies across Azure, GCP, and AWS.
Requirements
- 3+ years of experience in LLMOps or related AI engineering roles.
- Professional experience deploying GenAI workloads on Azure and GCP.
- Deep understanding of GPU partitioning and memory-efficient inference.
- Knowledge of GenAI security, including OWASP for LLMs and secure API design.
- Experience with LLM frameworks and the ability to document system architecture clearly.
Culture & Benefits
- Remote work arrangement (Work From Home).
- Agile environment with an emphasis on rapid iteration of prompts and model parameters.
- Opportunity to work on cutting-edge GenAI production systems.