Senior MLOps Engineer (AI)

Work format
remote (Global)
Employment type
full-time
Grade
senior
English
B2
Job description

Senior MLOps Engineer

Company

Fortytwo

Conditions

Senior · Anywhere (Remote) · Full-time · AI jobs by Fortytwo

Skills

OpenSearch, ONNX, Loki, MIG, Job Scheduler, TensorRT, NOS, Bash, Airflow, GPU, RAG, LoRA, Distributed Training, Fine-Tuning, S3, GitHub Actions, Containerization, Elasticsearch, Monitoring, AWS, CI/CD, Azure, GCP, MLOps, Grafana, Prometheus, LLM, vLLM, Triton, Helm, Python, Go, Rust, Kubernetes, Model Orchestration, Model Merging, SLM, LMM, TGI, Cron, Model Serving

About the Role

You will deploy and maintain production ML infrastructure, optimize GPU utilization, and serve large and small language models. You will build CI/CD pipelines, create Helm templates for Kubernetes deployments, implement model optimization and serving workflows, and set up monitoring, logging, and automated workflows to ensure reliable model delivery.

Requirements

  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field
  • Proficiency in Kubernetes, Helm, and containerization technologies
  • Experience with GPU optimization, including MIG and NOS
  • Experience with cloud platforms such as AWS, GCP, and Azure
  • Knowledge of monitoring tools such as Grafana and Prometheus
  • Proficiency in scripting languages (Python and Bash)
  • Hands-on experience with CI/CD tools and workflow management systems
  • Familiarity with Triton Inference Server, ONNX, and TensorRT

Responsibilities

  • Deploy scalable production-ready ML services with optimized infrastructure
  • Manage and autoscale Kubernetes clusters
  • Optimize GPU resources using MIG and NOS
  • Manage cloud storage to ensure high availability and performance
  • Integrate LoRA and model merging workflows
  • Adapt and deploy state-of-the-art ML codebases
  • Deploy and manage LLMs, SLMs, and LMMs
  • Serve models using Triton Inference Server and other serving frameworks
  • Leverage vLLM and TGI for model serving
  • Optimize models with ONNX and TensorRT
  • Develop Retrieval-Augmented Generation systems
  • Set up monitoring and logging with Grafana, Prometheus, Loki, Elasticsearch, and OpenSearch
  • Write and maintain CI/CD pipelines using GitHub Actions
  • Create Helm templates for rapid Kubernetes node deployment
  • Automate workflows using cron jobs and Airflow DAGs

Be careful: if an employer asks you to sign in to their system via iCloud/Google, send a code or password, or run code/software, do not do it: these are scammers. Be sure to click "Report" or contact support. See the guide for details →

The vacancy text is reproduced without changes

Source -