Senior MLOps Engineer (AI)

Work format
remote (Global)
Employment type
full-time
Grade
senior
English
B2
Job description

Senior MLOps Engineer

Company

Fortytwo

Conditions

Senior · Anywhere (Remote) · Full-time · AI jobs by Fortytwo

Skills

OpenSearch, ONNX, Loki, MIG, Job Scheduler, TensorRT, NOS, Bash, Airflow, GPU, RAG, LoRA, Distributed Training, Fine-Tuning, S3, GitHub Actions, Containerization, Elasticsearch, Monitoring, AWS, CI/CD, Azure, GCP, MLOps, Grafana, Prometheus, LLM, vLLM, Triton, Helm, Python, Go, Rust, Kubernetes, Model Orchestration, Model Merging, SLM, LMM, TGI, Cron, Model Serving

About the Role

You will deploy and maintain production ML infrastructure, optimize GPU utilization, and serve large and small language models. You will build CI/CD pipelines, create Helm templates for Kubernetes deployments, implement model optimization and serving workflows, and set up monitoring, logging, and automated workflows to ensure reliable model delivery.

Requirements

  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field
  • Proficiency in Kubernetes, Helm, and containerization technologies
  • Experience with GPU optimization, including MIG and NOS
  • Experience with cloud platforms such as AWS, GCP, and Azure
  • Knowledge of monitoring tools such as Grafana and Prometheus
  • Proficiency in scripting languages (Python and Bash)
  • Hands-on experience with CI/CD tools and workflow management systems
  • Familiarity with Triton Inference Server, ONNX, and TensorRT

Responsibilities

  • Deploy scalable production-ready ML services with optimized infrastructure
  • Manage and autoscale Kubernetes clusters
  • Optimize GPU resources using MIG and NOS
  • Manage cloud storage to ensure high availability and performance
  • Integrate LoRA and model merging workflows
  • Adapt and deploy state-of-the-art ML codebases
  • Deploy and manage LLMs, SLMs, and LMMs
  • Serve models using Triton Inference Server and other serving frameworks
  • Leverage vLLM and TGI for model serving
  • Optimize models with ONNX and TensorRT
  • Develop Retrieval-Augmented Generation systems
  • Set up monitoring and logging with Grafana, Prometheus, Loki, Elasticsearch, and OpenSearch
  • Write and maintain CI/CD pipelines using GitHub Actions
  • Create Helm templates for rapid Kubernetes node deployment
  • Automate workflows using cron jobs and Airflow DAGs

Be careful: if an employer asks you to sign in to their system via iCloud/Google, send a code or password, or run code/software, do not do it: these are scammers. Be sure to click "Report" or contact support. See the guide for details →

The vacancy text is reproduced without changes

Source -