Company hidden
8 days ago

DataOps Engineer (AI Platform Engineer)

Work format
onsite
Employment type
fulltime
Grade
senior
English
C1
Country
Cyprus
Relocation
Cyprus
Vacancy from Hirify RU Global, a list of companies with Eastern European roots

Job description

TL;DR

DataOps Engineer (AI Platform Engineer): Design and operate an on-prem AI platform for deploying and scaling models, with an emphasis on multi-node GPU clusters and Kubernetes infrastructure. The focus is on building reliable model serving runtimes, optimizing GPU utilization, and automating the model lifecycle via CI/CD pipelines.

Location: Must be based in Limassol, Cyprus. Full relocation support for the employee and their family is provided.

Company

A leading global trading broker specializing in high-scale fintech solutions for over a million clients worldwide.

What you will do

  • Collaborate with infrastructure teams to configure GPU servers, high-performance networking, and RDMA-enabled clusters.
  • Manage GPU MIG configurations and ensure scalable GPU operations and scheduling within Kubernetes.
  • Deploy and maintain model serving runtimes such as vLLM, ONNX Runtime, SGLang, NVIDIA Triton, and KServe.
  • Build CI/CD pipelines and platform tooling for model packaging, versioning, and registry systems using MLflow.
  • Enable infrastructure and workflows for model fine-tuning (e.g., LoRA), focusing on scalability and automation.
  • Implement comprehensive observability and tracing for GPU clusters and model inference workflows.

Requirements

  • Bachelor’s or Master’s degree in Computer Science, Engineering, or a related technical field.
  • 5+ years of experience in infrastructure, platform engineering, or distributed systems, preferably with GPU workloads.
  • Strong expertise in Kubernetes and Linux-based environments.
  • Proficiency in Python and/or Go.
  • Experience with GPU infrastructure (NVIDIA/AMD) and model serving/inference systems.
  • Advanced English proficiency for business and technical communication.

Nice to have

  • Familiarity with networking concepts relevant to distributed systems, such as RDMA.

Culture & Benefits

  • Competitive salary and annual performance bonus.
  • Full relocation support including flights, housing, visas, and legal assistance.
  • Top-tier health insurance with full family coverage (medical, dental, vision, mental health) and life insurance.
  • Unlimited learning opportunities, English lessons, and education allowances for school and kindergarten fees.
  • 21 working days of annual leave plus public holidays and paid sick/parental leave.
  • Exclusive perks: Branded MINI Cooper company car, private parking, in-house sports clubs, and jet skis.

Hiring process

  • Intro call with Recruiter (30 minutes).
  • English language check (if needed).
  • Technical interview (90 minutes).
  • Behavioural interview (60 minutes).
