
Machine Learning Engineer

28 000 - 31 000 AED
Work format
onsite
Employment type
fulltime
Grade
senior
English
B2
Country
UAE

Job description

Posted by: Liz Kostina
Discussion: @devops_jobs
Dubai (ON-SITE, FULL-TIME!!!) - Machine Learning Engineer

#вакансия #vacancy #onsite #fulltime

AED 28 000 - 31 000

Important

- ONSITE position in Dubai (4 days per week in the office)
- Fluent in Russian
- English B2 or higher

Job Content

- Design and optimize AI inference pipelines ensuring low-latency, high-throughput model serving for enterprise applications.
- Build and maintain scalable AI infrastructure supporting complex, large-scale workloads efficiently.
- Enable reliable deployment and operation of high-performance AI model serving frameworks across environments.
- Ensure effective GPU resource utilization and cost-efficient AI workload execution.
- Establish comprehensive monitoring and observability for consistent model inference performance.
- Uphold enterprise-grade security, governance, and MLOps best practices throughout the AI delivery lifecycle.

Essential Qualifications

- Bachelor's or equivalent degree
- 7+ years total engineering or operational experience
- At least 5 years of relevant experience in a similar role
- Experience within large and complex global enterprises defined by high availability, transaction rates, and geographical distribution

Essential Knowledge & Skills

- Deep Learning Inference: Expertise in TensorRT, vLLM, Triton, FasterTransformer.
- Model Optimization: Experience with ONNX, GGUF, quantization (FP16, INT8, FP8).
- Distributed Systems: Experience with NCCL, MPI, InfiniBand, RDMA, and multi-node GPU workloads.
- Scalable AI Serving: Hands-on experience with Triton Inference Server, vLLM, TensorFlow Serving.
- Profiling & Debugging: Familiarity with nvidia-smi, Nsight, nvprof, TensorRT Profiler.
- Cloud & On-Prem GPU Management: Experience with Kubernetes (K8s), OpenShift, GPU scheduling (Kubeflow, Ray, KServe).
- Understanding of vector databases and their applications in analytics and AI workloads.
- Proficiency in programming languages like Python, Scala, and SQL.
- Experience working collaboratively on programming projects and managing the architecture of such projects.
- Advanced skills working in a Linux environment.

Nice to have

- GPU Programming: Knowledge of CUDA, cuDNN, NCCL, Tensor Cores for optimizing inference.
- Speculative Decoding & FlashAttention for LLM inference.
- Experience optimizing token streaming for chat applications.
- Experience with vector databases (Qdrant, Milvus) for RAG workloads.

Benefits

- Opportunity to work on cutting-edge technologies in a highly innovative environment
- Dynamic and friendly work environment
- Company assistance with relocation expenses
- Medical insurance

If interested, please send your CV to: or ekostina@enfint.a

Vacancy text reproduced from the original posting.