Lead Machine Learning Engineer (AI)

Work format

onsite

Employment type

fulltime

Grade

lead

English

Country

Singapore

Job description

Text:

TL;DR

Lead Machine Learning Engineer (AI): Lead the design and implementation of advanced model optimization pipelines and scalable inference systems, with an emphasis on inference runtime optimization, performance benchmarking, and cost efficiency. The role focuses on leading teams through complex optimization challenges, providing architectural oversight, and ensuring the operational sustainability of AI solutions across cloud, on-prem, and edge environments.

Location: Singapore only (Singapore Citizens and Permanent Residents)

Company

hirify.global is a dynamic and inclusive technology consultancy with 30+ years of experience delivering impactful solutions by solving complex business problems through technology.

What you will do

Lead the design and implementation of model optimization pipelines, including quantization, pruning, and distillation.
Architect and tune inference runtimes and serving frameworks for optimal performance.
Guide teams on high-throughput serving strategies such as batching, caching, and asynchronous scheduling.
Develop benchmarks and dashboards to measure system efficiency improvements.
Collaborate with infrastructure, MLOps, and product teams to integrate inference optimization into production workflows.
Provide technical leadership and mentorship, fostering a culture of experimentation and continuous improvement.

Requirements

Must have current right to work in Singapore (Singapore Citizens or Permanent Residents only).
Deep expertise in model and runtime optimization techniques and frameworks (vLLM, NVIDIA Triton/Dynamo).
Strong proficiency with deep learning frameworks such as PyTorch and TensorFlow, plus production deployment experience.
Experience with profiling tools and tuning GPU/accelerator workloads for cost and performance efficiency.
Proven leadership of small-to-medium engineering teams or technical workstreams.
Strong communication skills, with the ability to connect technical optimizations to business impact.

Nice to have

Familiarity with observability stacks, telemetry, and cost instrumentation for AI workloads.
Experience designing scalable inference systems across heterogeneous environments, including GPU clusters, serverless, and edge.

Culture & Benefits

Flexible career development supported by interactive tools and numerous development programs.
Collaborative and supportive team culture focused on continuous learning and knowledge sharing.
Inclusive environment that balances autonomy with a culture of cultivation.