Company hidden
6 days ago

Member Of Technical Staff (AI Infrastructure)

$175,000 - $220,000
Work format
onsite
Employment type
full-time
Seniority
senior
English
B2
Country
US
Listing from Hirify.Global, a list of international tech companies

Job description

TL;DR

Member of Technical Staff (Performance Optimization): Build and optimize high-performance infrastructure for generative AI inference, with an emphasis on low-level GPU kernels, distributed systems, and compute efficiency. The role focuses on maximizing the performance of LLMs, VLMs, and video models through profiling, CUDA/Triton optimization, and co-designing hardware-aware model architectures.

Location: Must be based in or able to commute to San Mateo, CA

Salary: $175,000 - $220,000 USD (plus equity)

Company

A Series C, high-growth startup building the future of generative AI infrastructure, backed by top-tier venture firms.

What you will do

  • Optimize GPU kernels and system performance for large-scale training and inference workloads.
  • Analyze and resolve latency, throughput, and memory bottlenecks across the AI stack.
  • Implement performance optimizations using CUDA, Triton, and advanced profiling tools.
  • Collaborate with researchers to tune model architectures for maximum hardware efficiency.
  • Scale inference and training systems within multi-GPU and multi-node environments.
  • Maintain and improve benchmarking infrastructure to track execution speed and utilization.

Requirements

  • Bachelor’s degree in Computer Science, Engineering, or equivalent practical experience.
  • 5+ years of experience in performance optimization or high-performance computing.
  • Proficiency in CUDA or ROCm with extensive experience using GPU profiling tools.
  • Strong background in PyTorch and performance-critical model execution.
  • Deep understanding of GPU architecture, parallel programming, and compute kernels.
  • Proven ability to debug and optimize distributed systems in multi-GPU environments.

Nice to have

  • Master’s or PhD in Computer Science or a related field.
  • Experience optimizing LLMs, VLMs, or video models for production.
  • Knowledge of ML compilers like torch.compile, Triton, or XLA.
  • Background in hardware-aware model design and cloud-scale infrastructure (Kubernetes).

Culture & Benefits

  • Meaningful equity participation in a rapidly growing, well-funded startup.
  • Opportunity to solve high-impact, industry-leading challenges in AI infrastructure.
  • Collaborative environment working alongside veterans from Meta and Google AI.
  • Ownership of critical projects with direct influence on global AI deployment efficiency.
  • Inclusive, innovation-focused culture with minimal bureaucracy.
