Company hidden
6 days ago

Member Of Technical Staff (AI Infrastructure)

$175,000 - $220,000
Work format
onsite
Employment type
full-time
Seniority
senior
English
B2
Country
US
Listing from Hirify.Global, a list of international tech companies

Job description

TL;DR

Member of Technical Staff (Performance Optimization): Build and optimize high-performance infrastructure for generative AI inference, with an emphasis on low-level GPU kernels, distributed systems, and compute efficiency. The role focuses on maximizing the performance of LLMs, VLMs, and video models through profiling, CUDA/Triton optimization, and co-designing hardware-aware model architectures.

Location: Must be based in or able to commute to San Mateo, CA

Salary: $175,000 - $220,000 USD (plus equity)

Company

A Series C, high-growth startup building the future of generative AI infrastructure, backed by top-tier venture firms.

What you will do

  • Optimize GPU kernels and system performance for large-scale training and inference workloads.
  • Analyze and resolve latency, throughput, and memory bottlenecks across the AI stack.
  • Implement performance optimizations using CUDA, Triton, and advanced profiling tools.
  • Collaborate with researchers to tune model architectures for maximum hardware efficiency.
  • Scale inference and training systems within multi-GPU and multi-node environments.
  • Maintain and improve benchmarking infrastructure to track execution speed and utilization.

Requirements

  • Bachelor’s degree in Computer Science, Engineering, or equivalent practical experience.
  • 5+ years of experience in performance optimization or high-performance computing.
  • Proficiency in CUDA or ROCm with extensive experience using GPU profiling tools.
  • Strong background in PyTorch and performance-critical model execution.
  • Deep understanding of GPU architecture, parallel programming, and compute kernels.
  • Proven ability to debug and optimize distributed systems in multi-GPU environments.

Nice to have

  • Master’s or PhD in Computer Science or a related field.
  • Experience optimizing LLMs, VLMs, or video models for production.
  • Knowledge of ML compilers like torch.compile, Triton, or XLA.
  • Background in hardware-aware model design and cloud-scale infrastructure (Kubernetes).

Culture & Benefits

  • Meaningful equity participation in a rapidly growing, well-funded startup.
  • Opportunity to solve high-impact, industry-leading challenges in AI infrastructure.
  • Collaborative environment working alongside veterans from Meta and Google AI.
  • Ownership of critical projects with direct influence on global AI deployment efficiency.
  • Inclusive, innovation-focused culture with minimal bureaucracy.
