Senior Software Engineer, ML Infrastructure (AI)
Job description
TL;DR
Senior Software Engineer, ML Infrastructure (AI): Designing and operating scalable GPU-backed infrastructure for AI systems, with an emphasis on high-throughput inference and distributed serving patterns. Focus on optimizing GPU occupancy, implementing model parallelism, and building observability for production-grade ML workloads.
Location: Hybrid (San Francisco, CA)
Company
provides blockchain analytics and AI solutions to help agencies and financial institutions detect and disrupt crypto-related financial crime.
What you will do
- Design and operate GPU cluster infrastructure in cloud environments (AWS/GCP), including orchestration and autoscaling.
- Optimize high-throughput inference by tuning serving systems for maximum token throughput and cost-effectiveness.
- Operationalize distributed inference strategies, including model and tensor parallelism for large-scale models.
- Integrate acceleration stacks like TensorRT, ONNX Runtime, vLLM, and FlashAttention to reduce inference costs.
- Manage heterogeneous workloads across accelerators (e.g., NVIDIA GPUs, Inferentia) to ensure predictable performance.
- Develop observability tools to monitor GPU load, memory utilization, and batching efficiency.
Requirements
- 5+ years of experience building and operating distributed systems or infrastructure in production.
- Experience deploying ML/LLM inference workloads on GPU clusters within AWS or GCP.
- Deep knowledge of high-throughput inference systems, batching strategies, and latency/cost trade-offs.
- Proficiency with ML serving frameworks such as Triton Inference Server, vLLM, Ray Serve, or ONNX Runtime.
- Experience with Kubernetes or equivalent orchestration systems in cloud environments.
- Bachelor’s degree in Computer Science or a related field.
Nice to have
- Familiarity with heterogeneous accelerators such as Inferentia.
- CUDA familiarity and experience debugging GPU-related issues.
Culture & Benefits
- High-velocity, high-ownership environment focused on experimentation and rapid shipping.
- Mission-driven work at the intersection of AI, national security, and fighting financial crime.
- Expectation of "AI fluency": using AI tools to accelerate workflows and solve problems.
- Culture based on leadership principles: Impact-Oriented Trailblazer, Master Craftsperson, and Inspiring Colleague.
- Distributed-first company structure with global hubs.