TL;DR
Senior Software Engineer (AI): Designing and optimizing high-performance serving systems and GPU inference frameworks to drive measurable latency improvements and cost efficiency across hirify.global’s ad ecosystem. The work ranges from CUDA kernel tuning and NUMA-aware threading to large-scale distributed orchestration and model deployment for deep learning and LLM workloads.
Location: Mountain View, United States
Salary: USD $119,800 – $234,700 per year (U.S.) or USD $158,400 – $258,000 per year (San Francisco Bay area and New York City metropolitan area)
Company
hirify.global is an equal opportunity employer.
What you will do
- Design and lead the development of large-scale, distributed online serving systems.
- Architect and optimize end-to-end inference infrastructure.
- Profile and optimize performance across the full stack.
- Own live-site reliability.
- Collaborate and mentor across teams.
Requirements
- Bachelor’s Degree in Computer Science or related technical field AND 4+ years technical engineering experience developing high-performance, distributed systems in C++.
- Deep expertise in GPU inference frameworks such as NVIDIA Triton Inference Server, CUDA, and TensorRT.
- Strong understanding of model-serving trade-offs.
- Proven ability to profile and optimize GPU and system workloads.
- Expertise in low-level system and OS internals.
Nice to have
- Master’s Degree in Computer Science or related technical field AND 6+ years technical engineering experience developing high-performance, distributed systems in C++.
- Hands-on experience with real-time data streaming systems (Kafka, Flink, Spark Streaming), feature-store integration, and multi-region deployment for low-latency, globally distributed services.
- Familiarity with LLM inference optimization.
- Demonstrated success operating large-scale systems with SLA-based capacity forecasting, autoscaling, and performance telemetry.
Culture & Benefits
- Equal opportunity employer.