TL;DR
AI Inference Engineer: Develop and optimize large-scale machine learning model deployment for real-time inference, with an emphasis on API development, performance benchmarking, and system reliability. Focus on exploring novel research, implementing LLM inference optimizations, and addressing system bottlenecks.
Location: San Francisco
Salary: $200K – $350K
Company
hirify.global is seeking an AI Inference Engineer to join its growing team.
What you will do
- Develop APIs for AI inference for both internal and external customers.
- Benchmark and address bottlenecks throughout the inference stack.
- Improve the reliability and observability of systems and respond to system outages.
- Explore novel research and implement LLM inference optimizations.
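The benchmarking work above typically starts with latency measurement. A minimal sketch of how inference latency percentiles might be collected (the workload and iteration count here are hypothetical stand-ins, not anything from this posting):

```python
import statistics
import time

def benchmark(fn, iterations=200):
    """Run fn repeatedly and return latency percentiles in milliseconds."""
    latencies = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn()
        latencies.append((time.perf_counter() - start) * 1000.0)
    latencies.sort()
    return {
        "p50_ms": statistics.median(latencies),
        "p99_ms": latencies[int(0.99 * (len(latencies) - 1))],
        "mean_ms": statistics.fmean(latencies),
    }

if __name__ == "__main__":
    # Stand-in for a model forward pass.
    print(benchmark(lambda: sum(i * i for i in range(10_000))))
```

Tail percentiles (p99) rather than means are what usually expose bottlenecks in an inference stack, since a slow tail dominates user-visible latency.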
Requirements
- Experience with ML systems and deep learning frameworks (e.g., PyTorch, TensorFlow, ONNX).
- Familiarity with common LLM architectures and inference optimization techniques (e.g., continuous batching, quantization).
- Understanding of GPU architectures or experience with GPU kernel programming using CUDA.
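To illustrate one of the optimization techniques named above, here is a hedged, pure-Python sketch of symmetric int8 post-training quantization. It is for intuition only; production systems would use framework tooling (e.g. PyTorch's quantization APIs) rather than code like this:

```python
def quantize_int8(weights):
    """Map float weights to int8 with a per-tensor symmetric scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs else 1.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 values."""
    return [v * scale for v in q]

weights = [0.5, -1.2, 0.03, 2.4]
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)
# Each recovered weight lies within half a quantization step of the original.
assert all(abs(a - w) <= scale / 2 + 1e-9 for a, w in zip(approx, weights))
```

The appeal for inference is that int8 weights take a quarter of the memory of float32, cutting the memory bandwidth that usually bounds LLM decoding.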
Culture & Benefits
- Full-time U.S. employees enjoy a comprehensive benefits program including equity, health, dental, vision, retirement, fitness, commuter, and dependent care accounts.
- Full-time employees outside the U.S. enjoy a comprehensive benefits program tailored to their region of residence.