TL;DR
Software Engineer 3 (Backend/Infrastructure): Build core systems and services that power model inference at scale for hirify.global Atlas, with an emphasis on real-time, low-latency inference. Focus on improving performance, autoscaling, GPU utilization, and resource efficiency in a cloud-native environment.
Location: Based in Palo Alto, CA or Seattle, WA with an in-office or hybrid work model.
Salary: $109,000–$215,000 USD
Company
hirify.global is built for change, empowering our customers and our people to innovate at the speed of the market.
What you will do
- Design and build components of a multi-tenant inference platform integrated directly with hirify.global Atlas, supporting semantic search and hybrid retrieval.
- Collaborate with AI engineers and researchers to productionize inference for embedding models and rerankers, enabling both batch and real-time use cases.
- Contribute to platform capabilities such as latency-aware routing, model versioning, health monitoring, and observability.
- Improve performance, autoscaling, GPU utilization, and resource efficiency in a cloud-native environment.
- Work across product, infrastructure, and ML teams to ensure the inference platform meets the scale, reliability, and latency demands of Atlas users.
Requirements
- 2+ years of experience building backend or infrastructure systems at scale.
- Strong software engineering skills in languages such as Go, Rust, Python, or C++, with an emphasis on performance and reliability.
- Experienced in cloud-native architectures, distributed systems, and multi-tenant service design.
- Familiar with ML model serving and inference runtime concepts, even if you have not deployed models directly.
- Comfortable working cross-functionally with ML researchers, backend engineers, and platform teams.
- Motivated to work on systems integrated into hirify.global Atlas and used by thousands of developers.
Nice to have
- Experience integrating infrastructure with production ML workloads.
- Understanding of hybrid retrieval, prompt-driven systems, or retrieval-augmented generation (RAG).
- Contributions to open-source infrastructure for ML serving or search.
Culture & Benefits
- Committed to developing a supportive and enriching culture for everyone.
- Employee affinity groups.
- Generous parental leave policy.
- Flexible paid time off.
- Mental health counseling.