TL;DR
Senior Software Engineer (AI): Building real-time data pipelines and serving systems that power large-scale ML models, with an emphasis on designing, coding, and optimizing ETL pipelines and GPU inference serving. The focus is on ensuring models get the freshest data and serve results with millisecond-level latency.
Location: Bengaluru, India
Company
hirify.global’s mission is to empower every person and every organization on the planet to achieve more.
What you will do
- Design & code real-time ETL/feature pipelines (e.g., Flink or Spark Structured Streaming) feeding online stores/caches with strict freshness.
- Define and meet SLOs with OpenTelemetry/Prometheus/Grafana for metrics, tracing, and alerting.
- Implement robust queuing/streaming with Kafka/Pulsar.
- Optimize GPU inference services on Triton Inference Server (or ONNX Runtime/TensorRT).
- Profile & optimize end-to-end: CPU/GPU utilization, memory, I/O, vectorization, caching, etc.
- Collaborate with applied scientists on feature contracts, embedding pipelines, validation.
Requirements
- Bachelor’s Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.
- 6+ years building distributed systems in production.
- Production experience with streaming frameworks (Flink or Spark) and messaging (Kafka).
- Hands-on with Kubernetes and containers; comfort with service ops (logs, metrics, scaling).
- Practical experience with GPU inference on Triton or ONNX Runtime/TensorRT (model packaging, runtime tuning, batching).
- Familiarity with observability stacks (Prometheus, Grafana, OpenTelemetry).
Nice to have
- Master’s Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR Bachelor’s Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.
- Experience with real-time feature stores or embedding pipelines.
- Prior contributions to GPU batching, dynamic scheduling, or multi-model serving.
Culture & Benefits
- Employees come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals.
- Shared values of respect, integrity, and accountability create a culture of inclusion where everyone can thrive at work and beyond.