TL;DR
Software Engineer (AI Infra Visibility): Design, build, and scale backend systems for AI and GPU cluster observability, with an emphasis on high-performance distributed systems that power telemetry ingestion, data processing, and APIs. Focus on detecting complex infrastructure issues that impact AI workloads.
Location: On-site, Palo Alto, California
Company
Clockwork Systems is pioneering a software-driven approach to AI fabrics by delivering cross-stack observability, workload fault tolerance, and performance acceleration.
What you will do
- Design and build scalable backend systems for metric collection, processing, and analysis.
- Develop robust methods to detect complex infrastructure issues that impact AI workloads.
- Build large distributed systems that run in production environments.
- Collaborate across teams to deliver reliable, performant, and maintainable systems.
Requirements
- 2+ years of industry experience building and operating production software systems.
- Strong foundation in data structures, algorithms, and software design.
- Fluency in one or more programming languages: C, C++, Go, Java, or Python.
- Solid understanding of operating systems fundamentals (threads, scheduling, synchronization; kernel programming is a plus).
- Experience with databases, including design, development, or scaling.
- Excellent debugging, problem-solving, and communication skills.
Nice to have
- Knowledge of networking protocols; familiarity with NIC architecture and operation.
- Understanding of GPU or AI infrastructure (e.g., DCGM, PyTorch).
- Familiarity with observability systems (metrics, logs, traces); experience with OpenTelemetry, Prometheus, or distributed tracing is a bonus.
- Experience designing, building, and scaling large distributed systems.
- Hands-on experience with service-oriented architectures and cloud platforms (AWS, GCP, Azure).
Culture & Benefits
- Challenging projects.
- A friendly and inclusive workplace culture.
- Competitive compensation.
- A great benefits package.
- Catered lunch.