Software Engineer, Fleet Hardware Health (AI)
Job description
TL;DR
Software Engineer, Fleet Hardware Health (AI): Responsible for the reliability and uptime of the company's compute fleet, with an emphasis on minimizing hardware failures and keeping services stable. The work centers on building automation systems, monitoring server health, and collaborating with infrastructure teams.
Location: Onsite in San Francisco
Salary: $255K – $490K + equity
Company
An AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity.
What you will do
- Build and maintain automation systems for provisioning and managing server fleets.
- Develop tools to monitor server health, performance, and lifecycle events.
- Collaborate with clusters, networking, and infrastructure teams.
- Partner with external operators to ensure a high level of quality.
- Identify and fix performance bottlenecks and inefficiencies.
- Continuously improve automation to reduce manual work.
Requirements
- Experience managing large-scale server environments.
- Proficiency in Python, Go, or similar languages.
- Strong Linux, networking, and server hardware knowledge.
- Comfort digging into noisy data with SQL, PromQL, and Pandas or any other tool.