Data Center Operations Systems Engineer
Job description
TL;DR
Data Center Operations Systems Engineer (AI Cloud): Ensure new server, storage, and network infrastructure is properly racked, labeled, cabled, and configured, with an emphasis on troubleshooting hardware/software issues in advanced GPU and networking systems. Focus on documenting data center layouts, managing inventory and the parts depot, and partnering with teams on deployments, incident resolution, and RMA processes.
Location: Onsite in Los Angeles, CA Data Center 5 days/week, shift work; willing to travel for new data center bring-ups as needed
Salary: $89K – $159K
Company
is a leader in AI cloud infrastructure serving tens of thousands of customers from AI researchers to enterprises and hyperscalers.
What you will do
- Rack, label, cable, and configure new server, storage, and network infrastructure
- Troubleshoot hardware and software issues in advanced GPU and networking systems
- Document and update data center layout and network topology in DCIM software
- Manage parts depot inventory and track equipment through delivery, storage, staging, deployment, and handoff
- Partner with HW support, supply chain, manufacturing, and RMA teams for deployments, incident resolution, and faulty parts handling
- Follow installation standards for consistency in placement, labeling, and cabling across data centers
Requirements
- Strong experience with critical infrastructure systems: power distribution, airflow management, environmental monitoring, capacity planning, DCIM software, structured cabling, cable management
- Familiarity with carrier DIA circuit testing/turn-ups, fiber testing/troubleshooting, cable optics, single/three-phase power theory, PDU balancing
- Knowledge of multiple cable media types, cold/hot aisle containment, server hardware, and boot processes
- Ability to structure, collaborate on, and improve complex maintenance MOPs
- Action-oriented with willingness to train junior staff
- Presence required in Los Angeles, CA Data Center 5 days/week; shift work
Nice to have
- 5+ years with critical infrastructure systems supporting data centers
- Experience with network topology, 400Gb InfiniBand architectures, DDP/SCM cluster storage
- 5+ years with ticketing systems like JIRA/Zendesk
- Advanced Linux administration
- Experience with high-performance compute GPU systems (NVIDIA NVL72)
Culture & Benefits
- Generous cash & equity compensation
- Health, dental, vision coverage for you and dependents
- Wellness and commuter stipends for select roles
- 401(k) with 2% company match (USA employees)
- Flexible paid time off