Data Center Operations Systems Engineer
Job description
TL;DR
Data Center Operations Systems Engineer (AI Cloud): Ensure new server, storage, and network infrastructure is properly racked, labeled, cabled, and configured, with an emphasis on troubleshooting advanced GPU and networking systems, inventory management, and data center standards. Focus on documenting layouts in DCIM software, partnering with supply chain and support teams on deployments, and resolving hardware incidents.
Location: Onsite in Dallas, TX Data Center 5 days a week, shift work
Salary: $89K – $145K
Company
The company is a leader in AI cloud infrastructure, serving tens of thousands of customers, from AI researchers to enterprises and hyperscalers.
What you will do
- Ensure new server, storage, and network infrastructure is properly racked, labeled, cabled, and configured.
- Troubleshoot hardware and software issues in advanced GPU and networking systems.
- Document and update data center layout and network topology in DCIM software.
- Work with supply chain and manufacturing teams for timely system deployments and large-scale project plans.
- Manage parts depot inventory and track equipment through delivery, storage, staging, deployment, and handoff.
- Partner with HW support and RMA teams to resolve incidents, report issues, and disseminate solutions.
Requirements
- Onsite presence required at the Dallas, TX data center 5 days a week, with shift work.
- Strong experience with critical infrastructure: power distribution, air flow management, environmental monitoring, capacity planning, DCIM software, structured cabling, cable management.
- Familiar with carrier DIA circuit testing, fiber testing/troubleshooting, cable optics, single/three-phase power, PDU balancing.
- Solid understanding of server hardware, boot process, cold/hot aisle containment, multiple cable media types.
- Ability to structure, collaborate on, and improve complex maintenance MOPs; action-oriented, with willingness to train junior staff and travel for new data center setups.
Nice to have
- 3+ years with critical infrastructure systems.
- Experience with network topology, 400Gb InfiniBand, DDP/SCM cluster storage.
- 3+ years with ticketing systems like JIRA/Zendesk.
- Advanced Linux administration.
- Experience with high-performance GPU systems like NVIDIA NVL72.
Culture & Benefits
- Generous cash and equity compensation.
- Health, dental, vision coverage for you and dependents.
- 401k with 2% company match (USA employees).
- Flexible paid time off.
- Wellness and commuter stipends for select roles.