Network Engineer (AI)
Job description
TL;DR
Network Engineer (AI): Build and scale the networking infrastructure that powers AI research and deployment, with an emphasis on physical network builds, software-defined networking, and automation systems. The role focuses on configuring routers, writing automation, troubleshooting physical connectivity, and designing scalable network architectures.
Location: Remote-Friendly (Travel-Required) | San Francisco, CA | Seattle, WA | New York City, NY. All staff are expected to be in one of our offices at least 25% of the time.
Salary: $320,000 - $405,000 USD
Company
is an AI safety company working to build reliable, interpretable, and steerable AI systems.
What you will do
- Design, deploy, and maintain high-performance networks across multiple data center sites, including managing physical infrastructure, routing protocols, and network security.
- Develop software to automate network provisioning, configuration management, and operational workflows.
- Implement comprehensive monitoring, build debugging tools, investigate outages, and drive continuous reliability improvements.
- Plan and execute network expansions to new sites and facilities, including working with vendors, managing physical builds, and migrating live traffic.
- Partner with compute, storage, and ML infrastructure teams to optimize network performance for AI workloads and troubleshoot complex distributed systems issues.
- Participate in on-call rotation and lead response to network-related incidents.
Requirements
- 5+ years of experience in network engineering, with hands-on experience building and operating production networks.
- Strong software development skills (Python, Go, or similar) and experience building network automation, tooling, or infrastructure software.
- Deep understanding of networking fundamentals: TCP/IP, BGP, IS-IS, OSPF, VLANs, VPCs, routing, switching, and network security.
- Experience with the full stack of networking, from physical layer concerns (cabling, optics, hardware) through software-defined networking.
- Comfort working with Linux, command-line tools, and infrastructure-as-code approaches.
- Track record of debugging complex distributed systems issues that span network, compute, and application layers.
Nice to have
- Experience scaling networks in cloud environments (AWS, GCP, Azure) and/or on-premises data centers.
- Background in DevOps/SRE practices including monitoring, observability, and reliability engineering.
- Experience with Kubernetes networking, container orchestration, or service mesh technologies.
- Prior work at startups, small ISPs, or environments requiring broad technical responsibilities.
- Familiarity with AI/ML infrastructure and the unique networking requirements of large-scale training clusters.
Culture & Benefits
- Competitive compensation and benefits.
- Optional equity donation matching.
- Generous vacation and parental leave.
- Flexible working hours.
- Lovely office space in which to collaborate with colleagues.
Hiring process
- We encourage you to apply even if you do not believe you meet every single qualification.
- We think AI systems like the ones we're building have enormous social and ethical implications.
- We think this makes representation even more important, and we strive to include a range of diverse perspectives on our team.