Infrastructure and Platform Engineer (Metal)
Job description
TL;DR
Infrastructure and Platform Engineer (Metal): Design and operate Kubernetes-based platforms for workload orchestration and hardware allocation in large-scale AI systems, with an emphasis on cluster lifecycle, scaling, and integration with CI/CD pipelines. The role focuses on building APIs, platform services, and operational maturity to support internal development and customer workloads on custom accelerator hardware.
Location: Hybrid, based out of Santa Clara, CA; Austin, TX; or Toronto, ON. The offer is contingent upon eligibility to access U.S. export-controlled technology. Compliance with the U.S. Export Administration Regulations is required, which may affect nationals of certain countries.
Salary: $100k - $500k (base + variable, depending on experience, skills, location)
Company
leads in cutting-edge AI technology, developing high-performance RISC-V CPUs and AI platforms unifying software, compilers, networking, and semiconductors.
What you will do
- Design and build platform services for workload orchestration, ML services, and internal development workflows.
- Develop APIs and systems for user and service interaction with infrastructure platforms.
- Own Kubernetes-based platforms, including cluster lifecycle, scaling, and operational maturity.
- Integrate platform systems with CI/CD pipelines, GitOps workflows, and internal tooling.
- Partner with SRE, infrastructure, and deployment teams to support large-scale environments.
Requirements
- Eligible to access U.S. export-controlled technology per U.S. EAR; citizenship/permanent residency or license approval may be required.
- Experienced backend or infrastructure engineer focused on platform development in large-scale environments.
- Strong expertise in Kubernetes (cluster provisioning, operators, production debugging).
- Proficient in Python or Go for APIs and platform services.
- Comfortable with Linux systems, networking fundamentals, and distributed systems.
- Collaborative and adaptable across engineering, infrastructure, and deployment teams.
What you will learn
- Large-scale AI platforms on custom accelerator hardware.
- Advanced Kubernetes patterns (operators, controllers, cluster lifecycle).
- Evolution of internal platforms to production-grade customer systems.
- Orchestration, APIs, and infrastructure for AI workloads.
- Scaling platform engineering across on-prem and hybrid environments.
Culture & Benefits
- Value collaboration, curiosity, and commitment to solving hard problems.
- Highly competitive compensation package and benefits.
- Equal opportunity employer.