Principal Observability Platform Engineer (AI)
Job description
TL;DR
Principal Observability Platform Engineer (AI): Design and scale observability and security solutions for GPU cloud infrastructure, with an emphasis on platform security and operational excellence. The role focuses on hardening Kubernetes and virtualization layers and on securing multi-tenant environments at scale.
Location: Must be based in the US
Company
The company is a GPU cloud provider built for AI startups and large enterprises, offering cost-effective, high-performance infrastructure.
What you will do
- Lead security and observability engineering initiatives across distributed, multi-tenant infrastructure.
- Design scalable and resilient solutions to mitigate architectural and systemic risks.
- Harden Kubernetes, virtualization layers, GPU workloads, and platform services.
- Strengthen identity, authentication, authorization, and secrets management systems.
- Embed automated security validation and guardrails into CI/CD pipelines.
- Mentor junior engineers and collaborate with the CISO to shape long-term platform strategy.
Requirements
- 10+ years of hands-on experience in security or observability engineering for cloud or hyperscale distributed systems.
- Strong software engineering skills in Go, Python, Rust, or similar languages.
- Deep expertise in Linux systems internals, Kubernetes, and container security.
- Proficiency with Infrastructure-as-Code (Terraform) and cloud-native architectures.
- Proven experience securing multi-tenant environments at scale.
- Must be based in the US.
Nice to have
- Experience building observability platforms and telemetry pipelines.
- Familiarity with GPU cloud infrastructure or AI workloads.
- Exposure to distributed tracing, metrics, and log aggregation tools.
Culture & Benefits
- Highly competitive compensation package including base salary and equity with annual reviews.
- Remote-first work environment providing high autonomy and flexibility.
- Opportunity to join a fast-growing AI infrastructure startup with significant impact.
- Collaborative and innovative culture built on ownership and accountability.