Principal Platform Engineer (GCP/ML)
Job description
TL;DR
Principal Platform Engineer (GCP/ML): Architect and lead the infrastructure strategy for a next-generation production ML platform on Google Cloud, with an emphasis on elastic scaling, security, and resilience for high-performance machine learning workloads. The focus is on building a paved road for engineers, automating everything from model deployment to complex networking and CI/CD pipelines, and implementing comprehensive ML observability.
Location: Remote Ukraine
Company
helps customers monitor, manage, and protect against risks to their identities and personal information in the digital world, backed by WndrCo, Warburg Pincus, and General Catalyst.
What you will do
- Design, deploy, and maintain elastically scaling GCP infrastructure and Kubernetes clusters for ML workloads.
- Build and maintain CI/CD pipelines for training, testing, and deploying ML models using Jenkins, GitHub Actions, or Airflow.
- Implement observability for model drift, accuracy, latency, performance, and system health, including for non-ML workloads.
- Deploy monitoring tools that empower teams, and participate in the on-call rotation, including for compliance requirements such as SOC.
- Collaborate with data, ML, backend, and frontend engineers for smooth production operations.
Requirements
- 8-10+ years in DevOps/Platform Engineering, with 2+ years operating production ML workloads
- Deep hands-on GCP (VPC-SC, IAM, Org Policies) and GKE (topology, Helm, Kustomize, ArgoCD)
- High proficiency in Istio (VirtualServices, mTLS) and Kong API Gateway
- Expert Terraform with Atlantis/GitOps workflow
- Experience with secrets/identity (Auth0, Dex, ESO, SOPS), Airflow, ML-serving (Triton, vLLM, MLflow)
- Experience managing Cloud SQL (PostgreSQL), BigQuery, Elasticsearch, ClickHouse
- Upper-intermediate spoken and written English
Nice to have
- ML observability experience with model accuracy and drift detection
- Ansible for cluster bootstrap/recovery
- CKA/CKS or GCP Professional certifications
- Loki, Grafana, or large-scale ClickHouse
Culture & Benefits
- High-trust, outcome-focused team solving challenging ML problems quickly
- Scrappy, nimble organization valuing individual contributions and impact
- Fast-paced growth environment to learn new technologies, products, and markets
- Inclusive community committed to no discrimination or barriers to success