This vacancy is archived

Company hidden
updated 1 month ago

DevOps Engineer (HPC)

Work format
remote (Europe only)
Employment type
full-time
Grade
middle/senior
English
B2
Country
France/UK/Spain +5 more

Job description


TL;DR

DevOps Engineer (HPC): Build and scale a Kubernetes-based platform for large-scale AI and HPC workloads, with an emphasis on infrastructure reliability, automation, and security. The role focuses on designing fault-tolerant systems, driving infrastructure innovation, and ensuring high availability in a fast-paced startup environment.

Location: Must be based in or willing to relocate to France or the UK, or remote from specified European countries with mandatory periodic office visits in Paris.

Company

hirify.global develops high-performance, open-source AI models and infrastructure, aiming to democratize AI for enterprise and cloud environments.

What you will do

  • Design, build, and operate a scalable Kubernetes platform for AI and HPC workloads, ensuring performance and security.
  • Manage the full lifecycle of cluster operations, including automation, monitoring, and orchestration.
  • Drive infrastructure innovation through tooling, CI/CD pipelines, and observability improvements.
  • Implement zero-trust security models, including IAM and network access controls.
  • Develop user-centric features to simplify operations for sysadmins and customers.
  • Lead incident resolution with root-cause analysis to improve system resilience.

Requirements

  • Experience in infrastructure engineering roles such as DevOps, SRE, or platform engineering.
  • Proficiency in software development (preferably Golang) and in Kubernetes internals.
  • Hands-on experience with containerization, orchestration tools, and infrastructure-as-code (Terraform, CloudFormation).
  • Knowledge of monitoring and observability tools such as Prometheus, Grafana, ELK, and Datadog.
  • Experience with highly available distributed systems and reliability KPIs.
  • Location: Must be based in or willing to relocate to France or the UK, or remote from specified European countries with mandatory office visits.
  • Excellent problem-solving and communication skills.

Nice to have

  • Experience with HPC workload managers (Slurm) and distributed storage systems (Lustre, Ceph).
  • Contributions to open-source projects.

Culture & Benefits

  • Competitive salary and equity.
  • Health insurance and private pension plan.
  • Transportation, sport, and meal allowances.
  • Generous parental leave policy.
  • Visa sponsorship available.