Site Reliability Engineer (Kubernetes)

$150,000 - $170,000

Work format

Remote (USA only)

Employment type

Full-time

Grade

Senior

English

Country

France/US

Listing from Hirify Global, a list of international tech companies

Job description

TL;DR

Site Reliability Engineer (Kubernetes/AWS): Build and maintain cloud infrastructure for large-scale machine learning on terabytes of biosignal data, with an emphasis on reliability, security, and observability. The role focuses on designing infrastructure as code, leading cluster upgrades, and improving CI/CD pipelines for distributed numerical workloads.

Boston, MA - Remote. In-person office hubs available in Boston, New York, and Paris.

$150,000 – $170,000

Company

Leading at-home EEG platform supporting clinical development of novel therapeutics for neurological, psychiatric, and sleep disorders with FDA-cleared hardware and AI algorithms.

What you will do

Design and implement infrastructure as code solutions to improve reliability, security, and maintainability of cloud infrastructure.
Lead major infrastructure initiatives including cluster upgrades, security improvements, and architectural changes.
Develop and maintain CI/CD pipelines for safe and efficient deployments.
Improve observability through enhanced monitoring, logging, and alerting.
Participate in on-call rotation and lead incident response efforts.
Collaborate with development teams to boost application reliability and performance.
Maintain security posture through infrastructure hardening and automation.
Create and maintain documentation for infrastructure, deployments, and incident response.

Requirements

Strong experience with Kubernetes administration, including cluster management, security, and troubleshooting.
Proven track record with infrastructure as code using Terraform or similar.
Experience building and maintaining CI/CD pipelines, particularly with GitHub Actions, Azure DevOps, or ArgoCD.
Solid understanding of container technologies and build processes, especially Docker.
Strong cloud provider knowledge (e.g., AWS) including networking, security, and services; Azure is a plus.
Experience with incident response and on-call in production environments.
Deep Linux systems administration and debugging; Windows Server familiarity is a plus.
Proficiency in at least one programming language (Python, Go, TypeScript, etc.).
Understanding of security and networking concepts including OAuth2/OIDC, DNS, TLS, TCP/UDP.
Bachelor's degree plus 5-8 years of experience in SRE, DevOps, or a similar role.

Culture & Benefits

Robust asynchronous work practices for a first-class remote experience.
In-person office hubs in Boston, New York, and Paris.
Total compensation includes equity, PTO, and other benefits.
Culture emphasizes curiosity, simplicity, composability, self-service, and empathy.
Diverse team focused on robust systems and high impact.
