Company hidden
1 day ago

Site Reliability Engineer (Kubernetes)

Work format
remote (USA only)/hybrid
Employment type
fulltime
Grade
middle/senior
English
b2
Country
US
Listing from Hirify Global, a list of international tech companies

Job description

TL;DR

Site Reliability Engineer (Kubernetes): Ensure the operational readiness and scalability of compute platforms for autonomous systems, with an emphasis on batch workload orchestration and observability. The role focuses on building automation, diagnosing complex system failures, and maintaining high-scale Kubernetes infrastructure that supports engineering teams.

Location: Must be based in the U.S. (Pittsburgh, PA or Remote)

Company

hirify.global develops advanced autonomous systems and AI-driven solutions for the trucking industry.

What you will do

  • Instrument and manage large-scale batch workloads across Kubernetes clusters.
  • Diagnose and triage job failures to ensure system reliability.
  • Collaborate with cross-functional teams to improve platform capabilities and workload efficiency.
  • Scale system reliability through increased automation and CI/CD workflows.
  • Develop and maintain a comprehensive library of runbooks for operational knowledge.
  • Participate in an on-call rotation to uphold production SLOs and SLAs.

Requirements

  • Must be a U.S. person or citizen due to export control and national security regulations.
  • Strong experience with Kubernetes and container orchestration in production environments.
  • Fundamental understanding of Linux internals, TCP/IP networking, and storage subsystems.
  • Proficiency in debugging with cloud-native observability tools such as Prometheus and OpenTelemetry.
  • Ability to provide guidance on engineering design limitations and performance scaling.
  • Strong communication skills for working in a distributed team environment.

Culture & Benefits

  • Commitment to an inclusive, diverse, and innovative workplace.
  • Opportunity to work on cutting-edge autonomous technology and robotics.
  • Focus on developer experience and infrastructure reliability.
  • Collaborative environment spanning infrastructure, distributed systems, and platform engineering.
