Company hidden
21 hours ago

Lead Site Reliability Engineer (AWS/Kubernetes)

$170,000 - $200,000
Work format
remote (USA only)
Employment type
fulltime
Grade
lead
English
b2
Country
US
Vacancy from Hirify Global, a list of international tech companies

Job description

TL;DR

Lead Site Reliability Engineer (SRE): Guide architecture, reliability, and operational excellence for the infrastructure powering a secure, mission-critical collaboration platform, with an emphasis on scalability, observability, performance, and automation across cloud and hybrid environments. Focus on designing containerized workloads, establishing monitoring frameworks, driving incident management, ensuring regulatory compliance, and mentoring SRE team members.

United States. Remote-first. For candidates residing in the U.S.: must be U.S. citizens eligible to obtain and maintain a government security clearance. Must comply with U.S. export control regulations (EAR, ITAR).

Posting range: $170,000 - $200,000 USD

Company

A leading collaborative workflow platform for defense, intelligence, security, and critical infrastructure, trusted by the U.S. Department of War and Fortune 500 companies, running on-premises and in private clouds.

What you will do

  • Define the strategy, architecture, and roadmap for the SRE function, aligning them with product and business goals.
  • Design, deploy, and optimize containerized workloads, infrastructure-as-code, and compliant cloud environments (e.g., FedRAMP, DoD).
  • Establish observability, monitoring, and alerting frameworks for performance, reliability, and capacity planning.
  • Drive incident management, on-call rotations, root cause analysis, and reliability improvements.
  • Partner with security and compliance teams on data sovereignty and regulatory requirements; manage cloud costs and capacity.
  • Build a developer platform for secure software delivery; mentor and coach the SRE team.

Requirements

  • BS in Computer Science, Cybersecurity, Software Engineering, or equivalent, plus 5+ years of experience in SRE, DevOps, or cloud infrastructure.
  • Expertise in container orchestration (Kubernetes), infrastructure-as-code (Terraform), and cloud platforms (AWS).
  • Experience designing monitoring and alerting, optimizing performance, and troubleshooting distributed systems.
  • Proficiency in scripting/programming for automation; excellent communication and cross-functional influence.
  • Experience leading globally distributed teams in a remote-first environment. U.S. applicants: must be U.S. citizens eligible for security clearance and must meet export control requirements (EAR/ITAR).

Nice to have

  • Familiarity with Grafana and Prometheus; experience with high-availability and disaster recovery architectures.
  • Exposure to GCP and Azure; leadership experience in regulated industries (defense, finance).
  • Knowledge of U.S. federal compliance (FedRAMP, DoD ATO, NIST); AWS Marketplace experience.
  • Open-source contributions; certifications (CKA, CKAD, AWS Solutions Architect).

Culture & Benefits

  • Remote-first, open-source company with globally distributed teams.
  • Market-based pay reflecting skills, experience, location, and market conditions.
  • EEO employer committed to diversity; accommodations for interviews.
  • Expanding hiring to more countries while ensuring local compliance.
