This vacancy is archived

Company hidden

4 days ago

Site Reliability Engineer (Kubernetes)

Work format

remote

Employment type

full-time

Grade

middle

English

B2

Job description

TL;DR

Site Reliability Engineer (Kubernetes): Improve the availability, performance, and scalability of large-scale, multi-cloud SaaS environments, with an emphasis on automation, observability, and incident response. The role focuses on designing backend services and production engineering tools while integrating AI-assisted workflows to improve operational efficiency.

Company

hirify.global is a software company providing a platform to manage, accelerate, and secure software delivery from code to production.

What you will do

Support the reliability, performance, and scalability of large-scale, multi-cloud, Kubernetes-based SaaS environments.
Investigate and troubleshoot production issues across distributed systems and infrastructure in collaboration with Engineering teams.
Design and develop backend services, internal platforms, and production engineering tools using Python or Go.
Improve observability and operational readiness through SRE practices, monitoring, and capacity planning.
Evaluate and contribute to AI-assisted automation solutions to improve troubleshooting and production workflows.
Participate in on-call rotations and lead incident response to ensure system stability.

Requirements

2-4 years of experience in SRE, Production Engineering, or DevOps roles.
Hands-on experience with Kubernetes-based containerized workloads.
Experience with at least one public cloud provider: AWS, GCP, or Azure.
Proficiency in developing backend services or automation tools using Python, Go, or similar languages.
Strong understanding of Linux fundamentals, networking, and production troubleshooting.
Familiarity with CI/CD tools and observability platforms like Prometheus or Grafana.

Nice to have

Experience using AI-assisted operational workflows for log analysis or incident triage.
Familiarity with agentic automation frameworks such as LangGraph or LangChain.
Experience with AI-assisted development tools like GitHub Copilot or Cursor.

Culture & Benefits

Opportunity to work on a mission-critical platform used by the majority of the Fortune 100.
Collaborative, impact-driven environment centered on modern SRE practices.
Continuous learning culture with exposure to cutting-edge technologies and AI integration.
