TL;DR
Site Reliability Engineer (DevOps): Analyzing metrics and logs from operating systems and applications to assist in performance tuning and fault-finding with an accent on improving observability and reliability of systems. Focus on running blameless postmortems, completing root cause analysis investigations, and contributing to automation solutions.
Location: Remote
Company
hirify.global provides users with rich functionality, high availability, and superior performance to help them achieve their missions.
What you will do
- Analyze metrics and logs from operating systems and applications to assist in performance tuning and fault-finding.
- Provide correlations between layers, highlight hidden parts, and improve observability and reliability of systems.
- Escalate issues to product/platform teams, initiate and facilitate firefighting processes, and provide issue summaries.
- Run blameless postmortems and complete root cause analysis investigations.
- Contribute to handbooks/run-books and general documentation.
- Participate in knowledge sharing, mentoring, and providing training material and workshops.
Requirements
- Advanced Linux administration experience.
- Basic programming experience in Golang, Python, C++, or Java.
- Solid understanding of TCP/IP network fundamentals and experience in troubleshooting network issues.
- Experience using Gitlab CI/CD or Terraform (Iac).
- Experience with Jaeger, Sentry, DataDog, NewRelic, Grafana, and Prometheus.
- Fluency in English.
Nice to have
- Experience working with globally distributed systems or infrastructure.
- Experience with Kubernetes, Docker, Postgres, and Kafka.
- Experience with Rancher, Rancher2, AWS EKS, Alicloud ACS, and Rancher RKE.
Culture & Benefits
- Competitive and attractive compensation.
- Extensive learning opportunities, such as professional training and certifications, soft skills development, free English courses, and trading workshops.
- Health and life insurance for employees, spouses, and children, including vaccinations, tests, mental health care, and coverage for vision and dental care.
- Allowance for sports club memberships or other physical exercise activities.
- Reimbursement for a work laptop, home office equipment, and coworking memberships.
- Generous time off, including 21 days of annual leave and paid sick leave.
Hiring process
- Interview with a Talent Acquisition Specialist (30 minutes).
- Short online English test.
- Technical interview (1 hour).
- Behavioral interview (1 hour).
Будьте осторожны: если работодатель просит войти в их систему, используя iCloud/Google, прислать код/пароль, запустить код/ПО, не делайте этого - это мошенники. Обязательно жмите "Пожаловаться" или пишите в поддержку. Подробнее в гайде →