Posted 2 days ago
Site Reliability Engineer (SRE)
Job description
TL;DR
Site Reliability Engineer (SRE): Maintain the stability and reliability of the production environment, with an emphasis on monitoring, alerting, and observability. The role focuses on developing and improving monitoring systems, analyzing incidents, and optimizing system performance and fault tolerance.
Location: On-site in Limassol, Cyprus or remote
Company
A software company that uses GCP and has a team in Cyprus.
What you will do
- Ensure the stability of production and development infrastructure
- Develop and improve monitoring, alerting, and observability (metrics, logs, tracing)
- Configure and optimize metrics and logging systems
- Analyze incidents and prevent recurrence
- Work with alerts and improve their quality
- Increase service reliability and fault tolerance
- Optimize system performance and stability
Requirements
- Strong understanding of Linux
- Experience as an SRE, DevOps engineer, or systems engineer
- Solid experience with monitoring and alerting tools (Prometheus, Grafana or similar)
- Understanding of observability (metrics, logs, tracing)
- Experience with Kubernetes and containerization
- Experience in incident analysis and production troubleshooting
- Automation skills (Bash, Python)
- Understanding of networking, performance, and fault tolerance
Culture & Benefits
- Work remotely or from the office in Limassol
- Compensation for English or Greek classes
- Health insurance (for Cyprus)
- Office lunches (for Cyprus)
- Flexible start of the working day