Site Reliability Engineer (SRE)

55 000 - 68 000€

Work format

onsite

Employment type

full-time

Grade

senior

English

Country

US/Spain/Portugal

Relocation

Portugal

This vacancy is from Hirify Global, a list of international tech companies

Job description

TL;DR

Site Reliability Engineer (SRE): Design and implement scalable, reliable systems across cloud environments, with an emphasis on observability, automation, and performance optimization. Focus on building fault-tolerant infrastructure, conducting root cause analysis, and improving CI/CD pipelines to keep availability high under production load.

Fully on-site in the Lisbon office. The company is open to supporting relocation.

€55K – €68K

Company

Family-founded, self-funded company building an AI-powered Personal & Entrepreneurial Resource Planner. It has offices in Lisbon and San Francisco and over 100 million downloads worldwide.

What you will do

Design scalable, reliable, fault-tolerant systems in cloud environments.
Develop observability tooling (monitoring, logging, and alerting) with Prometheus, Grafana, Datadog, and ELK.
Automate infrastructure, deployments, and incident response using IaC tools like Terraform or CloudFormation.
Optimize system performance, scalability, and incident workflows to improve uptime.
Collaborate with dev and DevOps teams on reliability-focused system design.
Conduct root cause analysis and implement preventative measures.
Maintain load balancing, failover, and disaster recovery, and optimize cloud costs on AWS, Azure, and GCP.
Participate in on-call rotations.

Requirements

Approximately 4 or more years of experience in SRE, DevOps, or systems engineering.
Strong knowledge of cloud platforms (AWS, Azure, GCP) and cloud-native architectures.
Experience with observability tools (Prometheus, Grafana, ELK, Datadog, New Relic).
Proficiency in IaC (Terraform, CloudFormation, Pulumi).
Hands-on experience with containerization and orchestration (Docker, Kubernetes, Helm).
Strong Linux administration, networking, and scripting skills (Bash, Python, Go).
Experience with incident management, debugging, and root cause analysis.
Knowledge of load balancing, failover, distributed systems, and security best practices.
Strong communication skills for cross-functional collaboration.

Culture & Benefits

Apple hardware ecosystem.
Annual bonus, top-tier health and life insurance, pension fund.
Transportation budget, Coverflex for meals/well-being, free meals at hub.
Childcare support, Urban Sports Club membership.
Air Conference for team collaboration.