Company hidden
Posted 2 months ago

Staff Engineer (Performance, Reliability & AI Automation)

€86,250
Work format
hybrid
Employment type
full-time
Grade
senior
English
B2
Country
Spain
Vacancy from Hirify.Global, a list of international tech companies

Job description

TL;DR

Staff Engineer (Performance, Reliability & AI Automation): Define and evolve SLIs/SLOs, improve observability, and drive performance and reliability improvements for a large-scale Ruby on Rails backend with GraphQL APIs, MySQL, Kafka, and multi-region cloud infrastructure, with an emphasis on load testing, bottleneck investigations, and AI-assisted workflows. The role focuses on standardizing service health visibility, validating systems under realistic traffic and concurrency, and designing AI tools for anomaly analysis and incident response.

Location: Barcelona, Spain (office-first, flexible hybrid with on-site several days a week)

Salary: €86,250

Company

Europe’s fastest-growing HR SaaS platform serving 15,000+ customers and 1M+ users, headquartered in Barcelona with 1,200+ employees across 7 markets.

What you will do

  • Define and evolve SLIs/SLOs for critical product journeys and standardize observability/dashboards across teams.
  • Investigate production bottlenecks across application, database, async, and system layers.
  • Drive improvements in latency, throughput, scalability, and reliability through structured load testing.
  • Analyze capacity, saturation, and behavior under peak load/growth scenarios.
  • Design AI-assisted workflows for metric/alert interpretation, anomaly analysis, and performance insights.
  • Partner with product/infrastructure teams to align on priorities and prevent regressions.

Requirements

  • Based in the Barcelona area or willing to work on-site several days a week.
  • Strong hands-on experience improving performance, scalability, and reliability in complex systems.
  • Experience with SLIs/SLOs, observability (e.g., Datadog), and production bottleneck investigations.
  • Knowledge of load testing, tail latency/throughput diagnosis, and cloud production systems.
  • Strong communication and proactive ownership mindset.

Nice to have

  • Experience with Ruby on Rails, MySQL, Kafka, GraphQL, ClickHouse.
  • Background in Performance/Reliability Engineering or large-scale operations.
  • Interest in AI/agentic workflows for engineering.

Culture & Benefits

  • Office-first flexible hybrid model for collaboration and innovation.
  • Private health insurance (Alan), Wellhub gym access, Cobee expense savings.
  • Language classes, office breakfast/fruit, Nora food discounts, free coffee/tea, pet-friendly.
  • Multicultural English-speaking environment with values of ownership, learning, partnership, and growth.
