Senior Database Reliability Engineer
Job description
TL;DR
Senior Database Reliability Engineer (AWS/SQL/Terraform): Own database infrastructure across AWS environments by provisioning, scaling, and operating SQL Server and RDS databases, with an emphasis on Infrastructure-as-Code, reliability engineering, and observability. Focus on designing runbooks and SLOs, automating provisioning and migrations via CI/CD, and improving performance through query tuning and connection pooling.
Location: London, UK
Company
The company provides asset servicing and operational solutions for investment management clients.
What you will do
- Provision and manage AWS RDS instances entirely through Terraform (parameter/subnet groups, IAM auth, secrets rotation, multi-AZ, read replicas).
- Own database reliability by designing runbooks, defining SLOs, and setting up alerting for slow queries, connection pool saturation, replication lag, and disk growth.
- Automate database operations including schema migrations, backup validation, failover drills, and patching via CI/CD pipelines.
- Improve performance with development teams using EXPLAIN ANALYZE, query tuning, indexing strategies, and connection pooling (PgBouncer/RDS Proxy).
- Secure the data layer with encryption at rest/in transit, IAM database authentication, credential rotation via AWS Secrets Manager, and least-privilege access.
- Participate in on-call rotation to respond to incidents, drive RCAs, and implement permanent fixes.
Requirements
- Strong SQL skills including complex queries, performance tuning, indexing, and EXPLAIN plans.
- Hands-on AWS experience with RDS/Aurora, Secrets Manager, IAM, CloudWatch, VPC networking, and KMS encryption.
- Terraform experience including writing/maintaining modules, state management, remote backends, and version pinning.
- Scripting ability in Python or Bash for automation, operational tooling, and migration scripts.
- CI/CD experience (e.g., GitLab CI or equivalent) for infrastructure changes and database migrations.
- Observability experience with Datadog, CloudWatch, or Prometheus/Grafana for database metrics and alerting; comfortable with Linux production troubleshooting.
Nice to have
- Deep PostgreSQL knowledge (vacuuming/autovacuum tuning, replication, pg_stat views, extensions).
- Experience with PgBouncer or RDS Proxy for connection pooling configuration and tuning.
- Kubernetes familiarity (workload connectivity patterns, sidecars, secrets injection).
- Experience with zero-downtime database migrations (Flyway, Liquibase, or custom approaches).
- Experience in financial services or regulated environments (audit logging, data residency, PCI DSS/SOC 2 controls).
- Experience with AWS Aurora Global Database (cross-region replication and DR patterns).
Culture & Benefits
- Permanent full-time role with on-call responsibility for database incidents.
- Platform approach: build reusable Terraform modules so application teams can self-serve.
- Work in a regulated, client-focused environment supporting large-scale investment operations.
- Emphasis on innovation and reliability engineering rather than manual database administration.
Hiring process
- Applications are evaluated after submission, before the interview stage.
- Only candidates proceeding to interviews are contacted.