Staff Site Reliability Engineer (AI)
Job description
TL;DR
Staff Site Reliability Engineer (AWS/Kubernetes): Build and scale an AI platform with an emphasis on the reliability, observability, and automation of cloud-native infrastructure. The role focuses on managing EKS clusters, implementing an Istio service mesh, and optimizing GitOps workflows to keep AI workloads and microservices stable.
Location: Fully remote, must be based in Colombia
Company
The company is a global Fintech SaaS leader that has specialized in audit and accounting software for over 30 years.
What you will do
- Maintain high-performing AWS production systems and manage EKS clusters for stability and scaling.
- Implement and support Istio service mesh for traffic control and security.
- Oversee GitOps workflows using Flux CD to ensure secure and consistent infrastructure changes.
- Design and manage monitoring, logging, and tracing for AI workloads, microservices, and data pipelines.
- Develop automation tools and platform enhancements, and support Nx-based monorepos.
- Participate in on-call rotations and perform root cause analysis for incidents.
Requirements
- Deep expertise in AWS (EKS, EC2, IAM, networking, load balancing).
- Professional experience with Kubernetes, including RBAC and workload autoscaling.
- Hands-on experience with Istio service mesh and GitOps (Flux CD).
- Proficiency with GitHub Actions, CI/CD workflows, and Nx monorepos.
- Knowledge of IaC tools such as Terraform and CDK, and of production readiness best practices.
- Strong English language communication and collaboration skills.
Culture & Benefits
- Indefinite term contract with full legal benefits and competitive compensation.
- Comprehensive health coverage including prepaid medicine and life insurance.
- Home office support with internet allowance and home office stipend.
- Focus on work-life balance with flexible remote options and 5 personal days off.
- Professional growth opportunities via mentorship and a dedicated training budget.
- Recognition programs and tenure-based vacation upgrades.