This vacancy is archived (updated 1 month ago)
Principal Site Reliability Engineer (AI)
Job description
TL;DR
Principal Site Reliability Engineer (AI): Lead infrastructure strategy for an AI-driven SaaS platform serving finance teams, with an emphasis on scaling, optimizing, and securing cloud-based systems. The role focuses on shaping reliability and performance using advanced cloud technologies, automation tools, and AI-driven solutions.
Location: Remote within Latin America (Argentina, Brazil, Colombia, Mexico)
Company
is partnering with a fast-growing AI-driven SaaS platform that automates critical workflows for finance and accounting teams in high-growth businesses.
What you will do
- Define and lead infrastructure and reliability strategy across the platform.
- Design scalable, resilient systems in collaboration with engineering teams.
- Optimize build, testing, and deployment processes for speed and stability.
- Establish and uphold best practices for CI/CD, monitoring, and observability.
- Lead incident response and drive continuous improvement post-incident.
- Automate workflows to reduce operational toil and risk.
- Mentor engineers and foster a culture of operational excellence.
- Make strategic build-vs-buy decisions balancing speed, quality, and sustainability.
Requirements
- 8+ years of experience in Site Reliability Engineering or DevOps roles, including 2+ years in a Principal or Lead position.
- Proven experience in infrastructure modernization and scaling initiatives for high-growth environments.
- Strong proficiency in Python.
- Deep expertise in cloud platforms and container orchestration tools such as AWS ECS and EKS.
- Solid experience in CI/CD pipeline design and optimization using tools like GitHub Actions and Buildkite.
- Proficiency in infrastructure-as-code tools such as Terraform.
- Strong knowledge of monitoring, observability, and performance optimization practices.
- Upper-Intermediate level of spoken and written English.
Nice to have
- Experience with monorepos (Turborepo, pnpm).
- Familiarity with modern TypeScript tools (swc, biome, oxc).
- Knowledge of NestJS, NextJS, and testing frameworks (Jest, Vitest).
Culture & Benefits
- Diversity of domains and technology.
- Health and legal support.
- Active professional community and continuous education.
- Flexible schedule and remote work.
- Outstanding offices (if you choose to work on-site).
- Sports and community activities.