Intermediate Site Reliability Engineer
Job description
TL;DR
Intermediate Site Reliability Engineer (Node.js/AWS): Provide first-line operational support for a modern mobile point-of-sale platform, with an emphasis on incident response, cloud troubleshooting, and system reliability. The role focuses on monitoring systems, automating workflows, and resolving production incidents in high-pressure situations.
Location: Remote, based in Chile
Company
A global company that has been building software since 2011, with a 99.9% remote team that values fairness, high standards, openness, and inclusivity.
What you will do
- Provide first-line operational support, monitor systems, and resolve production incidents
- Troubleshoot cloud systems and integrations, applying corrective actions
- Manage escalations, collaborate on bug fixes and hotfixes
- Administer MDM solutions and support remote software deployments
- Implement automated monitoring and alerting to improve incident response
- Document processes, maintain knowledge bases, and create incident runbooks
- Participate in on-call rotation for 24/7 critical incident coverage
- Contribute to post-incident reviews and build Node/TypeScript utilities for automation
Requirements
- Upper-Intermediate+ English level
- Bachelor’s degree in Computer Science, Engineering, or related field
- 3+ years supporting production systems, focused on incident response and resolution
- Strong experience in operational support or SRE roles in cloud environments
- Proficiency in Node.js, including debugging, error handling, and performance troubleshooting
- Experience with AWS, Azure, or GCP, including monitoring and troubleshooting cloud-native applications
- Experience working with APIs and integrations
- Familiarity with logging and monitoring tools (Winston, Bunyan, Datadog, ELK Stack, CloudWatch)
- Experience with CI/CD pipelines and automated deployments (Jenkins, GitLab CI, AWS CodePipeline)
Nice to have
- Experience with containerization (Docker, Kubernetes)
- Knowledge of REST APIs, WebSockets, and microservices architecture
- Familiarity with incident management frameworks (ITIL, SRE practices)
- Understanding of cloud security best practices
- Experience with mobile POS platforms or mobile application environments
- Familiarity with mobile device management (MDM) solutions
Culture & Benefits
- 99.9% remote work from anywhere in the world
- 30 paid days off per year, 5 paid sick days, up to 60 days medical leave, 6 days for family events
- Partially covered health insurance after probation, wellness bonus for gym/sports after 6 months
- Payment in U.S. dollars, overtime coverage
- English lessons, university programs, online activities, and team-building events
Hiring process
- Submit CV in English
- Intro call with Recruiter
- Internal interview
- Client interview
- Offer