TL;DR
Senior Site Reliability Engineer: Provisioning and maintaining cloud infrastructure on Google Kubernetes Engine (GKE) for a high-throughput platform, with an emphasis on automation, reliability, and cost-efficiency. Focus on designing resilient systems, debugging production issues, and proactive capacity planning.
Location: Onsite in Herzliya, IL
Company
Pendo is a fast-growing startup providing a product experience platform to help product managers understand and drive product success.
What you will do
- Automate provisioning, deployment, scaling, and monitoring of Pendo’s infrastructure using infrastructure-as-code.
- Write maintainable code for product functionality with an emphasis on operations, scale, resiliency, and monitoring.
- Ensure new services are well-designed with defined SLIs/SLOs and proper monitoring.
- Debug and mitigate production issues, and find ways to prevent them from recurring.
- Proactively track capacity, quotas, and other performance limits to plan for growth.
- Participate in a 24x7 on-call rotation to handle product availability issues and urgent customer support escalations.
Requirements
- Experience working with cloud infrastructure using tools such as Ansible or Terraform.
- Programming skills in a language such as Go or Python, and a willingness to learn new languages as needed.
- Ability to think and talk about systems in terms of possible failure modes, bottlenecks, etc.
- Ability to write clear and concise English-language documentation, such as incident runbooks and release processes.
- Good number sense for discussing performance analysis, cost analysis, and operational metrics.
- Experience designing, analyzing, and troubleshooting distributed systems.
- Experience maintaining Kubernetes clusters in a production environment.
Culture & Benefits
- Join one of the fastest-growing startups, supported by best-in-class institutions.
- Gain experience in a diverse and exciting set of technologies and clients, and have a real impact.
- Passionate, dynamic, and fun culture.
- Commitment to diversity, equity, and inclusion.
- Access and reasonable accommodation for applicants with mental and/or physical disabilities.