TL;DR
Senior Site Reliability Engineer (Web3): Define and implement operational excellence for a rapidly growing sports company, with an emphasis on incident response, observability, and SLOs. The role also covers infrastructure preparation and capacity management for high-traffic events, and improving developer experience.
Location: United States/Remote
Salary: $160,000 to $240,000
Company
hirify.global makes sports more fun by building multiple games and products across fantasy sports, sports betting, and prediction markets.
What you will do
- Own and maintain the incident response process, defining procedures, tools, and best practices.
- Guide teams in establishing and monitoring Service Level Objectives (SLOs), including setting up alerts and reporting systems.
- Lead capacity planning initiatives, focusing on both short- and long-term scalability while optimizing costs.
- Develop and implement disaster recovery plans, including regular testing and regulatory compliance.
- Manage launch and event planning for high-traffic occasions, focusing on infrastructure preparation and capacity management.
- Act as an internal expert and consultant for monitoring tools like Datadog and PagerDuty and infrastructure like AWS and Kubernetes.
Requirements
- Strong written and verbal communicator and collaborative by nature.
- Comfortable working with an IDE, multiple languages, multiple web application frameworks, AWS services, Kubernetes, and PostgreSQL.
- Ability to work independently to learn new languages/technologies as needed.
- Enjoy deploying changes to production quickly, multiple times a week if necessary.
- Enjoy using research, data, and experiments to make decisions.
- Enjoy working directly with customers (generally engineers or other people inside the company).
Nice to have
- Experience with PostgreSQL query optimization, autovacuum tuning, table statistics, different index types, etc.
- Experience with Redis/Valkey optimization.
- Experience with Datadog or similar observability tools.
- Experience working as a web application developer, frontend or backend, especially in React and Ruby on Rails.
- Experience with AWS cost optimization.
Culture & Benefits
- Unlimited PTO.
- 16 weeks of fully paid parental leave.
- Home office stipend.
- A connected, virtual-first culture with a highly engaged distributed workforce.
- 5% 401(k) match, FSA, and company-paid health, dental, and vision plan options for employees and dependents.
- In-person gatherings 2-3 times per year for team and company offsites, trainings, and more.