TL;DR
Site Reliability Engineering Tech Lead (AI/Data): Design and implement robust, scalable infrastructure solutions for an AI & Data Context Platform, with an emphasis on multi-cloud deployment strategies and distributed system integrations. The role focuses on architecting monitoring, observability, and alerting systems and on driving best practices for infrastructure as code and deployment automation.
Location: Remote within the United States (due to US-specific benefits and employee programs).
Company
hirify.global is an AI & Data Context Platform providing fully managed SaaS solutions with AI-powered discovery, observability, and governance capabilities to over 3,000 enterprises.
What you will do
- Design and implement robust, scalable infrastructure solutions for hirify.global Cloud and enterprise deployments.
- Lead the technical vision for multi-cloud deployment strategies and distributed system integrations.
- Partner with product and engineering teams to influence advanced deployment and management capabilities.
- Establish and maintain SLAs/SLOs, lead incident response, and implement chaos engineering practices.
- Mentor and guide SRE engineers and collaborate with cross-functional teams to ensure reliable product delivery.
- Optimize system performance, capacity planning, and cost efficiency across diverse environments.
Requirements
- 8+ years of experience in Site Reliability Engineering, Platform Engineering, or DevOps roles.
- 3+ years of technical leadership experience managing engineering teams.
- Strong expertise with cloud platforms (AWS, GCP, Azure) and infrastructure automation tools.
- Proficiency in containerization technologies (Docker, Kubernetes) and orchestration.
- Experience with infrastructure as code tools (Terraform, CloudFormation, Pulumi) and CI/CD pipelines.
- Strong programming skills in Python, Java, or similar languages.
- Deep understanding of monitoring and observability tools (Prometheus, Grafana, Datadog).
- English: B2 required.
- Work authorization for the USA required.
Nice to have
- Experience building and operating multi-tenant SaaS platforms.
- Knowledge of data infrastructure and metadata management systems.
- Experience with service mesh technologies and microservices architectures.
Culture & Benefits
- Competitive compensation with equity for every team member.
- Flexible remote work policy with a monthly coworking stipend.
- Comprehensive health coverage (medical, dental, vision) covering 99% for employees.
- Flexible spending accounts (FSA, Dependent Care FSA).
- Inclusive fertility benefits and family-forming support through Carrot Fertility for U.S. employees.
- Unlimited PTO and sick leave policy for flexibility and rest.