TL;DR
Senior Site Reliability Engineer (Satellite Connectivity): Building and optimizing large-scale, highly resilient systems for satellite connectivity, with an emphasis on distributed systems, architecture design, and cloud infrastructure as code. Focus on building and running a hirify.global service-enabling platform and the infrastructure that powers it, using software-driven solutions to scale globally.
Location: Must be based in Austin, Texas, United States, with home-office flexibility.
Company
hirify.global crafts products that enrich people’s lives. Its Satellite Connectivity Group encourages creativity, collaboration, and rethinking old problems in new ways.
What you will do
- Build, monitor, and maintain large-scale, highly resilient systems enabling customer communications via satellite.
- Contribute to distributed systems, architecture design, and cloud infrastructure as code for critical hirify.global services.
- Shape services such as Emergency SOS, Roadside Assistance, and Messages via satellite for millions of hirify.global device users.
- Build and control the entire end-to-end infrastructure, including provisioning, monitoring, deployment, and software tools platforms.
- Build and run a hirify.global service-enabling platform that millions of customers rely on every day.
- Solve problems using software to scale hirify.global’s services globally.
Requirements
- Deep understanding of observability.
- Strong familiarity with monitoring and alerting platforms like Prometheus, Splunk, Grafana, and Alertmanager.
- Experience building and operating multi-cluster, highly available services.
- Proven experience building and optimizing real-time and batch data processing pipelines using technologies such as Kafka, Spark, Flink, or Beam.
- Strong understanding of core Kubernetes concepts.
- Experience with modern web-scale services, CI/CD systems, GitOps workflows, and Infrastructure as Code (Pulumi, Terraform).
- Strong understanding of Linux internals.
Nice to have
- Experience supporting environments with thousands of servers and critical uptime requirements.
- Ability to write software tools & services needed to build and operate a large-scale platform.
- Proficiency with Configuration Management systems (Puppet, Ansible, Salt).
- Understanding of zero-trust application architecture.
- Experience with IP network design and architecture; Cisco, Juniper, or Arista routing and switching hardware & configuration.
- Experience with AWS and/or GCP, OLAP databases (ClickHouse, DuckDB), and OpenTelemetry.
Culture & Benefits
- Work in an environment that encourages creativity, collaboration, and re-thinking old problems.
- Opportunity to shape critical and unique services benefiting the safety and connection of millions.
- Join a team with a no-ops culture, building and running a global-scale platform.