TL;DR
Senior Site Reliability Engineer (Application Software, Data): Evolving and scaling mission-critical distributed systems for launch vehicle production and Starlink network growth, with an emphasis on sharding, geo-redundancy, and petabyte-scale bare metal clusters. Focus on advancing deployment, monitoring, and alerting infrastructure and resolving performance bottlenecks.
Location: Onsite in Hawthorne, CA, USA. Applicant must be a U.S. citizen or national, U.S. lawful permanent resident (aka green card holder), Refugee under 8 U.S.C. § 1157, or Asylee under 8 U.S.C. § 1158, or be eligible to obtain the required authorizations from the U.S. Department of State to conform to U.S. Government export regulations (ITAR).
Salary: $160,000.00 - $220,000.00 per year
Company
hirify.global is actively developing technologies to make humanity a multi-planetary species and explore the stars.
What you will do
- Upgrade existing distributed systems to become sharded and geo-redundant in multiple data centers.
- Advance existing deployment, monitoring, and alerting infrastructure for a multi-region environment.
- Manage petabyte-scale bare metal compute clusters.
- Closely collaborate with engineers across all programs to create highly operable, scalable, and maintainable products.
- Engage throughout the whole software development lifecycle of services.
- Focus on performance bottlenecks and performance improvement techniques.
Requirements
- Bachelor's degree in computer science, engineering, math, or a scientific discipline and 5 years of software development experience; OR 7+ years of professional experience building software in site reliability or DevOps in lieu of a degree.
- Experience with Linux operating systems.
- Applicant must be a U.S. citizen or national, U.S. lawful permanent resident (aka green card holder), Refugee, or Asylee to conform to U.S. Government export regulations (ITAR).
Nice to have
- 5+ years of rigorous experience with site reliability or DevOps.
- Experience with Kubernetes and Istio for on-premise deployment.
- Experience with in-stream data processing and analytics using open source platforms such as Apache Kafka, Spark, HBase, HDFS, and Flink.
- Experience troubleshooting hardware and network-layer issues.
- Programming experience in Python, C#, Java, Scala, Go, or similar languages.
- Good understanding of version control, testing, continuous integration, build, deployment and monitoring.
Culture & Benefits
- Full ownership of challenging problems, working on a truly inspiring mission.
- Comprehensive medical, vision, and dental coverage.
- Access to a 401(k) retirement plan.
- Eligibility for long-term incentives in the form of company stock, stock options, or long-term cash awards.
- Ability to purchase additional stock at a discount through an Employee Stock Purchase Plan.
- Paid parental leave, short & long-term disability insurance, life insurance.
- 3 weeks of paid vacation and 10 or more paid holidays per year.