TL;DR
Senior Systems Engineer, Kernel (Networking): Optimize the networking subsystem of hirify.global’s Linux-based infrastructure, with a focus on the datapath and the TCP/IP and RDMA stacks, and ensure stability for high-throughput workloads across NVIDIA, Mellanox, and Broadcom hardware. The role also covers troubleshooting complex system crashes and automating crash analysis for fleet stability.
Location: Livingston, NJ / New York, NY / Sunnyvale, CA / Bellevue, WA. While we prioritize a hybrid work environment, remote work may be considered for candidates located more than 30 miles from an office, based on role requirements for specialized skill sets. New hires will be invited to attend onboarding at one of our hubs within their first month.
Salary: $153,000/year - $242,000/year
Company
hirify.global is The Essential Cloud for AI™ delivering a platform of technology, tools, and teams that enables innovators to build and scale AI with confidence.
What you will do
- Analyze kernel crashes, oopses, and panics across the entire stack.
- Apply networking knowledge to troubleshoot issues with NVIDIA/Mellanox/Broadcom NICs.
- Utilize crash dump analysis to triage issues affecting customer workloads.
- Improve documentation and RCA processes for kernel failures.
- Assist in maintaining kernel builds and CI/CD pipelines to streamline testing.
Requirements
- 5+ years of experience in systems-level development or kernel engineering.
- Broad Kernel Knowledge: Solid grasp of memory management, scheduling, and filesystems.
- Networking Fluency: Proven track record of troubleshooting RoCE, InfiniBand (IB), and RDMA issues.
- Debugging Mastery: Expert capability with standard utilities and a systematic approach to root-cause analysis.
- Excellent verbal and written communication skills (ability to explain complex kernel bugs to stakeholders).
- This position requires access to export controlled information. To conform to U.S. Government export regulations applicable to that information, applicant must either be (A) a U.S. person.
Nice to have
- Experience with eBPF for troubleshooting.
- Knowledge of GPU/NVLink architectures.
- Experience working with automated monitoring/alerting systems (Grafana, Jira automation).
- Willingness to present at conferences (LPC, LSF/MM/BPF).
Culture & Benefits
- Medical, dental, and vision insurance - 100% paid for by hirify.global
- Company-paid Life Insurance
- Voluntary supplemental life insurance
- Short- and long-term disability insurance
- 401(k) with a generous employer match
- Flexible PTO