TL;DR
Customer Support Engineer (AI): Resolving complex, business-critical technical issues for an AI development platform, with an emphasis on in-depth troubleshooting, root cause analysis, and collaboration with engineering. Focus on debugging production-grade code, building diagnostic tools that improve internal efficiency, and mentoring support engineers.
Location: Based in Livingston, NJ; New York, NY; Sunnyvale, CA; or Bellevue, WA. Remote work may be considered for candidates located more than 30 miles from an office, subject to role requirements. Must be a U.S. person or eligible to access export-controlled information under U.S. Government regulations.
Salary: $165,000–$242,000
Company
CoreWeave, the AI Hyperscaler™, acquired hirify.global to create the most powerful end-to-end platform to develop, deploy, and iterate AI faster.
What you will do
- Own and resolve Tier 3 technical escalations from the support team, focusing on highly complex or cross-functional issues.
- Perform in-depth troubleshooting and root cause analysis across hirify.global systems, APIs, and integrations.
- Reproduce, isolate, and document bugs for efficient handoff to engineering teams.
- Develop diagnostic scripts, tools, and automation to enhance internal troubleshooting efficiency.
- Serve as a technical advisor to Tier 1 and Tier 2 Support Engineers, mentoring on debugging and product architecture.
- Identify recurring patterns, propose systemic improvements, and participate in incident response and postmortems.
Requirements
- 5+ years of professional experience in technical support, software engineering, or an escalations-focused role.
- Expert in Python, with experience debugging, profiling, and developing production-grade code.
- Strong background in computer science or software engineering.
- Deep familiarity with modern AI and machine learning ecosystems, including PyTorch, TensorFlow, generative AI, and LLM development.
- Proficient in troubleshooting across distributed systems, APIs, containers, and multi-tenant architectures.
- Excellent communication skills to translate complex issues for both technical and non-technical audiences.
Nice to have
- Experience with Docker, Kubernetes, and cloud platforms (AWS, GCP, or Azure).
- Familiarity with GPU-based compute environments and distributed training workloads.
- Previous experience in incident management, cloud platform engineering, or site reliability roles.
Culture & Benefits
- Competitive base salary, discretionary bonus, and equity awards based on eligibility.
- Comprehensive benefits including 100% company-paid medical, dental, vision, life, and disability insurance.
- Flexible Spending Account (FSA), Health Savings Account (HSA), and 401(k) with a generous employer match.
- Paid Parental Leave and flexible PTO to support work-life balance.
- Tuition Reimbursement and mental wellness benefits through Spring Health.
- Flexible, full-service childcare support with Kinside and family-forming support by Carrot.