Мэтч & Сопровод
Для мэтча с этой вакансией нужен Plus
Описание вакансии
TL;DR
Staff Engineer (ML Infrastructure): Designing and evolving the infrastructure that powers Stripe's ML-driven products with an accent on scalability, low latency, and robust system design. Focus on building next-generation ML training and serving systems, LLM orchestration, and high-performance feature stores.
Location: Toronto
Salary: CA$208,000 - CA$312,000
Company
Stripe is a financial infrastructure platform for businesses, providing tools to accept payments and grow revenue globally.
What you will do
- Lead end-to-end architecture and system design for complex projects across the ML Platform.
- Define technical direction for ML workflow orchestration and scalable CPU/GPU compute infrastructure.
- Design systems for LLM fine-tuning, low-latency model inference, and large-scale feature stores.
- Collaborate cross-functionally with data scientists, product teams, and senior leadership to increase ML impact.
- Drive company-wide initiatives to improve MLOps maturity and ML development velocity.
- Mentor and grow other engineers, serving as a role model for software system design and operation.
Requirements
- 10+ years of professional software development experience with a background in service-oriented architecture and large-scale distributed systems.
- Proven track record as a technical lead, managing multi-team initiatives and mentoring members.
- Experience building and operating production ML platforms (training, serving, orchestration, or data systems).
- Strong communication skills and ability to explain complex concepts to diverse stakeholders.
- Ability to operate with high autonomy in ambiguous environments.
- Hands-on experience using AI tools to accelerate professional workflows.
Nice to have
- Experience with distributed ML training, model registries, and experiment tracking.
- Familiarity with LLMs, RAG, and agentic AI patterns (e.g., multi-agent orchestration).
- Experience with cloud services like AWS, SageMaker, Bedrock, Databricks, or OpenAI.
- History of training and shipping ML models to production to solve critical business problems.
Culture & Benefits
- Opportunity to work on a platform processing over $1.9T in annual payment volume.
- High level of autonomy and responsibility in shaping the global economy's infrastructure.
- Collaborative environment working with geographically distributed teams.
- Focus on technical excellence and creative problem-solving.
Будьте осторожны: если работодатель просит войти в их систему, используя iCloud/Google, прислать код/пароль, запустить код/ПО, не делайте этого - это мошенники. Обязательно жмите "Пожаловаться" или пишите в поддержку. Подробнее в гайде →