Staff Software Engineer (AI)
Job description
TL;DR
Staff Software Engineer (Django/Python): Own and evolve the core platform powering AI employees, with an emphasis on backend systems, distributed task infrastructure, event-driven architecture, and Kubernetes deployments. The role centers on optimizing database queries, scaling async workflows, improving observability, and driving architectural decisions for reliability and scalability.
Location: Remote (overlap with Americas timezones for collaboration; reliable high-speed internet)
Company
Supernal helps small-to-medium businesses hire their first AI employee. Its proprietary platform uses intelligent agentic workflows to deliver value-generating AI teammates for real business processes.
What you will do
- Drive platform architecture decisions, align the team on scalable patterns, and ensure long-term maintainability
- Review code, design docs, and proposals for scalability, reliability, security, and operability
- Mentor engineers, unblock issues, raise the production-readiness bar, and establish best practices
- Evolve Django/DRF/ASGI backend for performance and correctness
- Scale async execution with Celery, Dramatiq, and Temporal; implement resilient workflows
- Optimize PostgreSQL/pgvector and caching; maintain Kubernetes (GKE, Helm), CI/CD, and autoscaling
- Own RabbitMQ, Redis, PostgreSQL reliability; lead incident response and post-mortems
- Extend OpenTelemetry, Datadog instrumentation, dashboards, alerts, SLOs; profile bottlenecks
Requirements
- 10+ years building and operating production backend systems at scale
- Deep expertise in Python (Django preferred) and relational databases (PostgreSQL)
- Hands-on with Kubernetes, Helm, cloud infrastructure (GCP preferred)
- Strong background in distributed systems: message queues, event sourcing, workflow orchestration
- Production experience with async task systems (Celery, Dramatiq or similar)
- Track record debugging complex production issues across services
- Ability to work autonomously and drive initiatives
- Clear technical communication to explain tradeoffs and build consensus
Nice to have
- Experience with Temporal or similar workflow engines
- Background in LLM infrastructure, RAG systems, or AI/ML platforms
- Familiarity with OpenTelemetry and Datadog observability stacks
- Experience with KEDA or Kubernetes autoscaling
- Contributions to multi-tenant SaaS platform architecture
- History improving developer experience and platform abstractions