AI Support Engineer (GenAI)
Job description
TL;DR
AI Support Engineer (GenAI): Monitoring, diagnosing, and resolving production incidents across AI solutions, including agentic and RAG-based systems, with an emphasis on rapid triage, root cause analysis, and system reliability. The role focuses on implementing observability, building diagnostic tools, and collaborating with engineering teams to harden production environments.
Alpharetta, Georgia, USA / Columbus, Georgia, USA. Candidates must be legally authorized to work for any employer in the United States on a full-time basis without the need for current or future immigration sponsorship.
Company
is a leading fintech company delivering payment technology and software solutions.
What you will do
- Serve as the first line of defense for production AI incidents with rapid triage, root cause analysis, and resolution.
- Monitor health and performance of deployed AI applications, agentic solutions, RAG systems, and orchestration platforms.
- Investigate issues like latency, failures, model drift, hallucinations, or broken integrations, escalating as needed.
- Collaborate with engineers to implement observability, logging, and alerting best practices.
- Build diagnostic tools, runbooks, and automated workflows to improve incident response.
- Maintain knowledge bases, contribute to postmortems, and ensure compliance with governance policies.
Requirements
- 4+ years in production support, software engineering, SRE, or DevOps, preferably with GenAI/ML systems.
- Strong understanding of cloud infrastructure (AWS, GCP) and AI observability tools (Fiddler AI, Arize AI, etc.).
- Experience with LLM/GenAI systems (OpenAI, Azure OpenAI, Bedrock, Vertex AI).
- Familiarity with orchestration frameworks (LangChain, LangGraph, AutoGen, CrewAI).
- Proficiency in Python/shell scripting; 1+ years of AI/ML engineering experience with a Generative AI focus.
- Availability for on-call rotation.
- Bachelor's degree in Computer Science, Engineering, or related field.
Nice to have
- Prompt engineering, RLHF, model evaluation techniques.
- AI governance, safety principles; reinforcement learning.
- Big data technologies (Spark, Kafka); CI/CD for AI/ML.
- Real-time data processing and streaming analytics.
Culture & Benefits
- Dynamic team passionate about learning, cutting-edge technologies, and innovation.
- Full-time position; travel required less than 2% of the time.
- Collaborative environment across AI engineering, platform, and governance teams.