Job posting
Senior Applied AI Engineer
Conditions
Full-time, Senior, 🇺🇸 USA, 💻 Development, ✈️ Relocation
Job description
What You'll Do:
- Ship production agentic systems: Design and build agents and agentic workflows that solve a defined problem end to end. You own prompts, tools, retrieval, guardrails, observability, cost and latency budgets, and rollout.
- Automate SME workflows: Identify high-leverage operational toil (review pipelines, content QA, labeling and ops loops, support triage, internal copilots) and partner with the SMEs running those workflows to define success criteria, validate outputs, and replace meaningful chunks of that work with AI systems they trust. SMEs are co-owners of quality from the start of the project.
- Own the evaluation loop and the golden datasets that anchor it: Build offline evals, LLM-as-judge with calibration, regression suites, and online metrics. Maintain versioned, decontaminated golden datasets covering intents, difficulty, edge cases, and adversarial inputs, and continuously enrich them with real production failures and SME-validated labels. The measurement plan is what decides whether a feature ships.
- Make AI features safe: Treat what you deliver as a regulated product. Design for relevant compliance frameworks (e.g., COPPA, FERPA) from day one, run safety and bias evals before launch and continuously after, and build the human-in-the-loop and content-filtering controls AI features need before they reach end users.
- Hand off what you ship: Production features leave your hands with documentation, runbooks, an eval harness, and dashboards. An embed is not complete until the receiving team has shipped a change to the system without you in the room. Typical embed length is 4 to 12 weeks.
- Make AI features easier for the rest of engineering to build: Internal libraries, patterns, and playbooks so other teams can ship AI features without your direct involvement.
What You Need:
- 7+ years of professional experience as an engineer, including at least 1 year of hands-on experience building agentic systems on at least one modern stack (LangGraph, the Anthropic SDK / Claude Agent SDK, OpenAI Agents SDK, Pydantic-AI, Mastra, LlamaIndex, CrewAI, or a homegrown stack). We care that you have built and operated agentic systems in production, not which framework you used.
- Strong Python familiarity plus one typed language for production services (TypeScript, Go, or similar). Cloud experience (AWS or GCP) and containerized deployment.
- Senior or staff-level software engineering foundation with several years of production environment experience and a track record of leading systems to launch.
- Multiple shipped LLM-powered features in a production environment, with concrete stories about what broke, how you fixed it, and what you would do differently.
- Practical knowledge of common agentic patterns: ReAct, tool use with structured schemas, prompt chaining, routing, orchestrator-workers, evaluator-optimizer / reflection, and human-in-the-loop. You can decide when a deterministic workflow is the right answer instead of an autonomous agent.
- Hands-on production experience with retrieval: chunking, embeddings, hybrid search, re-ranking, metadata filtering, and the failure modes of each. Working knowledge of grounding techniques that anchor generated answers in retrieved evidence (citation and quote extraction, faithfulness and refusal evals, post-hoc consistency checks).
- Strong prompt-engineering practice: zero-shot, few-shot, and many-shot patterns; example selection and ordering; in-context learning and chain-of-thought.
- Comfort with structured output and validation in production (provider-native structured outputs, Instructor, Pydantic-AI, Outlines, or a comparable approach).
- Disciplined evaluation practice. You do not rely on subjective review to decide whether a system is ready.
- Strong written and verbal communication. You can explain an architectural trade-off to an executive and to a junior engineer in the same week.
- You are comfortable using AI coding tools heavily in your implementation workflow while you own problem framing, design choices, and verification. We measure your output by working systems delivered, not lines of code written.
Nice to Have:
- Advanced retrieval experience: GraphRAG, agentic retrieval, evaluation-driven retrieval tuning, and hybrid retrieval at scale.
- Direct or transferable experience with safety, privacy, and policy constraints in user-facing AI. K-12 or other regulated-domain experience is a strong plus.
- Experience with prompt-optimization frameworks (DSPy, TextGrad, AdalFlow) where they paid off in production.
- A public repo, package, gist, or technical write-up of meaningful AI work, or a representative project you can describe in detail under your confidentiality constraints.
- Open-source contributions to AI tooling (frameworks, agents, evals, MCP servers).
*A note on our interview process
One of our interview rounds is an AI-assisted coding session. Bring your own setup and solve a realistic problem live with AI in the loop. We are looking at how you collaborate with AI tools: how you prompt, validate output, catch bad suggestions, decide when to override, and produce code that meets production standards. It is not a cleanroom algorithm round.
Please mention "I found this job at Remocate!"
The job posting text is reproduced without changes.