TL;DR
Pictor | Arabic (Gulf) AI Evaluation Specialist (AI): Support the testing and evaluation of an Arabic language model, with an emphasis on designing prompts and evaluating responses for functionality, accuracy, and safety. The role focuses on refining large language models, creating evaluation rubrics, and writing high-quality Golden Responses.
Location: Remote-Egypt
Salary: $10 USD/hour (approx. $1,600 USD/month)
Company
Welo Data, part of hirify.global, is a global AI data company delivering high-quality, ethical data to train the world’s most advanced AI systems.
What you will do
- Design scenario-based and edge-case prompts to test AI behavior.
- Develop evaluation rubrics to assess AI responses across instruction-following, factuality, tone, safety, refusals, and helpfulness.
- Perform side-by-side evaluations of AI outputs and score them on a 1–5 scale using defined criteria.
- Create high-quality source documents that serve as the single source of truth for testing.
- Write accurate and well-structured Golden Responses that correctly follow instructions and handle ambiguity.
Requirements
- Bachelor's degree or equivalent experience in Linguistics, Computational Linguistics, Communications, Technical Writing, or a related analytical field.
- English at B2 level or higher.
- Native fluency in Modern Standard Arabic and the Gulf dialect.
- Strong understanding of the distinction between Fusha and ‘Ammiyya.
- Proven experience in a role involving AI data annotation, content quality review, search quality rating, or prompt engineering.
- Ability to work independently and manage workflows effectively in a remote environment.
Nice to have
- Proficiency in one or more additional Arabic dialects.
- Strong attention to detail and critical thinking to identify hallucinations and bias.
- Familiarity with data annotation platforms and model evaluation tools.
- Experience in prompt engineering, AI evaluation, linguistic QA, or translation.
- Cultural familiarity with regional norms and high-context communication styles, particularly in the GCC region.
Culture & Benefits
- Project-based opportunities that fit your availability, fully remote, with complete autonomy.
- Optional access to AI and Large Language Model workshops designed for professionals without a coding background.
- Be part of a global contributor community with responsive guidance and support.
- Apply your expertise to influence the AI systems shaping the future of your industry.
Hiring process
- Apply by answering a few quick questions to join the database and become part of the community.
- Do not use VPNs or IP-masking tools during the recruitment process, because the security systems require accurate regional verification.