TL;DR
Arabic (Gulf) AI Evaluation Specialist: Assessing and enhancing large language models (LLMs) by testing how they understand, generate, and respond to Arabic content, with an emphasis on ensuring accurate, culturally appropriate, and reliable results. The role focuses on crafting realistic scenarios, analyzing model outputs for quality and safety, and identifying potential issues such as hallucinations or inconsistencies.
Location: Remote-Egypt
Salary: $10 USD/Hour
Company
Welo Data, part of hirify.global, is a global AI data company with 500,000+ contributors delivering high-quality, ethical data to train the world’s most advanced AI systems.
What you will do
- Conduct side-by-side comparisons of AI responses and rate their quality.
- Design scenario-based and edge-case prompts to evaluate model behavior.
- Assess outputs for instruction adherence, factual accuracy, tone, safety, and overall usefulness.
- Develop clear evaluation rubrics and criteria to ensure consistent scoring across tasks.
- Identify potential issues such as hallucinations, inconsistencies, or cultural/contextual mismatches.
Requirements
- Bachelor's degree or equivalent experience in Linguistics, Computational Linguistics, Communications, Technical Writing, or a related analytical field.
- English proficiency at B2 level or higher.
- Native fluency in Modern Standard Arabic and the Gulf dialect.
- Strong understanding of the distinction between Fusha and ‘Ammiyya.
- Proven experience in a role involving AI data annotation, content quality review, search quality rating, or prompt engineering.
- Ability to work independently and manage workflows effectively in a remote environment.
Nice to have
- Proficiency in one or more additional Arabic dialects.
- Strong attention to detail and critical thinking to identify hallucinations and bias.
- Familiarity with data annotation platforms and model evaluation tools.
- Experience in prompt engineering, AI evaluation, linguistic QA, or translation is a plus.
- Cultural familiarity with regional norms and high-context communication styles, particularly in the GCC region.
Culture & Benefits
- Project-based opportunities that fit your availability.
- Optional access to AI and Large Language Model workshops.
- Be part of a global contributor community with responsive guidance and support.
- Apply your expertise in the legal field to influence the AI systems shaping the future of your industry.