AI Quality Assurance Engineer - LLM

Enroute

Full-time

Remote

Dinastía, Mexico

Description

We love technology, and we enjoy what we do. We are always looking for innovation. We are socially aware and work to strengthen that awareness every day. We make things happen. You can trust us. Our Enrouters are always up for a challenge. We ask questions, and we love to learn.

We pride ourselves on offering great benefits and compensation, a fantastic work environment, flexible schedules, and policies that support a healthy balance between work and life outside of it. We care about who you are in the office and as an individual. We get involved and like to know our people, and we want every Enrouter to become part of a great community of highly driven, responsible, respectful, and above all, happy people. We want you to enjoy working with us.

We’re looking for a QA Engineer with experience testing Large Language Model (LLM) applications.

Requirements

  • 3+ years in QA, including 1+ year testing AI/LLM applications.
  • Experience evaluating RAG (Retrieval-Augmented Generation) applications for response accuracy, grounding, and faithfulness.
  • Experience using AI evaluation tools such as Weights & Biases (W&B) or MLflow.
  • Skilled in hallucination detection, multilingual validation, and prompt evaluation.
  • Experience with UI testing for AI-driven interfaces using Cypress, Playwright, or Detox.
  • Experience with API testing using Postman, REST-assured, or custom scripts.
  • Strong knowledge of edge case testing, fallback validation, and response analysis.
  • Collaborative mindset; ability to work closely with AI/ML engineers and product teams.
  • Hands-on experience with TestRail, Jira, Zephyr, and other QA tools.
  • Strong documentation and defect-reporting skills.

Key Responsibilities

  • Design and run test strategies for LLM responses, using the RAG triad framework.
  • Evaluate conversational AI outputs to flag hallucinations or inconsistencies.
  • Validate chatbot/voice UI elements across mobile/web.
  • Perform agentic decision tree validation and simulate edge case scenarios (e.g., API rate limits).
  • Conduct regression, exploratory, and accessibility testing.
  • Maintain test cases/scripts for AI features using automation tools like Cypress or Detox.
  • Track and report AI-specific quality metrics (e.g., hallucination rate, response latency).
  • Clearly document bugs with reproducible steps and AI model response samples.

Benefits

  • Monetary compensation
  • Year-end Bonus
  • IMSS, AFORE, INFONAVIT
  • Major Medical Expenses Insurance
  • Minor Medical Expenses Insurance
  • Life Insurance
  • Funeral Expenses Insurance
  • Preferential rates for car insurance
  • TDU Membership
  • Holidays and Vacations
  • Sick days
  • Bereavement days
  • Civil Marriage days
  • Maternity & Paternity leave
  • English and Spanish classes
  • Performance Management Framework
  • Certifications
  • TALISIS Agreement: Discounts at ADVENIO, Harmon Hall, U-ERRE, UNID
  • Taquitos Rewards
  • Amazon Gift Card on your Birthday
  • Work-from-home Bonus
  • Laptop Policy

Equal Employment

Enroute is committed to providing equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by law.