This article covers everything you need to know about using the Evaluations feature.
What Are Evaluations?
Evaluations in Evolv AI help you systematically assess how well your digital experience aligns with best practices, business goals, and your business's own guidelines. By leveraging Evaluations, you can:
- Ensure a predictable evaluation of your guidelines.
- Set scoring and goal benchmarks to track improvements over time.
- Identify optimization opportunities more efficiently.
Key elements of the Evaluations interface include:
- Evaluation filters – See evaluations and ideation results for different devices by switching the device filter.
- View States – Add view states to evaluate the different ways your view might appear, such as when a dialog is present or when the page options change based on product type.
- Ideation tabs – Switch between Evaluations and Ideate with AI.
- Page Goal and Page Type – These AI-inferred values determine which evaluations and guidelines are available for the current screenshot.
- Evaluations – An evaluation checks whether the current screenshot meets its guidelines.
- View preview – A device-specific screenshot of the view.
How Evaluations Work
Evaluations analyze screenshots of your digital experience within Views to determine how well they align with best practices and your business's own guidelines.
Evaluations allow teams to identify potential UX or compliance issues without needing a manual audit, saving time and ensuring comprehensive coverage of design elements.
A few factors influence the results of each evaluation:
- The View’s Goal & Page Type – The AI assigns a goal (e.g. Checkout, Lead Generation, Subscriptions, Engagement, Donations) and page type (e.g., Product Detail Page, Checkout) to determine applicable guidelines.
- Device-Specific Screenshots – Guidelines are evaluated per device (e.g., desktop vs. mobile).
- Knowledge about your business – Information you add to the Knowledge section provides additional context for the AI when evaluating guidelines.
The evaluation results screen includes:
- Ideation tabs – Switch between Evaluations and Ideate with AI.
- A link to navigate back to All Evaluations for the current view state.
- Evaluation Score – The number of guidelines found to be in line, over the total number of guidelines that could be evaluated.
- Guideline status filters.
- Generated ideas for each guideline that's not in line.
- Guidelines that are in line or inconclusive appear after those that are not in line.
Evaluating Guidelines
Each evaluation analyzes a set of guidelines against a specific screenshot to determine their status:
- In Line – The design meets the guideline criteria.
- Not In Line – The design does not meet the guideline criteria, prompting the AI to generate hypotheses and ideas to address the issue.
- Inconclusive – The guideline could not be conclusively evaluated, for either of the following reasons:
  - Insufficient information – The current screenshot and available knowledge are insufficient for evaluating the guideline criteria. More information, such as an additional view state screenshot or knowledge content, may be required.
  - Not applicable – The AI determined that the guideline does not apply to the current screenshot.
Evaluation Scoring
Teams can use evaluation scores to track progress over time, measure design quality, and identify areas that require improvement before launching changes to production.
The score for a View State is calculated based on the number of In Line guidelines relative to the total conclusive guidelines.
Note – Inconclusive guidelines are excluded from the score calculation entirely; they count toward neither the numerator nor the denominator.
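The scoring rule above can be sketched as a short, illustrative calculation. The function name and status strings below are hypothetical, chosen only to mirror the statuses described in this article; they are not part of any Evolv AI API.

```python
def evaluation_score(statuses):
    """Score = In Line guidelines / total conclusive guidelines.

    Inconclusive guidelines are excluded from both the numerator
    and the denominator, per the note above.
    """
    # Keep only guidelines that reached a conclusion
    conclusive = [s for s in statuses if s != "Inconclusive"]
    if not conclusive:
        return None  # nothing conclusive to score
    in_line = sum(1 for s in conclusive if s == "In Line")
    return in_line / len(conclusive)

# Example: 2 of 3 conclusive guidelines are in line;
# the inconclusive one does not affect the score.
statuses = ["In Line", "In Line", "Not In Line", "Inconclusive"]
print(evaluation_score(statuses))
```

Under this rule, replacing an Inconclusive guideline with a conclusive result (for example, by adding a missing view state screenshot) can move the score in either direction.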
Types of Evaluations
There are two types of evaluations: third-party evaluations and Knowledge Category Evaluations.
Evolv AI (a third party) provides evaluations for the following goals and page types:
- Goal: Checkout (E-commerce)
  - Homepage
  - Landing Page CTA – landing pages that have a clickthrough primary action
  - Product Listing Page
  - Product Detail Page
  - Product Configuration Page
  - Cart Page
  - Checkout
- Goal: Lead Generation
  - Homepage
  - Landing Page Form Submission – landing pages that have a lead form
  - Landing Page CTA – landing pages that have a clickthrough primary action
  - Information Request Page
- Goal: Subscriptions
  - Homepage
  - Landing Page CTA – landing pages that have a clickthrough primary action
  - Pricing Page
  - Checkout
- Goal: Engagement
  - Homepage
  - Landing Page Form Submission – landing pages that have a lead form
  - Landing Page CTA – landing pages that have a clickthrough primary action
- Goal: Donations
  - Homepage
  - Landing Page CTA – landing pages that have a clickthrough primary action
  - Donation Page
Knowledge Category Evaluations are based on guidelines derived from the information you add to each category in the Knowledge section.
Running and Updating Evaluations
If a design is updated or new knowledge is added, teams can quickly refresh their evaluations to reflect the most up-to-date insights, preventing outdated recommendations.
Running Evaluations
When you create a View or View State, the Basic E-commerce Evaluation will run automatically.
To run the Advanced E-commerce Evaluation or any of your Knowledge Category Evaluations, navigate to “All Evaluations” in each view.
Updating Evaluations
Evaluations may need to be updated when:
- A screenshot is replaced, triggering a reevaluation.
- New Knowledge is added, requiring guidelines to be reassessed. When new knowledge guidelines become available, you’ll see a prompt to re-run the evaluation.
Important: Updating an evaluation will reset the results and remove any custom hypotheses and ideas that you may have added.
Adding Custom Hypotheses & Ideas
Custom hypotheses allow teams to test their own design assumptions and gather AI-generated ideas, making evaluations more tailored to their specific project needs.
Users can:
- Add custom hypotheses, prompting AI to generate new ideas.
- Add custom ideas manually for planning purposes.
Archiving & Feedback
Teams can declutter their workspace by archiving old or irrelevant results while keeping them accessible for future reference if needed.
Restore archived results by first revealing them, then clicking the unarchive button.
Teams can refine AI-generated ideas by providing feedback, improving the relevance and quality of future recommendations tailored to their specific needs.