AI Evaluator

How the AI Evaluator audits draft prompts, scores them, and surfaces ranked suggestions.

The AI Evaluator audits draft prompts and surfaces issues that need attention. The sidebar badge shows the number of unresolved non-pass results across the workspace.
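To make the badge rule concrete, here is a minimal sketch of the count. The `AuditResult` type, its field names, and the `"pass"`/`"report"` labels are assumptions for illustration, not PromptVault's actual data model.

```python
from dataclasses import dataclass


@dataclass
class AuditResult:
    prompt_slug: str        # hypothetical identifier for the prompt
    verdict: str            # assumed labels: "pass" or "report"
    resolved: bool = False  # assumed flag set once the result is resolved


def badge_count(results: list[AuditResult]) -> int:
    """Count results that are not a pass and have not been resolved."""
    return sum(1 for r in results if r.verdict != "pass" and not r.resolved)


results = [
    AuditResult("welcome-email", "pass"),
    AuditResult("support-triage", "report"),
    AuditResult("sql-helper", "report", resolved=True),
    AuditResult("summarizer", "report"),
]
print(badge_count(results))  # 2
```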

What the Auditor Checks

The auditor classifies each prompt and scores it across dimensions such as:

| Dimension | What it measures |
| --- | --- |
| Role identity | Whether the agent's role and expertise are clear. |
| Task definition | Whether the primary objective is specific and singular. |
| Context sufficiency | Whether the prompt gives enough background to act. |
| Input specification | Whether inputs are explicitly described. |
| Output format | Whether output structure is precise. |
| Constraints and guardrails | Whether scope limits and refusal cases are clear. |
| Examples | Whether representative and edge cases are covered. |
| Edge cases | Whether ambiguity and out-of-scope inputs are handled. |
| Internal consistency | Whether instructions conflict. |
| Token efficiency | Whether unnecessary padding is avoided. |
| Variable hygiene | Whether placeholders are consistent and documented. |

Some dimensions may not apply to every prompt category.
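One way to picture scoring when some dimensions do not apply is to skip them rather than count them as zero. The sketch below is an assumption throughout: this page does not say how the overall score is computed, and the 0 to 5 scale, the dimension keys, and the plain average are all hypothetical.

```python
from statistics import mean
from typing import Optional


def overall_score(dimension_scores: dict[str, Optional[float]]) -> Optional[float]:
    """Average the applicable dimensions; None marks a dimension that does not apply."""
    applicable = [s for s in dimension_scores.values() if s is not None]
    return round(mean(applicable), 2) if applicable else None


# Hypothetical scores on an assumed 0-5 scale.
scores = {
    "role_identity": 4.0,
    "task_definition": 5.0,
    "output_format": 3.0,
    "examples": None,  # not applicable to this prompt category
}
print(overall_score(scores))  # 4.0
```

Skipping inapplicable dimensions keeps a prompt from being penalized for something its category does not need.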

Pass vs. Report

The evaluator returns:

  • Pass when no material issues are found.
  • Report when the prompt needs attention.

Reported issues are ranked by severity:

| Severity | Meaning |
| --- | --- |
| BLOCKER | The prompt cannot reliably do its job. |
| MAJOR | Degrades quality on common inputs. |
| MINOR | Degrades quality on edge cases. |
| NIT | Style or polish only. |
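The severity ordering above can be sketched as a sort key. The `Severity` enum and the `rank` helper are illustrative names, not part of PromptVault; only the four labels and their order come from the table.

```python
from enum import IntEnum


class Severity(IntEnum):
    # Lower value = more severe, so an ascending sort puts blockers first.
    BLOCKER = 0
    MAJOR = 1
    MINOR = 2
    NIT = 3


def rank(issues: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Sort (severity label, summary) pairs from most to least severe."""
    return sorted(issues, key=lambda issue: Severity[issue[0]])


issues = [
    ("NIT", "Inconsistent capitalization in headings."),
    ("BLOCKER", "No task is stated."),
    ("MINOR", "Empty input is not handled."),
    ("MAJOR", "Output format is ambiguous."),
]
for label, summary in rank(issues):
    print(label, summary)
```

Because `sorted` is stable, issues with the same severity keep their original relative order.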

Running an Audit

Click Run audit on the AI Evaluator screen. PromptVault reviews current draft prompts, reuses prior results when possible, and shows the latest suggestions in the dashboard.
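This page does not describe how prior results are reused. One plausible approach, shown here purely as an assumption, is to key cached results by a hash of the prompt source so that unchanged drafts are not evaluated again. `run_audit`, `content_key`, and `fake_evaluate` are hypothetical names.

```python
import hashlib
from typing import Callable


def content_key(prompt_source: str) -> str:
    """Stable key for a draft: identical text always yields the same key."""
    return hashlib.sha256(prompt_source.encode("utf-8")).hexdigest()


def run_audit(
    drafts: dict[str, str],
    cache: dict[str, str],
    evaluate: Callable[[str], str],
) -> dict[str, str]:
    """Evaluate each draft, reusing a cached verdict when the text is unchanged."""
    results = {}
    for slug, source in drafts.items():
        key = content_key(source)
        if key not in cache:
            cache[key] = evaluate(source)  # only new or edited drafts reach here
        results[slug] = cache[key]
    return results


calls = []


def fake_evaluate(source: str) -> str:
    # Stand-in for the real evaluator, which is not specified on this page.
    calls.append(source)
    return "pass" if "Output format" in source else "report"


cache: dict[str, str] = {}
drafts = {"a": "Summarize. Output format: JSON.", "b": "Do stuff."}
first = run_audit(drafts, cache, fake_evaluate)
drafts["b"] = "Summarize the ticket. Output format: one sentence."
second = run_audit(drafts, cache, fake_evaluate)
print(len(calls))  # 3: two on the first run, one for the edited draft
```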

Suggestions List

Each non-pass suggestion shows:

| Element | Meaning |
| --- | --- |
| Prompt slug + folder | Which prompt needs attention. |
| Severity badge | BLOCKER, MAJOR, MINOR, NIT, or REFUSED. |
| Score | Overall quality score. |
| Issue summary | One-sentence description of the primary problem. |
| Proposed fix | Suggested edit. |
| Diff | Side-by-side comparison between current and proposed prompt. |
| Strengths | What the prompt already does well. |
| Open questions | Things the evaluator could not infer. |
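As a rough mental model, the elements above could map onto a record like the following. The field names and types are assumptions. The diff here uses Python's standard `difflib` to produce a unified diff, which is a stand-in for, not a reproduction of, the side-by-side view in the dashboard.

```python
import difflib
from dataclasses import dataclass, field


@dataclass
class Suggestion:
    # All field names are hypothetical; they mirror the elements listed above.
    slug: str
    folder: str
    severity: str  # BLOCKER, MAJOR, MINOR, NIT, or REFUSED
    score: float
    issue_summary: str
    current_source: str
    proposed_source: str
    strengths: list[str] = field(default_factory=list)
    open_questions: list[str] = field(default_factory=list)

    def diff(self) -> list[str]:
        """Line-based unified diff between the current and proposed prompt."""
        return list(difflib.unified_diff(
            self.current_source.splitlines(),
            self.proposed_source.splitlines(),
            fromfile="current",
            tofile="proposed",
            lineterm="",
        ))


s = Suggestion(
    slug="support-triage",
    folder="support",
    severity="MAJOR",
    score=2.5,
    issue_summary="The output format is not specified.",
    current_source="You are a helper.\nAnswer the question.",
    proposed_source="You are a billing support agent.\nAnswer in one paragraph.",
    strengths=["Clear single task."],
    open_questions=["Who is the intended audience?"],
)
print("\n".join(s.diff()))
```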

Resolving a Suggestion

| Action | What it does |
| --- | --- |
| Apply | Copies the proposed source into the editor for review. |
| Mark resolved | Removes it from the active list. |
| Dismiss | Hides it for the current session. |

A later audit can recreate a resolved or dismissed suggestion if the underlying issue still exists.
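The difference between Mark resolved and Dismiss, and the way a later audit can bring a suggestion back, can be sketched as a small state model. `SuggestionQueue` and its methods are hypothetical, and whether a dismissal survives a re-audit within the same session is an assumption made here, not something this page states.

```python
class SuggestionQueue:
    """Toy model of the active list; not PromptVault's implementation."""

    def __init__(self) -> None:
        self.active: dict[str, str] = {}  # slug -> issue summary
        self.dismissed: set[str] = set()  # session-only hides

    def load_audit(self, findings: dict[str, str]) -> None:
        """Replace the active list with the latest audit's findings."""
        self.active = dict(findings)

    def mark_resolved(self, slug: str) -> None:
        """Remove the suggestion from the active list."""
        self.active.pop(slug, None)

    def dismiss(self, slug: str) -> None:
        """Hide the suggestion until the session ends."""
        self.dismissed.add(slug)

    def new_session(self) -> None:
        self.dismissed.clear()

    def visible(self) -> list[str]:
        return [slug for slug in self.active if slug not in self.dismissed]


findings = {
    "support-triage": "No output format specified.",
    "sql-helper": "Conflicting instructions.",
}
q = SuggestionQueue()
q.load_audit(findings)
q.mark_resolved("support-triage")
q.dismiss("sql-helper")
print(q.visible())      # []
q.load_audit(findings)  # issues still exist, so the audit reports them again
print(q.visible())      # ['support-triage']
q.new_session()
print(q.visible())      # ['support-triage', 'sql-helper']
```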

Refusals

The evaluator may refuse to improve prompts designed to enable catastrophic harm. A refusal appears as a REFUSED suggestion with the reason, and the prompt itself is not modified.
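A refusal can be pictured as a suggestion that carries a reason but no proposed source, with the prompt returned unchanged. The `Outcome` type and `refuse` helper are illustrative assumptions. Only the REFUSED badge, the presence of a reason, and the unmodified prompt come from the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Outcome:
    badge: str                      # "REFUSED" for refusals
    reason: str                     # why the evaluator declined
    proposed_source: Optional[str]  # assumed to be absent for a refusal


def refuse(prompt_source: str, reason: str) -> tuple[str, Outcome]:
    """Return the prompt untouched together with a REFUSED outcome."""
    return prompt_source, Outcome("REFUSED", reason, None)


original = "Example prompt text."
unchanged, outcome = refuse(original, "Request falls under the refusal policy.")
print(unchanged == original, outcome.badge)  # True REFUSED
```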
