Text quality evaluation metric
The text quality metric evaluates the output of a model against SuperGLUE datasets by measuring the F1 score, precision, and recall of the model predictions against the ground truth data.
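The following is a minimal sketch of how such a score can be computed. It assumes token-level overlap scoring in the style of SuperGLUE question-answering evaluation; the function name `token_f1` is illustrative and not part of any product API.

```python
from collections import Counter

def token_f1(prediction: str, reference: str) -> dict:
    """Token-overlap precision, recall, and F1 between a prediction
    and a single reference, in the style of SuperGLUE QA scoring."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    # Count the tokens that appear in both strings (multiset intersection).
    common = Counter(pred_tokens) & Counter(ref_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return {"precision": 0.0, "recall": 0.0, "f1": 0.0}
    precision = num_same / len(pred_tokens)
    recall = num_same / len(ref_tokens)
    f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}
```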
Metric details
Text quality is a generative AI quality evaluation metric that measures how well generative AI assets perform text summarization and content generation tasks.
Scope
Text quality evaluates generative AI assets only.
- Types of AI assets: Prompt templates
- Generative AI tasks:
  - Text summarization
  - Content generation
- Supported languages: Arabic (ar), Danish (da), English (en), French (fr), German (de), Italian (it), Japanese (ja), Korean (ko), Portuguese (pt), Spanish (es).
Scores and values
The text quality metric score indicates how similar the model predictions are to the references. Higher scores indicate greater similarity.
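For example, using the illustrative `token_f1` sketch above, a prediction that reproduces the reference exactly scores 1.0, while a partial overlap lowers the score:

```python
# Exact match: every token overlaps, so F1 is 1.0.
print(token_f1("the cat sat on the mat", "the cat sat on the mat")["f1"])  # 1.0
# Partial overlap: 2 shared tokens out of 3 predicted and 6 reference tokens.
print(token_f1("a cat sat", "the cat sat on the mat")["f1"])  # ~0.44
```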
Settings
- Thresholds:
- Lower limit: 0.8
- Upper limit: 1
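As a hedged sketch of how these threshold settings might be applied, the check below flags any score that falls outside the configured limits. The constant values mirror the defaults listed above; the helper name `check_threshold` is hypothetical.

```python
LOWER_LIMIT = 0.8  # default lower threshold from the settings above
UPPER_LIMIT = 1.0  # default upper threshold from the settings above

def check_threshold(score: float) -> str:
    """Flag scores that fall outside the configured limits (hypothetical helper)."""
    return "ok" if LOWER_LIMIT <= score <= UPPER_LIMIT else "violation"

print(check_threshold(0.92))  # ok
print(check_threshold(0.44))  # violation
```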