Text quality evaluation metric

The text quality metric evaluates a model's output against SuperGLUE datasets by measuring the F1 score, precision, and recall of the model predictions against the ground truth data.
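The exact computation is defined by the SuperGLUE evaluation, but the underlying idea can be sketched with a standard token-overlap calculation: precision is the fraction of predicted tokens that appear in the reference, recall is the fraction of reference tokens that appear in the prediction, and F1 is their harmonic mean. The function name and tokenization below are illustrative assumptions, not the metric's actual implementation.

```python
from collections import Counter

def token_f1(prediction: str, reference: str) -> dict:
    """Illustrative token-overlap precision, recall, and F1
    between one prediction and one reference string."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    # Count tokens shared by both strings (multiset intersection).
    common = Counter(pred_tokens) & Counter(ref_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return {"precision": 0.0, "recall": 0.0, "f1": 0.0}
    precision = num_same / len(pred_tokens)
    recall = num_same / len(ref_tokens)
    f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}
```

For example, an exact match yields a score of 1.0 on all three measures, while a prediction sharing no tokens with the reference scores 0.0.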

Metric details

Text quality is a generative AI quality evaluation metric that measures how well generative AI assets perform their tasks by comparing model predictions with ground truth data.

Scope

Text quality evaluates generative AI assets only.

  • Types of AI assets: Prompt templates
  • Generative AI tasks:
    • Text summarization
    • Content generation
  • Supported languages: Arabic (ar), Danish (da), English (en), French (fr), German (de), Italian (it), Japanese (ja), Korean (ko), Portuguese (pt), Spanish (es).

Scores and values

The text quality metric score measures the similarity between the predictions and the references: higher scores indicate greater similarity.

Settings

  • Thresholds:
    • Lower limit: 0.8
    • Upper limit: 1
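A score within the threshold range indicates acceptable text quality, while a score below the lower limit can be flagged for review. A minimal sketch of such a check, assuming the default limits listed above (the function name is hypothetical):

```python
LOWER_LIMIT = 0.8  # default lower threshold from the settings above
UPPER_LIMIT = 1.0  # default upper threshold

def within_threshold(score: float) -> bool:
    """Return True when the text quality score falls inside the
    configured threshold range, False when it should be flagged."""
    return LOWER_LIMIT <= score <= UPPER_LIMIT
```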