
Add a new BLEU metric to Evidently  #1319

@elenasamuylova

Description


About Hacktoberfest contributions: https://github.com/evidentlyai/evidently/wiki/Hacktoberfest-2024


The BLEU (Bilingual Evaluation Understudy) metric is used to evaluate the quality of machine-generated text, typically machine translations, by comparing it to one or more reference texts. BLEU measures how closely the generated text matches the reference using n-gram precision, with a brevity penalty for translations shorter than the reference.
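For reference, the standard formulation combines modified n-gram precisions $p_n$ (typically up to $N = 4$ with uniform weights $w_n = 1/N$) with a brevity penalty $\mathrm{BP}$ computed from the candidate length $c$ and reference length $r$:

$$
\mathrm{BLEU} = \mathrm{BP} \cdot \exp\left(\sum_{n=1}^{N} w_n \log p_n\right),
\qquad
\mathrm{BP} =
\begin{cases}
1 & \text{if } c > r \\
e^{\,1 - r/c} & \text{if } c \le r
\end{cases}
$$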

We can implement a BLEU metric that computes scores for each row and a summary BLEU metric for the dataset.
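As a minimal sketch of what the computation could look like (using NLTK, which we could either take as a dependency or reimplement; the column names and smoothing choice here are placeholders, not a proposal):

```python
import pandas as pd
from nltk.translate.bleu_score import SmoothingFunction, corpus_bleu, sentence_bleu

df = pd.DataFrame({
    "generated": ["the cat sat on the mat", "hello there world"],
    "reference": ["the cat is on the mat", "hello world"],
})

smooth = SmoothingFunction().method1  # avoids zero scores on short texts

# Row-level scores: one BLEU value per (generated, reference) pair.
df["bleu"] = [
    sentence_bleu([ref.split()], hyp.split(), smoothing_function=smooth)
    for hyp, ref in zip(df["generated"], df["reference"])
]

# Dataset-level summary: corpus BLEU pools n-gram counts across rows,
# which is not the same as averaging the row-level scores.
summary = corpus_bleu(
    [[ref.split()] for ref in df["reference"]],
    [hyp.split() for hyp in df["generated"]],
    smoothing_function=smooth,
)
print(df["bleu"].tolist(), summary)
```

Note that corpus-level BLEU pools n-gram counts across rows rather than averaging the row scores, so the summary metric should state explicitly which of the two it reports.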

Note that this implementation would require creating a new Metric (instead of defaulting to ColumnSummaryMetric to aggregate descriptor values) to compute and visualize the summary BLEU score. You can check other dataset-level metrics (e.g., from classification or ranking) for inspiration.
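A very rough, hypothetical skeleton of what such a metric could look like, just to illustrate the shape: the class names are placeholders, the imports from evidently.base_metric and the pydantic-style fields are assumptions to verify against the current codebase, and the renderer is omitted.

```python
# Hypothetical skeleton only -- BleuMetric / BleuMetricResult are placeholder
# names, and the evidently.base_metric imports should be checked against the
# actual codebase before implementing.
from typing import List

from nltk.translate.bleu_score import SmoothingFunction, corpus_bleu, sentence_bleu

from evidently.base_metric import InputData, Metric, MetricResult


class BleuMetricResult(MetricResult):
    per_row_scores: List[float]  # row-level BLEU, one value per row
    summary_score: float         # corpus-level BLEU for the whole dataset


class BleuMetric(Metric[BleuMetricResult]):
    generated_column: str
    reference_column: str

    def calculate(self, data: InputData) -> BleuMetricResult:
        df = data.current_data
        smooth = SmoothingFunction().method1
        hyps = [str(text).split() for text in df[self.generated_column]]
        refs = [[str(text).split()] for text in df[self.reference_column]]
        per_row = [
            sentence_bleu(r, h, smoothing_function=smooth)
            for h, r in zip(hyps, refs)
        ]
        return BleuMetricResult(
            per_row_scores=per_row,
            summary_score=corpus_bleu(refs, hyps, smoothing_function=smooth),
        )
```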

Note: we can also consider implementing the METEOR metric as an option.

Labels: enhancement (New feature or request), hacktoberfest (Accepted contributions will count towards your hacktoberfest PRs)
