Understand BLEU & ROUGE-1

Definition: Metrics used heavily in sequence-to-sequence tasks that measure n-gram overlap between a generated response and a target output. Higher BLEU and ROUGE-1 scores indicate greater overlap between the generated and target outputs.

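Calculation: Both metrics measure n-gram overlap. BLEU is precision-oriented: it scores how many n-grams in the generated output also appear in the target. ROUGE-1 scores the overlap of unigrams (single words) between the generated output and the target. A longer explanation of BLEU is provided here. A longer explanation of ROUGE-1 is provided here. These metrics require a {target} column in your dataset.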
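
To make the idea of n-gram overlap concrete, here is a minimal sketch of both calculations in plain Python. It is an illustration only and not necessarily how the scores are computed in the product: it assumes lowercased whitespace tokenization, a single target per output, and no smoothing for BLEU. The function names (ngrams, rouge_1, bleu) are hypothetical.

```python
from collections import Counter
import math


def ngrams(tokens, n):
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_1(generated, target):
    """Unigram overlap reported as precision, recall, and F1 (assumed tokenization: lowercase + split)."""
    gen = ngrams(generated.lower().split(), 1)
    ref = ngrams(target.lower().split(), 1)
    # Counter intersection keeps the minimum count of each shared unigram.
    overlap = sum((gen & ref).values())
    precision = overlap / max(sum(gen.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


def bleu(generated, target, max_n=4):
    """Unsmoothed single-reference BLEU: geometric mean of clipped n-gram precisions times a brevity penalty."""
    gen_tokens = generated.lower().split()
    ref_tokens = target.lower().split()
    if not gen_tokens:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        gen = ngrams(gen_tokens, n)
        ref = ngrams(ref_tokens, n)
        total = sum(gen.values())
        clipped = sum((gen & ref).values())
        # Without smoothing, any n-gram order with no matches makes the score 0.
        if total == 0 or clipped == 0:
            return 0.0
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty: penalize outputs shorter than the target.
    if len(gen_tokens) > len(ref_tokens):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref_tokens) / len(gen_tokens))
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


scores = rouge_1("the cat sat on the mat", "the cat is on the mat")
bleu_2 = bleu("the cat sat on the mat", "the cat is on the mat", max_n=2)
```

In this example, five of the six unigrams match, so ROUGE-1 precision and recall are both 5/6. With the default max_n=4 the unsmoothed BLEU sketch returns 0 for this pair because no 4-gram matches, which is why short texts are often scored with a lower n-gram order or with smoothing.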

Usefulness: These metrics evaluate how closely model outputs match target outputs. They provide a score to guide improvement and help identify areas where a model has trouble adhering to the expected output.
