ROUGE, or Recall-Oriented Understudy for Gisting Evaluation, is a set of metrics and a software package used for evaluating automatic summarization and machine translation software in natural language processing. The metrics compare an automatically produced summary or translation against one or more human-produced reference summaries or translations.

Metrics

The following five evaluation metrics are available.
* ROUGE-N: Overlap of n-grams<ref>Lin, Chin-Yew and E.H. Hovy. 2003. Automatic Evaluation of Summaries Using N-gram Co-occurrence Statistics. In Proceedings of 2003 Language Technology Conference (HLT-NAACL 2003), Edmonton, Canada, May 27 - June 1, 2003.</ref> between the system and reference summaries.
** ROUGE-1 refers to the overlap of ''unigrams'' (individual words) between the system and reference summaries.
** ROUGE-2 refers to the overlap of ''bigrams'' between the system and reference summaries.
* ROUGE-L: Longest Common Subsequence (LCS)<ref>Lin, Chin-Yew and Franz Josef Och. 2004. Automatic Evaluation of Machine Translation Quality Using Longest Common Subsequence and Skip-Bigram Statistics. In Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL 2004), Barcelona, Spain, July 21 - 26, 2004.</ref> based statistics. The longest common subsequence takes sentence-level structure similarity into account naturally and automatically identifies the longest in-sequence co-occurring n-grams.
* ROUGE-W: Weighted LCS-based statistics that favor consecutive LCSes.
* ROUGE-S: Skip-bigram based co-occurrence statistics. A skip-bigram is any pair of words in their sentence order.
* ROUGE-SU: Skip-bigram plus unigram-based co-occurrence statistics.
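The overlap ideas behind ROUGE-N, ROUGE-L and ROUGE-S can be illustrated with a minimal Python sketch. It is not the ROUGE package itself: it assumes a single reference, whitespace tokenization, no stemming or stop-word removal, and it computes only a recall-style score (matches divided by the reference count). All function names here are hypothetical and chosen for illustration.

```python
from collections import Counter
from itertools import combinations


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def overlap_recall(cand_counts, ref_counts):
    """Clipped overlap divided by the number of units in the reference."""
    total = sum(ref_counts.values())
    if total == 0:
        return 0.0
    # Counter & Counter keeps the minimum count of each shared unit.
    return sum((cand_counts & ref_counts).values()) / total


def rouge_n_recall(candidate, reference, n=1):
    """Illustrative ROUGE-N style recall over n-grams."""
    return overlap_recall(ngrams(candidate, n), ngrams(reference, n))


def lcs_length(a, b):
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_recall(candidate, reference):
    """Illustrative ROUGE-L style recall: LCS length over reference length."""
    if not reference:
        return 0.0
    return lcs_length(candidate, reference) / len(reference)


def skip_bigrams(tokens):
    """All word pairs in sentence order, with any gap between the two words."""
    return Counter(combinations(tokens, 2))


def rouge_s_recall(candidate, reference):
    """Illustrative ROUGE-S style recall over skip-bigrams."""
    return overlap_recall(skip_bigrams(candidate), skip_bigrams(reference))


reference = "police killed the gunman".split()
candidate = "police kill the gunman".split()

print(rouge_n_recall(candidate, reference, 1))            # 0.75
print(round(rouge_n_recall(candidate, reference, 2), 3))  # 0.333
print(rouge_l_recall(candidate, reference))               # 0.75
print(rouge_s_recall(candidate, reference))               # 0.5
```

In this example the candidate shares three of the four reference unigrams, one of the three reference bigrams, a common subsequence of length three ("police the gunman"), and three of the six reference skip-bigrams. The weighted variant (ROUGE-W) and the unigram extension (ROUGE-SU) are not covered by this sketch.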

References

{{Reflist}}

External links

* ROUGE Usage Tutorial
* Java Implementation of ROUGE
