NIST is a method for evaluating the quality of text that has been translated using machine translation. Its name comes from the US National Institute of Standards and Technology.
It is based on the BLEU metric, but with some alterations. Where BLEU simply calculates n-gram precision, giving equal weight to each n-gram, NIST also calculates how informative a particular n-gram is. That is, when a correct n-gram is found, the rarer that n-gram is, the more weight it is given.
For example, if the bigram "on the" is correctly matched, it receives a lower weight than a correct match of the bigram "interesting calculations", because the latter is less likely to occur.
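The weighting idea can be sketched in a few lines of Python. This is a minimal illustration, not the official scoring script: it assumes the commonly cited formulation in which the information weight of an n-gram is log2 of the count of its (n-1)-word prefix divided by the count of the full n-gram, with counts taken over the reference translations (for a single word, the prefix count is the total number of reference words). The function names `ngrams` and `info_weights`, the whitespace tokenization, and the toy reference sentences are all assumptions made for the example.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def info_weights(reference_sentences, max_n=2):
    """Compute an information weight for every n-gram in the references.

    Assumed formulation: Info(w1..wn) = log2(count(w1..wn-1) / count(w1..wn)).
    For unigrams the numerator is the total number of reference words.
    """
    counts = Counter()
    total_words = 0
    for sentence in reference_sentences:
        tokens = sentence.lower().split()  # naive tokenization (assumption)
        total_words += len(tokens)
        for n in range(1, max_n + 1):
            counts.update(ngrams(tokens, n))

    weights = {}
    for gram, count in counts.items():
        prefix_count = total_words if len(gram) == 1 else counts[gram[:-1]]
        weights[gram] = math.log2(prefix_count / count)
    return weights


# Toy reference set (hypothetical data, chosen only to illustrate the effect).
references = [
    "the cat sat on the mat",
    "interesting calculations appear on the board",
    "interesting ideas rest on the table",
    "interesting results depend on careful work",
]
weights = info_weights(references)
print(weights[("on", "the")])                    # frequent continuation, low weight
print(weights[("interesting", "calculations")])  # rare continuation, higher weight
```

In this toy data "on" is followed by "the" in three of its four occurrences, so the bigram carries little information, while "interesting" is followed by "calculations" only once in three occurrences, so that bigram is weighted more heavily.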
NIST also differs from BLEU in its calculation of the brevity penalty, insofar as small variations in translation length do not impact the overall score as much.
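The difference in brevity penalties can be made concrete with a short Python sketch. It assumes the commonly cited formulations: for NIST, exp(beta * ln^2(min(L_sys / L_ref, 1))) with beta chosen so that the penalty equals 0.5 when the system output is two-thirds of the average reference length; for BLEU, exp(1 - L_ref / L_sys) when the output is shorter than the reference. The function names and the example lengths are assumptions made for illustration.

```python
import math

# beta is fixed so that the penalty is 0.5 at a length ratio of 2/3 (assumed convention).
BETA = math.log(0.5) / math.log(2.0 / 3.0) ** 2


def nist_brevity_penalty(sys_len, avg_ref_len):
    """NIST-style penalty: exp(beta * ln^2(min(sys_len / avg_ref_len, 1)))."""
    if sys_len <= 0:
        return 0.0  # guard for empty output; avoids log(0)
    ratio = min(sys_len / avg_ref_len, 1.0)
    return math.exp(BETA * math.log(ratio) ** 2)


def bleu_brevity_penalty(sys_len, ref_len):
    """BLEU-style penalty: 1 if the output is long enough, else exp(1 - ref_len / sys_len)."""
    if sys_len <= 0:
        return 0.0
    if sys_len >= ref_len:
        return 1.0
    return math.exp(1.0 - ref_len / sys_len)


# A translation that is 5% shorter than the reference (hypothetical lengths).
print(nist_brevity_penalty(95, 100))
print(bleu_brevity_penalty(95, 100))
```

Under these assumptions the NIST-style penalty stays closer to 1 than the BLEU-style penalty for a 5% length shortfall, which matches the statement that small length variations affect the NIST score less. The squared logarithm makes the curve flat near a ratio of 1 and steeper for outputs that are much too short.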
See also
* BLEU
* F-Measure
* METEOR
* Noun-phrase chunking
* ROUGE (metric)
* Word error rate (WER)
References
NIST 2005 Machine Translation Evaluation Official Results