Finding Appropriate Lexical Diversity Measurements for Small-Size Corpus
In this study, four lexical diversity measures were applied to sets of word chunks of monotonically increasing size. A computational experiment, combining corpus processing with statistical testing, was conducted to determine which measure is most effective for evaluating a small corpus of 350 to 550 words. The results show that D-estimate is the most appropriate of the four measures considered. D-estimate also gives more stable results than the other measures when the number of words varies between texts.
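As an illustration of the kind of computation involved, the following is a minimal sketch of a D-estimate in the commonly described vocd style: mean type-token ratios (TTR) of random samples of 35 to 50 tokens are fitted to the model TTR = (D/N)(sqrt(1 + 2N/D) - 1). The abstract does not give the procedure or name the other three measures, so everything here is an assumption: the function names, the number of trials, the grid search used in place of a proper curve fit, and the chunk sizes in `growing_chunks` (chosen only to span the 350 to 550 word range mentioned above).

```python
import math
import random


def ttr(tokens):
    """Type-token ratio: number of distinct tokens / total tokens."""
    return len(set(tokens)) / len(tokens)


def d_model_ttr(d, n):
    """TTR predicted by the D model for a sample of n tokens."""
    return (d / n) * (math.sqrt(1 + 2 * n / d) - 1)


def d_estimate(tokens, sizes=range(35, 51), trials=100, seed=0):
    """Estimate D by least squares over a grid of candidate values.

    Assumption: a coarse grid (0.1 to 200.0) stands in for the
    curve-fitting step; the token list must hold at least max(sizes) items.
    """
    rng = random.Random(seed)
    # Mean TTR of `trials` random samples (without replacement) per size.
    mean_ttrs = {
        n: sum(ttr(rng.sample(tokens, n)) for _ in range(trials)) / trials
        for n in sizes
    }

    def squared_error(d):
        return sum((mean_ttrs[n] - d_model_ttr(d, n)) ** 2 for n in sizes)

    candidates = [x / 10 for x in range(1, 2001)]
    return min(candidates, key=squared_error)


def growing_chunks(tokens, start=350, stop=550, step=50):
    """Prefixes of increasing length (hypothetical sizes, not from the paper)."""
    return [tokens[:n] for n in range(start, stop + 1, step) if n <= len(tokens)]


if __name__ == "__main__":
    # Synthetic text: 600 tokens drawn from a 120-word vocabulary.
    rng = random.Random(1)
    text = [f"w{rng.randrange(120)}" for _ in range(600)]
    for chunk in growing_chunks(text):
        print(len(chunk), round(ttr(chunk), 3), d_estimate(chunk))
```

On such input, plain TTR typically falls as the chunk grows, whereas the D-estimate is computed from fixed-size samples and is therefore less tied to chunk length, which is the property the abstract highlights.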
Dongye Sun, Wen-Pei Sung and Ran Chen
W. H. Choi, "Finding Appropriate Lexical Diversity Measurements for Small-Size Corpus", Applied Mechanics and Materials, Vols. 121-126, pp. 1244-1248, 2012