RANDOM TEXTS EXHIBIT ZIPF-LAW-LIKE WORD-FREQUENCY DISTRIBUTION

Cited by: 247
Authors
LI, WT
Affiliation
[1] Rockefeller University, New York, NY 10021, Box 167
Keywords
STATISTICAL LINGUISTICS; ZIPF LAW; POWER-LAW DISTRIBUTION; RANDOM TEXTS
DOI
10.1109/18.165464
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
It is shown that the distribution of word frequencies for randomly generated texts is very similar to Zipf's law observed in natural languages such as English. The facts that the frequency of occurrence of a word is almost an inverse power law function of its rank and the exponent of this inverse power law is very close to 1 are largely due to the transformation from the word's length to its rank, which stretches an exponential function to a power law function.
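
The mechanism described in the abstract can be checked numerically. The sketch below is not the author's code; it assumes the simplest random-text model, in which each of M letters and the space character is drawn independently with equal probability, and words are the runs of letters between spaces. Under that assumption every word of length L has probability proportional to (M+1)^(-L), while its rank grows roughly like M^L, so the frequency falls off approximately as rank^(-alpha) with alpha = ln(M+1) / ln(M), which is slightly above 1. The function names, the four-letter alphabet, the seed, and the fitting cutoff `max_rank` are all illustrative choices.

```python
import math
import random
from collections import Counter


def random_text_word_counts(num_chars, alphabet="abcd", seed=0):
    """Count words in a random text of equiprobable letters and spaces.

    Assumption (illustrative): every symbol, including the space,
    is drawn independently with probability 1 / (len(alphabet) + 1).
    """
    rng = random.Random(seed)
    symbols = list(alphabet) + [" "]
    text = "".join(rng.choices(symbols, k=num_chars))
    # str.split() with no argument drops the empty strings that
    # consecutive spaces would otherwise produce.
    return Counter(text.split())


def rank_frequency_slope(counts, max_rank=None):
    """Least-squares slope of log(frequency) against log(rank)."""
    freqs = sorted(counts.values(), reverse=True)
    if max_rank is not None:
        freqs = freqs[:max_rank]
    if len(freqs) < 2:
        raise ValueError("need at least two distinct words to fit a slope")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(freq) for freq in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


if __name__ == "__main__":
    alphabet = "abcd"
    m = len(alphabet)
    counts = random_text_word_counts(200_000, alphabet=alphabet, seed=1)
    # Restrict the fit to words of length <= 4 (4 + 16 + 64 + 256 ranks);
    # rarer words have counts too small to estimate reliably.
    slope = rank_frequency_slope(counts, max_rank=340)
    print("fitted slope:", round(slope, 3))
    print("predicted exponent:", round(math.log(m + 1) / math.log(m), 3))
```

The rank-frequency curve of such a text is a staircase, with one plateau per word length, so the fitted slope depends on the fitting range and only roughly tracks the predicted exponent. A larger alphabet brings ln(M+1) / ln(M) closer to 1.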
Pages: 1842-1845
Page count: 4