Linearization of Zipfian Distribution for Chinese Characters
Abstract
In this paper, we report the results of least-squares fits to four sets of data derived from Chinese characters, namely character strokes, radicals, characters, and words. We find that fitting with a power series, i.e. f^t versus R^t (where f is the frequency of occurrence, R is the rank, and t is a constant), works better than fitting with a logarithmic series derived from the original simple Zipf's law, i.e. fR = constant, or log f = c - log R. The dependency of f on R is found to be of order 5, since we find t = 0.2. We have also discovered a secondary, lower-order dependency of f on R, which can be modeled using a cosine function.
- Paper published by the Information Processing Society of Japan (IPSJ)
- 1992-03-31
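
The two fitting approaches contrasted in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure: the function names, the polynomial degree, the error measure, and the synthetic rank-frequency data are all assumptions made for illustration. Only t = 0.2 and the two forms (f^t versus R^t, and log f versus log R) come from the abstract. The synthetic data are deliberately constructed to be linear in the power-transformed space, so the comparison shows the mechanics of the two fits rather than reproducing the paper's result.

```python
import numpy as np


def fit_power_series(rank, freq, t=0.2, degree=1):
    """Least-squares fit of f**t as a polynomial in R**t (degree is an assumption)."""
    x = rank ** t
    y = freq ** t
    coeffs = np.polyfit(x, y, degree)
    y_hat = np.polyval(coeffs, x)
    # Clip to a small positive value so the fractional power below is defined.
    y_hat = np.clip(y_hat, 1e-12, None)
    return coeffs, y_hat ** (1.0 / t)


def fit_log_series(rank, freq, degree=1):
    """Least-squares fit of log f as a polynomial in log R (degree is an assumption)."""
    x = np.log(rank)
    y = np.log(freq)
    coeffs = np.polyfit(x, y, degree)
    return coeffs, np.exp(np.polyval(coeffs, x))


def rms_log_error(freq, predicted):
    """Root-mean-square error between log frequencies (a hypothetical error measure)."""
    return float(np.sqrt(np.mean((np.log(freq) - np.log(predicted)) ** 2)))


# Synthetic data (an assumption, not the paper's data): f**0.2 = 5 - R**0.2.
rank = np.arange(1, 1001, dtype=float)
freq = (5.0 - rank ** 0.2) ** 5

power_coeffs, power_pred = fit_power_series(rank, freq, t=0.2)
log_coeffs, log_pred = fit_log_series(rank, freq)

power_err = rms_log_error(freq, power_pred)
log_err = rms_log_error(freq, log_pred)

print("power-series fit coefficients:", power_coeffs)
print("power-series RMS log error:", power_err)
print("log-series RMS log error:", log_err)
```

Both fits are scored in the same space (log frequency) so that the errors are comparable. On real corpus counts, the degree of each series and the value of t would have to be chosen from the data, and the secondary cosine-like dependency mentioned in the abstract is not modeled here.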