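On Linux, the wc command can count words. For practice, though, I reinvented this wheel and implemented wc's functionality in shell. The input is a txt file; the output, heh, covers not just the word count but also the length of each word, plus fluctuation values computed with Detrended Fluctuation Analysis. The specific requirements are: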
This section sketches out a rough procedure to accomplish this task. You will
be evaluated based on these points (see Section 3 for detailed information).
(1) Install catdoc, a utility that converts Microsoft Word and PDF files into plain
text files.
(2) Write down the number of characters in each word, word by word, for the whole
input file. For example, if a sentence in the document is "In statistics, a power
law is a functional relationship between two quantities, where one quantity
varies as a power of another.", you should write something like "2 10 1 5
3 2 1 10 12 7 3 10 5 3 8 6 2 1 5 2 7".
(3) Analyze the data sequence with "Detrended fluctuation analysis" [2]. The
fluctuation at each scale L ∈ {10, 100, 1000, 10000} should be calculated.
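
Steps (2) and (3) can be sketched in shell roughly as follows. This is a minimal illustration and not necessarily the script described in this post. The function names word_lengths and dfa are hypothetical. It assumes that a word is any run of letters (so hyphens and apostrophes split words, and digits are dropped), that detrending is a first-order (linear) least-squares fit, and that windows do not overlap, with leftover points at the end of the sequence discarded.

```shell
#!/bin/sh
# Hypothetical helpers; names and conventions are assumptions, see the text above.

# Step (2): read text on stdin, print the length of every word on one line.
word_lengths() {
    # Turn every run of non-letter characters into a single newline,
    # so that each word sits on its own line, then print the line lengths.
    tr -cs '[:alpha:]' '\n' |
    awk 'length($0) > 0 { printf "%s%d", sep, length($0); sep = " " }
         END { print "" }'
}

# Step (3): read numbers on stdin, print the DFA fluctuation F(L) for window length $1.
dfa() {
    LC_ALL=C awk -v L="$1" '
        { for (i = 1; i <= NF; i++) { x[++n] = $i; total += $i } }
        END {
            # A linear fit needs at least 2 points, and we need one full window.
            if (L < 2 || n < L) { print "NaN"; exit }
            mean = total / n
            # Profile: cumulative sum of the mean-subtracted series.
            for (i = 1; i <= n; i++) { run += x[i] - mean; y[i] = run }
            windows = int(n / L)
            for (w = 0; w < windows; w++) {
                sk = sy = skk = sky = 0
                for (k = 1; k <= L; k++) {
                    v = y[w * L + k]
                    sk += k; sy += v; skk += k * k; sky += k * v
                }
                # Least-squares line through the L points of this window.
                slope = (L * sky - sk * sy) / (L * skk - sk * sk)
                icpt  = (sy - slope * sk) / L
                for (k = 1; k <= L; k++) {
                    r = y[w * L + k] - (slope * k + icpt)
                    ss += r * r
                }
            }
            # Root-mean-square residual over all windows.
            printf "%.6f\n", sqrt(ss / (windows * L))
        }'
}

# Example with a hypothetical file name:
#   catdoc report.doc | word_lengths > lengths.txt
#   for L in 10 100 1000 10000; do dfa "$L" < lengths.txt; done
```

When the sequence is shorter than the requested window, dfa prints NaN instead of a number, which is likely for L = 10000 on a short document.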