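Problem: count how often each word appears in a text
Write a script that reads a text file, counts how many times each word occurs, and then prints each word with its count, ordered from the highest count to the lowest.
My version: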
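# Open the input file for reading and the output file for writing.
set in_file [open "./in.f" r]
set out_file [open "./out.f" w]
# Extract every word (a run of word characters) into a list.
set in_list [regexp -all -inline {\w+} [read $in_file]]
close $in_file
# De-duplicated word list, in lsort's default string order.
set sort [lsort -unique $in_list]
foreach i $sort {
    # Count how many times word $i occurs in the full list.
    set m 0
    foreach j $in_list {
        if {$i eq $j} {
            incr m
        }
    }
    # Lines are written in word order; the counts themselves are not sorted.
    puts $out_file "$i $m"
}
close $out_file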
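The listing above writes its results in word order rather than by count. As a minimal sketch of how a list-only approach could add that step, assuming Tcl 8.5 or later, the counts can be collected as {word count} pairs and sorted on the second element. The proc name count_words_list is made up for illustration and is not part of the original script.

```tcl
# Hypothetical helper: count words using plain lists, then sort by count.
proc count_words_list {text} {
    set words [regexp -all -inline {\w+} $text]
    set pairs {}
    foreach w [lsort -unique $words] {
        # lsearch -all returns the index of every exact match,
        # so its length is the number of occurrences of $w.
        set n [llength [lsearch -all -exact $words $w]]
        lappend pairs [list $w $n]
    }
    # Sort the {word count} pairs on element 1 (the count), highest first.
    return [lsort -integer -decreasing -index 1 $pairs]
}

foreach pair [count_words_list "a b a c b a"] {
    lassign $pair word n
    puts "$word $n"
}
```

This still scans the whole word list once per unique word, so it is only a reasonable choice for small inputs.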
ChatGPT's answer:
set input_file "input.txt"
set word_freq [dict create]

# Read the file line by line and count every word.
set fin [open $input_file r]
while {[gets $fin line] != -1} {
    set words [regexp -inline -all {\w+} $line]
    foreach word $words {
        # dict incr treats a missing key as 0 before incrementing.
        dict incr word_freq $word
    }
}
close $fin

# Flatten the dict into {word count} pairs and sort by count, highest first.
set pairs {}
dict for {word freq} $word_freq {
    lappend pairs [list $word $freq]
}
foreach pair [lsort -integer -decreasing -index 1 $pairs] {
    lassign $pair word freq
    puts "$word :$freq"
}
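My version forgot the sorting step. The main difference is that I treated the text as a list, while the ChatGPT version treated it as a dict. For the definitions and usage of list and dict, see my other article: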
https://blog.csdn.net/LRRRUI/article/details/121715140
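To make the list versus dict contrast concrete, here is a compact sketch of the dict approach. It assumes Tcl 8.6 or later, because it relies on lsort -stride to sort the dict's flat "key value key value ..." form directly. The proc name count_words_dict is hypothetical.

```tcl
# Hypothetical helper: count words with a dict and return a flat
# "word count word count ..." list sorted by count, highest first.
# Assumes Tcl 8.6 or later for lsort -stride.
proc count_words_dict {text} {
    set freq [dict create]
    foreach w [regexp -all -inline {\w+} $text] {
        # A missing key is treated as 0, so no existence check is needed.
        dict incr freq $w
    }
    # Treat the dict as groups of 2 elements and sort on the second
    # element of each group (the count).
    return [lsort -stride 2 -index 1 -integer -decreasing $freq]
}

foreach {word n} [count_words_dict "a b a c b a"] {
    puts "$word $n"
}
```

The dict version passes over the words once, whereas the list version rescans the whole word list for every unique word, which is likely why the dict is the more common choice for this kind of counting.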