7-2 jmu-Java&Python: Count the Words in a Text and Sort Them by Frequency (25 points)
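Count the number of words in several lines of English text, and also count how many times each word occurs.
Note 1: Words are separated by spaces (one or more).
Note 2: Ignore empty lines and lines that contain only spaces.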
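The two notes above can be sketched as a small tokenizer. This is only an illustration: the class name TokenizeSketch and the method tokens are hypothetical and not part of the exercise, and the sketch treats any whitespace (not just spaces) as a separator.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the original exercise.
final class TokenizeSketch {
    // Splits one input line on runs of whitespace.
    // Returns an empty list for an empty line or a line of spaces only.
    static List<String> tokens(String line) {
        List<String> out = new ArrayList<>();
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return out; // nothing to count on this line
        }
        // "\\s+" matches one or more whitespace characters, so repeated
        // spaces between words do not produce empty tokens.
        for (String t : trimmed.split("\\s+")) {
            out.add(t);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokens("it  is   like a peek")); // [it, is, like, a, peek]
        System.out.println(tokens("     "));                // []
    }
}
```

Trimming before splitting matters here: without it, a line that starts with a space would yield an empty first token.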
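Basic version:
Counting is case-sensitive, and the specified punctuation marks are not removed.
Advanced version:
1. Before counting, remove the specified punctuation marks !.,:*? from the text.
2. Ignore case when counting words.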
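One possible way to apply the two advanced-version rules to a single token is a regular-expression character class followed by lowercasing. The names NormalizeSketch and normalize are hypothetical, and this is a sketch rather than the required approach.

```java
import java.util.Locale;

// Hypothetical helper, not part of the original exercise.
final class NormalizeSketch {
    // Deletes the marks ! . , : * ? and lowercases what remains.
    // Inside a character class [...] these characters are all literal.
    static String normalize(String word) {
        return word.replaceAll("[!.,:*?]", "").toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(normalize("yOu?,"));    // you
        System.out.println(normalize("hard-won")); // hard-won (the hyphen is not in the list)
        System.out.println(normalize("!").isEmpty()); // true: such a token should be skipped
    }
}
```

A token that consists only of the listed punctuation becomes an empty string, so the caller needs to skip empty results before counting.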
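Input
Several lines of English text. The line !!!!! marks the end of the input.
Output
The number of distinct words.
The 10 most frequent words and their counts (sorted by count in descending order; words with the same count are sorted by key in ascending alphabetical order).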
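The ordering rule above (count descending, then word ascending) can be expressed with composed comparators. The following is a minimal sketch under that reading of the rule; RankSketch and top are hypothetical names, and the counts in main are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the original exercise.
final class RankSketch {
    // Returns up to `limit` lines of the form word=count,
    // ordered by count (descending) and then by word (ascending).
    static List<String> top(Map<String, Integer> counts, int limit) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort(
            Map.Entry.<String, Integer>comparingByValue(Comparator.reverseOrder())
                .thenComparing(Map.Entry.comparingByKey()));
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Integer> e : entries) {
            if (out.size() == limit) {
                break;
            }
            out.add(e.getKey() + "=" + e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("you", 3);
        counts.put("and", 2);
        counts.put("the", 4);
        counts.put("it", 3);
        System.out.println(top(counts, 3)); // [the=4, it=3, you=3]
    }
}
```

The tie between it and you is broken by the key comparator, which is why it is listed first.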
Sample input 1
failure is probably the fortification in your pole
it is like a peek your wallet as the thief when you
are thinking how to spend several hard-won lepta
when you are wondering whether new money it has laid
background because of you then at the heart of the
most lax alert and most low awareness and left it
godsend failed
!!!!!
Sample output 1
46
the=4
it=3
you=3
and=2
are=2
is=2
most=2
of=2
when=2
your=2
Sample input 2
Failure is probably The fortification in your pole!
It is like a peek your wallet as the thief when You
are thinking how to. spend several hard-won lepta.
when yoU are? wondering whether new money it has laid
background Because of: yOu?, then at the heart of the
Tom say: Who is the best? No one dare to say yes.
most lax alert and! most low awareness and* left it
godsend failed
!!!!!
Sample output 2
54
the=5
is=3
it=3
you=3
and=2
are=2
most=2
of=2
say=2
to=2
Solution (implements the advanced version):
import java.util.*;

public class Main {
    // Normalizes one token: removes the punctuation marks !.,:*? and lowercases the rest.
    public static String formate(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '!' || c == '.' || c == ',' || c == ':' || c == '*' || c == '?') {
                continue; // skip the specified punctuation
            }
            sb.append(c);
        }
        return sb.toString().toLowerCase(); // convert back to String, then lowercase
    }
    public static void main(String[] args) {
        Scanner sc = new Scanner(System.in);
        Map<String, Integer> hashmap = new HashMap<String, Integer>();
        // Step 1: read the input and count the normalized words
        while (sc.hasNextLine()) {
            String ss = sc.nextLine();
            if (ss.equals("!!!!!")) {
                break; // end-of-input marker
            }
            if (ss.trim().isEmpty()) {
                continue; // ignore empty lines and lines of spaces
            }
            // Split on single spaces; consecutive spaces yield empty tokens, which are skipped below
            String[] lineWords = ss.split(" ");
            for (int i = 0; i < lineWords.length; i++) {
                String str = formate(lineWords[i]);
                if (str.length() == 0) {
                    continue; // empty token, or a token made only of punctuation
                }
                if (!hashmap.containsKey(str)) {
                    hashmap.put(str, 1);
                } else {
                    hashmap.put(str, hashmap.get(str) + 1);
                }
            }
        }
        // Step 2: sort by count (descending), then by word (ascending)
        List<Map.Entry<String, Integer>> lis = new ArrayList<Map.Entry<String, Integer>>(hashmap.entrySet());
        Collections.sort(lis, new Comparator<Map.Entry<String, Integer>>() {
            @Override
            public int compare(Map.Entry<String, Integer> e1, Map.Entry<String, Integer> e2) {
                if (!e1.getValue().equals(e2.getValue())) {
                    return e2.getValue().compareTo(e1.getValue()); // different counts: descending
                }
                return e1.getKey().compareTo(e2.getKey()); // same count: ascending by word
            }
        });
        // Step 3: print the number of distinct words, then the top 10 entries
        System.out.println(hashmap.size());
        int num = 0;
        for (Map.Entry<String, Integer> map : lis) {
            System.out.println(map.getKey() + "=" + map.getValue());
            num++;
            if (num == 10) {
                break;
            }
        }
        sc.close();
    }
}