# Problem Description

## Test Case 1

### Sample Input

failure is probably the fortification in your pole

it is like a peek your wallet as the thief when you
are thinking how to spend several hard-won lepta

when you are wondering whether new money it has laid
background because of you then at the heart of the

most lax alert and most low awareness and left it

godsend failed
!!!!!


### Sample Output

46
the=4
it=3
you=3
and=2
are=2
is=2
most=2
of=2
when=2
your=2


## Test Case 2

### Sample Input

Failure is probably The fortification in your pole!

It is like a peek your wallet as the thief when You
are thinking how to. spend several hard-won lepta.

when yoU are? wondering whether new money it has laid
background Because of: yOu?, then at the heart of the
Tom say: Who is the best? No one dare to say yes.
most lax alert and! most low awareness and* left it

godsend failed
!!!!!


### Sample Output

54
the=5
is=3
it=3
you=3
and=2
are=2
most=2
of=2
say=2
to=2


# Problem Analysis

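1. Reading words and stripping punctuation: the program has to read an unknown number of input lines and split each line into words. The input contains plenty of punctuation and spaces, and a punctuation mark may sit directly next to a word, so simply splitting the string is not enough.
2. Counting word frequencies: every distinct word in the input needs its own count.
3. Sorting by frequency and then lexicographically: the counts are sorted "in descending order of frequency, and for equal frequencies in ascending alphabetical order of the key", and then printed.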
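
To make point 1 concrete, here is a minimal sketch that splits one line taken from the second sample input on single spaces and prints the raw tokens. The class name `NaiveSplitDemo` and the method `naiveSplit` are made up for this illustration and are not part of the solution.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, only used to show the problem with a plain split.
class NaiveSplitDemo {
    // Split on single spaces only, leaving punctuation and letter case untouched.
    static List<String> naiveSplit(String line) {
        return Arrays.asList(line.split(" "));
    }

    public static void main(String[] args) {
        String line = "background Because of: yOu?, then at the heart of the";
        // Tokens such as "of:" and "yOu?," still carry punctuation and mixed case,
        // so they would be counted separately from "of" and "you".
        System.out.println(naiveSplit(line));
    }
}
```

Each token therefore has to be cleaned and lowercased before it can be used as a key in the frequency table.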

# Removing Punctuation

    // Drops the punctuation marks ! . , : * ? and returns the rest in lower case.
    public static String removePunctuation(String str) {
        StringBuilder strbld = new StringBuilder();
        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            if (c != '!' && c != '.' && c != ',' && c != ':' && c != '*' && c != '?') {
                strbld.append(c);
            }
        }
        return strbld.toString().toLowerCase();
    }
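
As an alternative sketch, the same filtering can be written with a regular-expression character class instead of a character-by-character loop. This is a swapped-in variant rather than the code used above; the class name `PunctuationDemo` is hypothetical.

```java
// Hypothetical helper class; a regex-based variant of removePunctuation.
class PunctuationDemo {
    // Delete the same six characters (! . , : * ?) and lowercase what is left.
    static String removePunctuation(String str) {
        return str.replaceAll("[!.,:*?]", "").toLowerCase();
    }

    public static void main(String[] args) {
        System.out.println(removePunctuation("yOu?,"));    // you
        System.out.println(removePunctuation("and*"));     // and
        System.out.println(removePunctuation("hard-won")); // hard-won (the hyphen is kept)
    }
}
```

Inside a character class the characters `.`, `*` and `?` are literals, so no escaping is needed. A token made up entirely of these marks comes back as an empty string, which the caller has to skip.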


# Counting Word Frequencies

    Scanner sc = new Scanner(System.in);
    Map<String, Integer> fre_map = new HashMap<String, Integer>();
    while (sc.hasNextLine()) {    // also stops cleanly if the input ends without the marker
        String str = sc.nextLine();
        if (str.equals("!!!!!")) {    // end-of-input marker
            break;
        }
        if (str.isEmpty()) {    // skip blank lines
            continue;
        }
        String[] words = str.split(" ");    // split the line into individual tokens
        for (int i = 0; i < words.length; i++) {
            String a_word = Main.removePunctuation(words[i]);    // the token with punctuation removed
            if (a_word == null || a_word.length() == 0) {
                continue;
            }
            if (!fre_map.containsKey(a_word)) {    // word not counted yet: create a new mapping
                fre_map.put(a_word, 1);
            }
            else {    // word already counted: update its count
                int num = fre_map.get(a_word) + 1;
                fre_map.put(a_word, num);
            }
        }
    }
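
The `containsKey`/`put` pair can also be collapsed into a single `Map.merge` call. The sketch below assumes the lines are already held in a `List<String>` rather than read from `System.in`, and inlines a regex version of the cleaning step so that it is self-contained; `WordCountDemo` and `countWords` are hypothetical names.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; counts words from lines that are already in memory.
class WordCountDemo {
    static Map<String, Integer> countWords(List<String> lines) {
        Map<String, Integer> freq = new HashMap<>();
        for (String line : lines) {
            if (line.equals("!!!!!")) {
                break; // end-of-input marker
            }
            for (String token : line.split("\\s+")) {
                // Same cleaning rule as removePunctuation, inlined for this sketch.
                String word = token.replaceAll("[!.,:*?]", "").toLowerCase();
                if (word.isEmpty()) {
                    continue;
                }
                // merge stores 1 for a new word, otherwise adds 1 to the stored count.
                freq.merge(word, 1, Integer::sum);
            }
        }
        return freq;
    }

    public static void main(String[] args) {
        Map<String, Integer> freq =
            countWords(Arrays.asList("It is like a peek", "", "it IS it!", "!!!!!", "ignored"));
        System.out.println(freq.get("it")); // 3
        System.out.println(freq.get("is")); // 2
    }
}
```

Splitting on `\\s+` instead of a single space is a small deviation from the code above: it treats runs of spaces and tabs as one separator, and a blank line still yields only an empty token that is skipped.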


# Sorting the Words

    // Copy the entry view of fre_map into an ArrayList so that it can be sorted
    List<Map.Entry<String, Integer>> fre_list = new ArrayList<Map.Entry<String, Integer>>(fre_map.entrySet());
    Collections.sort(fre_list, new WordComparator());

    System.out.println(fre_list.size());    // number of distinct words
    int num = 0;
    for (Map.Entry<String, Integer> e : fre_list) {
        System.out.println(e.getKey() + "=" + e.getValue());
        if (++num == 10) {    // print at most the first 10 entries
            break;
        }
    }


    class WordComparator implements Comparator<Map.Entry<String, Integer>> {

        @Override
        public int compare(Map.Entry<String, Integer> o1, Map.Entry<String, Integer> o2) {
            if (o1.getValue().equals(o2.getValue()))    // same frequency: sort lexicographically
                return o1.getKey().compareTo(o2.getKey());
            else    // different frequencies: higher frequency first
                return o2.getValue().compareTo(o1.getValue());
        }
    }
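
On Java 8 or later, the same ordering can be expressed without a separate comparator class by chaining the comparators that `Map.Entry` and `Comparator` already provide. This is a sketch of an equivalent alternative to `WordComparator`, not the code used above; `SortDemo` and `sortByFrequency` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; sorts entries by count descending, then by word ascending.
class SortDemo {
    static List<Map.Entry<String, Integer>> sortByFrequency(Map<String, Integer> freq) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(freq.entrySet());
        entries.sort(
            Map.Entry.<String, Integer>comparingByValue(Comparator.reverseOrder())
                .thenComparing(Map.Entry.comparingByKey()));
        return entries;
    }

    public static void main(String[] args) {
        Map<String, Integer> freq = new HashMap<>();
        freq.put("you", 3);
        freq.put("the", 4);
        freq.put("and", 2);
        freq.put("it", 3);
        // Prints the=4, it=3, you=3, and=2, one entry per line.
        for (Map.Entry<String, Integer> e : sortByFrequency(freq)) {
            System.out.println(e.getKey() + "=" + e.getValue());
        }
    }
}
```

The explicit type witness `<String, Integer>` on `comparingByValue` helps the compiler infer the entry type before `thenComparing` is chained onto it.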

posted @ 2021-01-15 02:02  乌漆WhiteMoon