wcPro
1. GitHub repository:
https://github.com/huangjianmin/wcPro/
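2. Team roles:
Huang Jianmin (黄健民): core functions (word splitting, word-frequency statistics); Yao Jingang (姚金港): input control; Sun Zhichao (孙智超): output control; Mei Zhichao (梅智超): main function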
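3. PSP table:
PSP2.1 table
| PSP2.1 | PSP stage | Estimated time (minutes) | Actual time (minutes) |
|---|---|---|---|
| Planning | Planning | 10 | 10 |
| · Estimate | · Estimate how long the task will take | 5 | 5 |
| Development | Development | 20 | 15 |
| · Analysis | · Requirements analysis (including learning new technologies) | 30 | 20 |
| · Design Spec | · Write the design document | 20 | 0 |
| · Design Review | · Design review (review the design document with colleagues) | 20 | 0 |
| · Coding Standard | · Coding standard (define suitable conventions for the current development) | 15 | 15 |
| · Design | · Detailed design | 60 | 60 |
| · Coding | · Coding | 480 | 360 |
| · Code Review | · Code review | 60 | 30 |
| · Test | · Testing (self-testing, fixing code, committing changes) | 60 | 40 |
| Reporting | Reporting | 30 | 40 |
| · Test Report | · Test report | 30 | 20 |
| · Size Measurement | · Measure the amount of work | 5 | 5 |
| · Postmortem & Process Improvement Plan | · Postmortem and process improvement plan | 20 | 0 |
| Total | | 865 | 620 |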
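4. Interface implementation:
4.1 Word splitting:
This feature has two parts. The first strips non-letter characters and splits the input into words; the second counts how many times each distinct word occurs.
Implementation of the first part: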
// Fragment of the word-splitting method; targetStr is the input string.
// Needs java.util.ArrayList.
int curPosition = 0;
ArrayList<String> words = new ArrayList<String>(); // words split out of targetStr
while (curPosition < targetStr.length()) {
    // Skip characters that are neither ASCII letters nor '-'.
    while (curPosition < targetStr.length()
            && !(targetStr.charAt(curPosition) >= 'a' && targetStr.charAt(curPosition) <= 'z'
                || targetStr.charAt(curPosition) >= 'A' && targetStr.charAt(curPosition) <= 'Z'
                || targetStr.charAt(curPosition) == '-')) {
        curPosition++;
    }
    // Collect the following run of letters and hyphens as one word.
    boolean isWord = false;
    StringBuilder thisWord = new StringBuilder();
    while (curPosition < targetStr.length()
            && (targetStr.charAt(curPosition) >= 'a' && targetStr.charAt(curPosition) <= 'z'
                || targetStr.charAt(curPosition) >= 'A' && targetStr.charAt(curPosition) <= 'Z'
                || targetStr.charAt(curPosition) == '-')) {
        isWord = true;
        thisWord.append(targetStr.charAt(curPosition));
        curPosition++;
    }
    if (isWord) { // add to words only when at least one word character was read
        words.add(thisWord.toString());
    }
}
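When a non-letter character is encountered, the index simply advances by one and the character is skipped. To split out a word, an empty thisWord is prepared first; while the character at the index is a letter or a hyphen, it is appended to thisWord and the index advances by one. When the index reaches a character that is neither a letter nor a hyphen, the word is complete and the contents of thisWord are added to the words list; thisWord starts out empty again on the next pass of the outer loop. Repeating this until the end of the input splits it into words.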
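The splitting rule described above can also be written as a single pass over the input. The following is only a minimal sketch, not the project's code: the class name WordSplitSketch and the helpers isWordChar and splitWords are hypothetical names introduced for illustration, and the rule assumed is the one stated above (ASCII letters and '-' belong to a word, everything else separates words).

```java
import java.util.ArrayList;
import java.util.List;

class WordSplitSketch {
    /** Assumed rule: a word character is an ASCII letter or a hyphen. */
    static boolean isWordChar(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '-';
    }

    /** Splits text into maximal runs of word characters. */
    static List<String> splitWords(String text) {
        List<String> words = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (isWordChar(c)) {
                current.append(c);
            } else if (current.length() > 0) {
                // A separator ends the word collected so far.
                words.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            // The input ended in the middle of a word.
            words.add(current.toString());
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(splitWords("that, that a -this 42 is-"));
        // prints [that, that, a, -this, is-]
    }
}
```

Note that under this rule a lone "-" also counts as a word, which is why the sorting step later filters it out.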
Implementation of the second part:
int size = words.size(); // number of (word, count) pairs currently in wordNum
// wordNum stores the pairs flattened: even index = word, odd index = its count as a string.
ArrayList<String> wordNum = new ArrayList<String>();
for (int i = 0; i < size; i++) {
    wordNum.add(words.get(i));
    wordNum.add("1");
}
for (int i = 0; i < size; i++) {
    for (int j = i + 1; j < size; j++) {
        if (wordNum.get(2 * i).equals(wordNum.get(2 * j))) {
            // Pair j repeats word i: increment the count of i and delete pair j.
            int a = Integer.parseInt(wordNum.get(2 * i + 1)) + 1;
            wordNum.set(2 * i + 1, Integer.toString(a));
            wordNum.remove(2 * j + 1);
            wordNum.remove(2 * j);
            j--;       // re-check the pair that has shifted into position j
            size -= 1; // one pair fewer
        }
    }
}
return wordNum;
}
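How this code works: every word in words is stored in an even-indexed slot of wordNum, and each odd-indexed slot holds the count for the word in the slot before it, initially 1. The loop starts from the word in the first even slot and compares it with every later word. When a later word is identical, the count after the earlier word is incremented by 1 and the later word and its count are deleted, so the list shrinks by 2 elements (and the pair counter size by 1). The loop continues until every word in wordNum has been scanned.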
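The pairwise comparison above takes time quadratic in the number of words. As an alternative, and only as a sketch that is not part of the project, the same flattened (word, count) list can be produced in one pass with a LinkedHashMap, which keeps words in first-seen order. The names WordCountSketch and countWords are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class WordCountSketch {
    /** Returns [word1, count1, word2, count2, ...] in order of first occurrence. */
    static List<String> countWords(List<String> words) {
        // LinkedHashMap preserves insertion order, matching the order of the list-based version.
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String w : words) {
            counts.merge(w, 1, Integer::sum); // insert 1, or add 1 to the existing count
        }
        List<String> flat = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            flat.add(e.getKey());
            flat.add(String.valueOf(e.getValue()));
        }
        return flat;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("that", "that", "a", "this", "this", "a", "this");
        System.out.println(countWords(words));
        // prints [that, 2, a, 2, this, 3]
    }
}
```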
4.2 Word-frequency statistics:
The code is as follows:
// Requires: import java.util.*;
public static ArrayList<String> sort(ArrayList<String> str) {
    // Step 1: read the flattened (word, count) pairs into a map.
    HashMap<String, Integer> map = new HashMap<String, Integer>();
    for (int i = 0; i + 1 < str.size(); i += 2) {
        map.put(str.get(i), Integer.parseInt(str.get(i + 1)));
    }
    // Step 2: sort.
    // First by key, so that words with equal counts end up in key order.
    TreeMap<String, Integer> treemap = new TreeMap<String, Integer>(map);
    // Then by value; Collections.sort is stable, so the key order survives ties.
    ArrayList<Map.Entry<String, Integer>> list =
            new ArrayList<Map.Entry<String, Integer>>(treemap.entrySet());
    Collections.sort(list, new Comparator<Map.Entry<String, Integer>>() {
        public int compare(Map.Entry<String, Integer> o1,
                Map.Entry<String, Integer> o2) {
            // Descending; swap o1 and o2 for ascending.
            return o2.getValue().compareTo(o1.getValue());
        }
    });
    // Step 3: write the pairs back, skipping the empty string and a lone "-".
    // Clearing first avoids stale trailing entries when a pair is skipped.
    str.clear();
    for (Map.Entry<String, Integer> entry : list) {
        if (!entry.getKey().equals("") && !entry.getKey().equals("-")) {
            str.add(entry.getKey());
            str.add(entry.getValue().toString());
        }
    }
    return str;
}
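Implementation: the ArrayList is first converted into a HashMap. A TreeMap then orders the entries by word, Collections.sort reorders them by count in descending order, and the result is written back into the ArrayList for output. Because Collections.sort is stable, words with equal counts remain in key order.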
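The two-stage ordering (count descending, ties broken by word) can also be expressed with a single comparator. The sketch below is illustrative only and is not the project's code; FrequencySortSketch and sortByFrequency are hypothetical names, and the input is assumed to be a word-to-count map rather than the flattened list.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class FrequencySortSketch {
    /** Returns [word1, count1, ...] sorted by count descending, then by word ascending. */
    static List<String> sortByFrequency(Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue()); // higher count first
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> flat = new ArrayList<>();
        for (Map.Entry<String, Integer> e : entries) {
            flat.add(e.getKey());
            flat.add(String.valueOf(e.getValue()));
        }
        return flat;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("that", 2);
        counts.put("a", 2);
        counts.put("this", 3);
        System.out.println(sortByFrequency(counts));
        // prints [this, 3, a, 2, that, 2]
    }
}
```

String.compareTo compares character codes, so under this ordering a word starting with '-' sorts before any word starting with a letter when the counts are equal.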
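5. Test cases:
The tests exercise the word splitting and counting function and the word-frequency function, and consider the following aspects:
a. When the input contains non-letter symbols, does word counting strip them?
b. Is the count for each word correct?
c. Is the word-frequency ordering correct?
d. Is a hyphen '-' at the beginning of a sentence handled?
Twenty test cases were designed to cover these four aspects.
Test code: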
import java.util.ArrayList;
import java.util.Arrays;
import org.junit.Test; // assuming JUnit 4

public class wordFreTest {
    // Helper added for testSort: turns "that,2,a,1" into the flattened list [that, 2, a, 1].
    private static ArrayList<String> pairs(String csv) {
        return new ArrayList<String>(Arrays.asList(csv.split(",")));
    }

    @Test
    public void testDiviWord() {
        // Each call prints the actual result, then the expected result for visual comparison.
        System.out.println(wordFre.diviWord("that,that,a,this") + "\nthat,2,a,1,this,1\n");
        System.out.println(wordFre.diviWord("that,that,a,this,this") + "\nthat,2,a,1,this,2\n");
        System.out.println(wordFre.diviWord("that,that,a,this,this,a,this") + "\nthat,2,a,2,this,3\n");
        System.out.println(wordFre.diviWord("that,that,a,-this,this,a,this") + "\nthat,2,a,2,-this,1,this,2\n");
        System.out.println(wordFre.diviWord("a,aa,aaa,aaa,aa,aaa") + "\na,1,aa,2,aaa,3\n");
        System.out.println(wordFre.diviWord("a,-aa,-aaa,aaa,aa,aaa") + "\na,1,-aa,1,-aaa,1,aaa,2,aa,1\n");
        System.out.println(wordFre.diviWord("-a,that,a,this") + "\n-a,1,that,1,a,1,this,1\n");
        System.out.println(wordFre.diviWord("that,that,a,this,this,is-") + "\nthat,2,a,1,this,2,is-,1\n");
        System.out.println(wordFre.diviWord("that,that,that,a,this,this,a,this") + "\nthat,3,a,2,this,3\n");
        System.out.println(wordFre.diviWord("that,that,a,-this,this,a,-this") + "\nthat,2,a,2,-this,2,this,1\n");
    }

    @Test
    public void testSort() {
        // Calls sort (not diviWord) on a flattened (word, count) list.
        System.out.println(wordFre.sort(pairs("that,2,a,1,this,1")) + "\nthat,2,a,1,this,1\n");
        System.out.println(wordFre.sort(pairs("that,2,a,1,this,2")) + "\nthat,2,this,2,a,1\n");
        System.out.println(wordFre.sort(pairs("that,2,a,2,this,3")) + "\nthis,3,a,2,that,2\n");
        System.out.println(wordFre.sort(pairs("that,2,a,2,-this,1,this,2")) + "\na,2,that,2,this,2,-this,1\n");
        System.out.println(wordFre.sort(pairs("a,1,aa,2,aaa,3")) + "\naaa,3,aa,2,a,1\n");
        System.out.println(wordFre.sort(pairs("a,1,-aa,1,-aaa,1,aaa,2,aa,1")) + "\naaa,2,a,1,aa,1,-aa,1,-aaa,1\n");
        System.out.println(wordFre.sort(pairs("-a,1,that,1,a,1,this,1")) + "\na,1,that,1,this,1,-a,1\n");
        System.out.println(wordFre.sort(pairs("that,2,a,1,this,2,is-,1")) + "\nthat,2,this,2,a,1,is-,1\n");
        System.out.println(wordFre.sort(pairs("that,3,a,2,this,3")) + "\nthat,3,this,3,a,2\n");
        System.out.println(wordFre.sort(pairs("that,2,a,2,-this,2,this,1")) + "\na,2,that,2,-this,2,this,1\n");
    }
}
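The tests above print the actual and expected values side by side and leave the comparison to the reader. A small helper could make the comparison automatic. This is a sketch under assumptions, not part of the project: ExpectationCheckSketch and matches are hypothetical names, and the list in main is a stand-in for a value returned by diviWord or sort.

```java
import java.util.Arrays;
import java.util.List;

class ExpectationCheckSketch {
    /** True when the flattened list, joined with commas, equals the expected string. */
    static boolean matches(List<String> actual, String expected) {
        return String.join(",", actual).equals(expected);
    }

    public static void main(String[] args) {
        // Stand-in for a result such as the one returned for "that,that,a,this".
        List<String> actual = Arrays.asList("that", "2", "a", "1", "this", "1");
        System.out.println(matches(actual, "that,2,a,1,this,1")); // prints true
        System.out.println(matches(actual, "that,2,this,2,a,1")); // prints false
    }
}
```

Inside a JUnit test, the boolean returned by such a helper could be passed to an assertion so that a mismatch fails the test instead of only being printed.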
Test run screenshot:
Each pair below shows the actual result on the first line and the expected string on the second. The first ten pairs were printed by testSort and the last ten by testDiviWord. This run was recorded with testSort passing its inputs to diviWord, which strips the digits, so every count in its actual results is 1.

[that, 1, a, 1, this, 1]
that,2,a,1,this,1

[that, 1, a, 1, this, 1]
that,2,this,2,a,1

[that, 1, a, 1, this, 1]
this,3,a,2,that,2

[that, 1, a, 1, -this, 1, this, 1]
a,2,that,2,this,2,-this,1

[a, 1, aa, 1, aaa, 1]
aaa,3,aa,2,a,1

[na, 1, -aa, 1, -aaa, 1, aaa, 1, aa, 1]
aaa,2,a,1,aa,1,-aa,1,-aaa,1

[-a, 1, that, 1, a, 1, this, 1]
a,1,that,1,this,1,-a,1

[that, 1, a, 1, this, 1, is-, 1]
that,2,this,2,a,1,is-,1

[that, 1, a, 1, this, 1]
that,3,this,3,a,2

[that, 1, a, 1, -this, 1, this, 1]
a,2,that,2,-this,2,this,1

[that, 2, a, 1, this, 1]
that,2,a,1,this,1

[that, 2, a, 1, this, 2]
that,2,a,1,this,2

[that, 2, a, 2, this, 3]
that,2,a,2,this,3

[that, 2, a, 2, -this, 1, this, 2]
that,2,a,2,-this,1,this,2

[a, 1, aa, 2, aaa, 3]
a,1,aa,2,aaa,3

[a, 1, -aa, 1, -aaa, 1, aaa, 2, aa, 1]
a,1,-aa,1,-aaa,1,aaa,2,aa,1

[-a, 1, that, 1, a, 1, this, 1]
-a,1,that,1,a,1,this,1

[that, 2, a, 1, this, 2, is-, 1]
that,2,a,1,this,2,is-,1

[that, 3, a, 2, this, 3]
that,3,a,2,this,3

[that, 2, a, 2, -this, 2, this, 1]
that,2,a,2,-this,2,this,1