wcPro

1. GitHub address:

https://github.com/huangjianmin/wcPro/

2. Team roles:

黄健民: core functions (word splitting, word frequency statistics); 姚金港: input control; 孙智超: output control; 梅智超: main function

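3. PSP table:

PSP2.1 table

PSP2.1	PSP stage	Estimated time (minutes)	Actual time (minutes)
Planning	Planning	10	10
· Estimate	· Estimate how long the task will take	5	5
Development	Development	20	15
· Analysis	· Requirements analysis (including learning new technologies)	30	20
· Design Spec	· Produce the design document	20	0
· Design Review	· Design review (review the design document with colleagues)	20	0
· Coding Standard	· Coding standard (set suitable conventions for the current development)	15	15
· Design	· Detailed design	60	60
· Coding	· Coding	480	360
· Code Review	· Code review	60	30
· Test	· Testing (self-testing, fixing code, committing changes)	60	40
Reporting	Reporting	30	40
· Test Report	· Test report	30	20
· Size Measurement	· Measure the workload	5	5
· Postmortem & Process Improvement Plan	· Postmortem and process improvement plan	20	0
	Total	865	620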

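4. Interface implementation:

4.1 Word splitting:

This function has two parts. The first strips non-letter characters and splits the text into words; the second counts how many times each distinct word occurs.

Code for the first part: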

		int curPosition=0;
		ArrayList<String> words=new ArrayList<String>();  // the words found so far
		while(curPosition<targetStr.length())
		{
			// skip characters that are neither letters nor '-'
			while(curPosition<targetStr.length()&&!(targetStr.charAt(curPosition)>='a'&&targetStr.charAt(curPosition)<='z'||targetStr.charAt(curPosition)>='A'&&targetStr.charAt(curPosition)<='Z'||targetStr.charAt(curPosition)=='-'))
			{
				curPosition++;
			}
			boolean isWord=false;
			String thisWord="";
			// collect a run of letters and '-' into thisWord
			while (curPosition<targetStr.length()&&(targetStr.charAt(curPosition)>='a'&&targetStr.charAt(curPosition)<='z'||targetStr.charAt(curPosition)>='A'&&targetStr.charAt(curPosition)<='Z'||targetStr.charAt(curPosition)=='-'))
			{
				isWord=true;
				thisWord+=targetStr.charAt(curPosition);
				curPosition++;
			}
			if (isWord)  // add the word only if at least one character was read
			{
				words.add(thisWord);
			}
		}

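When a non-letter character is encountered, the position is advanced by 1 to skip it. To split out a word, an empty string thisWord is prepared first; while the character at the current position is a letter or a hyphen, it is appended to thisWord and the position advances by 1. When the character read is neither a letter nor a hyphen, the word is complete and the contents of thisWord are added to the words list; thisWord is reset to the empty string at the start of the next iteration. Because '-' is treated like a letter, tokens such as "-this", "is-", and even a lone "-" are recorded as words.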
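The same splitting rule can also be expressed with a regular expression. The following is a minimal sketch, not the project's code: the class name WordSplitSketch and the method split are hypothetical, and it assumes that a word is any maximal run of ASCII letters and hyphens, as in the loop above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration; not part of wcPro.
final class WordSplitSketch {
    // Assumption: a word is a maximal run of ASCII letters and '-'.
    private static final Pattern WORD = Pattern.compile("[A-Za-z-]+");

    // Returns the words of text in the order they appear.
    static List<String> split(String text) {
        List<String> words = new ArrayList<>();
        Matcher m = WORD.matcher(text);
        while (m.find()) {
            words.add(m.group());
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(split("that,that,a,-this,this,is-"));
        // prints [that, that, a, -this, this, is-]
    }
}
```

Every character outside the class, including digits and punctuation, acts as a separator, which matches the skip loop in the original code.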

Code for the second part:

		int size=words.size();
		// wordNum stores each word at an even index and its count at the following odd index
		ArrayList<String> wordNum=new ArrayList<String>();
		for(int i=0;i<size;i++) {
			String aa=words.get(i);
			wordNum.add(2*i, aa);
			wordNum.add(2*i+1,"1");
		}

		for(int i=0;i<size;i++) {
			for(int j=i+1;j<size;j++)
				if(wordNum.get(2*i).equals(wordNum.get(2*j)))
				{
					// duplicate found: increment the count of word i and remove pair j
					int a=Integer.parseInt(wordNum.get(2*i+1))+1;
					wordNum.set(2*i+1,Integer.toString(a));
					wordNum.remove(2*j+1);
					wordNum.remove(2*j);
					j--;      // pair j now holds the next word, so check it again
					size-=1;  // one fewer (word, count) pair remains
				}
		}
		return wordNum;
	}

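How this code works: every word in words is stored at an even index of wordNum, and the odd index that follows holds that word's count, initially 1. The outer loop starts from the first word, and the inner loop compares it with each later word. When a match is found, the count after the first occurrence is incremented and the later word and its count are removed, so wordNum shrinks by two elements and the pair counter size decreases by 1. The loops continue until every remaining word in wordNum has been scanned.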
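The pairwise comparison above takes time quadratic in the number of words. As a point of comparison, here is a minimal sketch of the same counting step using a map. It is a swapped-in technique rather than the project's method: the class name WordCountSketch and the method count are hypothetical, and a LinkedHashMap is assumed so that words keep their first-occurrence order, as they do in the original loop.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class for illustration; not part of wcPro.
final class WordCountSketch {
    // Returns a flat list [word, count, word, count, ...] in first-occurrence order.
    static List<String> count(List<String> words) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String w : words) {
            counts.merge(w, 1, Integer::sum); // insert 1, or add 1 to the existing count
        }
        List<String> flat = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            flat.add(e.getKey());
            flat.add(String.valueOf(e.getValue()));
        }
        return flat;
    }

    public static void main(String[] args) {
        System.out.println(count(Arrays.asList("that", "that", "a", "-this", "this", "a", "this")));
        // prints [that, 2, a, 2, -this, 1, this, 2]
    }
}
```

The flat output format is kept only so the result has the same shape as wordNum; each word is visited once instead of being compared with every later word.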

4.2 Word frequency statistics:

The code is as follows:

// requires: import java.util.*;
public static ArrayList<String> sort(ArrayList<String> str){
		// Step 1: convert the flat [word, count, word, count, ...] list into a map
		HashMap<String, Integer> map=new HashMap<> ();
		int i=0;
		while(i<str.size()-1) {
			map.put(str.get(i), Integer.parseInt(str.get(++i)));
			i++;
		}
		// Step 2: sort
		// order by key first
		TreeMap<String, Integer> treemap = new TreeMap<String, Integer>(map);
		// then order by value
		ArrayList<Map.Entry<String, Integer>> list = new ArrayList<Map.Entry<String, Integer>>(
				treemap.entrySet());
		Collections.sort(list, new Comparator<Map.Entry<String, Integer>>() {
			public int compare(Map.Entry<String, Integer> o1,
					Map.Entry<String, Integer> o2) {
				// descending; for ascending, swap o1 and o2
				return Integer.compare(o2.getValue(), o1.getValue());
			}
		});
		// Step 3: collect the sorted entries into a new list, so that no
		// stale entries remain when "" or "-" is filtered out
		ArrayList<String> result = new ArrayList<String>();
		for (Map.Entry<String, Integer> entry : list) {
			// skip the empty string and a lone "-"
			if (!(entry.getKey().equals(""))
					&& !(entry.getKey().equals("-"))) {
				result.add(entry.getKey());
				result.add(entry.getValue().toString());
			}
		}
		return result;
	}

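Implementation: the flat ArrayList is first converted to a HashMap. A TreeMap then orders the entries by key, and Collections.sort with a comparator orders them by count in descending order. Because that sort is stable, words with equal counts stay in key order. The result is converted back to an ArrayList for output.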
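The two ordering steps can also be combined into one comparator. The sketch below is a hedged illustration, not the project's code: the class name FrequencySortSketch and the method sortByFrequency are hypothetical. It assumes the input is a flat [word, count, ...] list with no repeated words, and it returns a new list instead of modifying its argument.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical helper class for illustration; not part of wcPro.
final class FrequencySortSketch {
    // Sorts by count (descending), then by word (ascending), skipping "" and "-".
    static List<String> sortByFrequency(List<String> flat) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>();
        for (int i = 0; i + 1 < flat.size(); i += 2) {
            String word = flat.get(i);
            if (word.isEmpty() || word.equals("-")) {
                continue; // same filter as the original sort method
            }
            entries.add(new AbstractMap.SimpleEntry<>(word, Integer.parseInt(flat.get(i + 1))));
        }
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue()); // higher counts first
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey()); // ties by word
        });
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> e : entries) {
            result.add(e.getKey());
            result.add(e.getValue().toString());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(sortByFrequency(
                Arrays.asList("that", "2", "a", "2", "-this", "1", "this", "2")));
        // prints [a, 2, that, 2, this, 2, -this, 1]
    }
}
```

Breaking ties with compareTo gives the same order as the TreeMap step followed by a stable sort.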

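5. Test cases:

The tests for the word splitting and counting function and for the word frequency function consider the following aspects:

a. Whether non-letter symbols in the input are stripped during word counting

b. Whether the count for each word is correct

c. Whether the word frequency function is correct

d. Whether a leading hyphen '-' is handled

20 test cases were designed to cover these 4 aspects.

Test code: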

public class wordFreTest {

	@Test
	public void testDiviWord() {
		//fail("Not yet implemented");
		System.out.println(wordFre.diviWord("that,that,a,this")+"\nthat,2,a,1,this,1\n");
		System.out.println(wordFre.diviWord("that,that,a,this,this")+"\nthat,2,a,1,this,2\n");
		System.out.println(wordFre.diviWord("that,that,a,this,this,a,this")+"\nthat,2,a,2,this,3\n");
		System.out.println(wordFre.diviWord("that,that,a,-this,this,a,this")+"\nthat,2,a,2,-this,1,this,2\n");
		System.out.println(wordFre.diviWord("a,aa,aaa,aaa,aa,aaa")+"\na,1,aa,2,aaa,3\n");
		System.out.println(wordFre.diviWord("a,-aa,-aaa,aaa,aa,aaa")+"\na,1,-aa,1,-aaa,1,aaa,2,aa,1\n");
		System.out.println(wordFre.diviWord("-a,that,a,this")+"\n-a,1,that,1,a,1,this,1\n");
		System.out.println(wordFre.diviWord("that,that,a,this,this,is-")+"\nthat,2,a,1,this,2,is-,1\n");
		System.out.println(wordFre.diviWord("that,that,that,a,this,this,a,this")+"\nthat,3,a,2,this,3\n");
		System.out.println(wordFre.diviWord("that,that,a,-this,this,a,-this")+"\nthat,2,a,2,-this,2,this,1\n");
	}

	@Test
	public void testSort() {
		//fail("Not yet implemented");
		System.out.println(wordFre.diviWord("that,2,a,1,this,1")+"\nthat,2,a,1,this,1\n");
		System.out.println(wordFre.diviWord("that,2,a,1,this,2")+"\nthat,2,this,2,a,1\n");
		System.out.println(wordFre.diviWord("that,2,a,2,this,3")+"\nthis,3,a,2,that,2\n");
		System.out.println(wordFre.diviWord("that,2,a,2,-this,1,this,2")+"\na,2,that,2,this,2,-this,1\n");
		System.out.println(wordFre.diviWord("a,1,aa,2,aaa,3")+"\naaa,3,aa,2,a,1\n");
		System.out.println(wordFre.diviWord("na,1,-aa,1,-aaa,1,aaa,2,aa,1")+"\naaa,2,a,1,aa,1,-aa,1,-aaa,1\n");
		System.out.println(wordFre.diviWord("-a,1,that,1,a,1,this,1")+"\na,1,that,1,this,1,-a,1\n");
		System.out.println(wordFre.diviWord("that,2,a,1,this,2,is-,1")+"\nthat,2,this,2,a,1,is-,1\n");
		System.out.println(wordFre.diviWord("that,3,a,2,this,3")+"\nthat,3,this,3,a,2\n");
		System.out.println(wordFre.diviWord("that,2,a,2,-this,2,this,1")+"\na,2,that,2,-this,2,this,1\n");
	}

}
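The tests above print the actual and expected values side by side for manual comparison. The following minimal sketch shows one way to make that comparison automatic. The class name ExpectationCheckSketch and the method matches are hypothetical, and the list in main is a hand-written stand-in for the value returned by wordFre.diviWord.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class for illustration; not part of wcPro.
final class ExpectationCheckSketch {
    // Joins the returned list with commas and compares it with the expected text.
    static boolean matches(List<String> actual, String expectedCsv) {
        return String.join(",", actual).equals(expectedCsv);
    }

    public static void main(String[] args) {
        // Stand-in for the value returned by wordFre.diviWord("that,that,a,this").
        List<String> actual = Arrays.asList("that", "2", "a", "1", "this", "1");
        System.out.println(matches(actual, "that,2,a,1,this,1")); // prints true
        System.out.println(matches(actual, "that,2,this,1,a,1")); // prints false
    }
}
```

In a JUnit test, the joined string and the expected text could be passed to assertEquals so that a mismatch fails the test instead of only being printed.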

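Test run output. Each line shows the list actually returned, followed by the expected value. The first ten lines come from testSort and the last ten from testDiviWord. Because testSort as written calls wordFre.diviWord rather than wordFre.sort, the digits in its inputs are discarded as non-letters and every count is 1, so those lines do not match their expected values.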

[that, 1, a, 1, this, 1]    that,2,a,1,this,1

[that, 1, a, 1, this, 1]    that,2,this,2,a,1

[that, 1, a, 1, this, 1]    this,3,a,2,that,2

[that, 1, a, 1, -this, 1, this, 1]    a,2,that,2,this,2,-this,1

[a, 1, aa, 1, aaa, 1]    aaa,3,aa,2,a,1

[na, 1, -aa, 1, -aaa, 1, aaa, 1, aa, 1]    aaa,2,a,1,aa,1,-aa,1,-aaa,1

[-a, 1, that, 1, a, 1, this, 1]    a,1,that,1,this,1,-a,1

[that, 1, a, 1, this, 1, is-, 1]    that,2,this,2,a,1,is-,1

[that, 1, a, 1, this, 1]    that,3,this,3,a,2

[that, 1, a, 1, -this, 1, this, 1]    a,2,that,2,-this,2,this,1

[that, 2, a, 1, this, 1]    that,2,a,1,this,1

[that, 2, a, 1, this, 2]    that,2,a,1,this,2

[that, 2, a, 2, this, 3]    that,2,a,2,this,3

[that, 2, a, 2, -this, 1, this, 2]    that,2,a,2,-this,1,this,2

[a, 1, aa, 2, aaa, 3]    a,1,aa,2,aaa,3

[a, 1, -aa, 1, -aaa, 1, aaa, 2, aa, 1]    a,1,-aa,1,-aaa,1,aaa,2,aa,1

[-a, 1, that, 1, a, 1, this, 1]    -a,1,that,1,a,1,this,1

[that, 2, a, 1, this, 2, is-, 1]    that,2,a,1,this,2,is-,1

[that, 3, a, 2, this, 3]    that,3,a,2,this,3

[that, 2, a, 2, -this, 2, this, 1]    that,2,a,2,-this,2,this,1

posted @ 2018-04-08 12:49 星夜墨痕