wcPro

1. GitHub repository:

https://github.com/huangjianmin/wcPro/

2. Team division of labor:

  黄健民: core functions (word splitting, word frequency statistics); 姚金港: input control; 孙智超: output control; 梅智超: main function

3. PSP table:

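PSP2.1 table

| PSP2.1 | PSP stage | Estimated time (min) | Actual time (min) |
|---|---|---|---|
| Planning | Planning | 10 | 10 |
| · Estimate | · Estimate how long the task will take | 5 | 5 |
| Development | Development | 20 | 15 |
| · Analysis | · Requirements analysis (including learning new technologies) | 30 | 20 |
| · Design Spec | · Write the design document | 20 | 0 |
| · Design Review | · Design review (review the design document with colleagues) | 20 | 0 |
| · Coding Standard | · Coding standard (define suitable conventions for the current development) | 15 | 15 |
| · Design | · Detailed design | 60 | 60 |
| · Coding | · Coding | 480 | 360 |
| · Code Review | · Code review | 60 | 30 |
| · Test | · Testing (self-testing, fixing code, committing changes) | 60 | 40 |
| Reporting | Reporting | 30 | 40 |
| · Test Report | · Test report | 30 | 20 |
| · Size Measurement | · Measure the workload | 5 | 5 |
| · Postmortem & Process Improvement Plan | · Postmortem and process improvement plan | 20 | 0 |
| | Total | 865 | 620 |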

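4. Interface implementation:

4.1 Word splitting:

    This feature has two parts: the first strips non-letter characters and splits the text into words; the second counts how many times each distinct word occurs.

      Implementation of the first part: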

		int curPosition = 0;
		ArrayList<String> words = new ArrayList<String>();  // the words split out of targetStr
		while (curPosition < targetStr.length())
		{
			// Skip every character that is neither a letter nor '-'
			while (curPosition < targetStr.length() && !(targetStr.charAt(curPosition) >= 'a' && targetStr.charAt(curPosition) <= 'z' || targetStr.charAt(curPosition) >= 'A' && targetStr.charAt(curPosition) <= 'Z' || targetStr.charAt(curPosition) == '-'))
			{
				curPosition++;
			}
			boolean isWord = false;
			String thisWord = "";
			// Collect consecutive letters and '-' into the current word
			while (curPosition < targetStr.length() && (targetStr.charAt(curPosition) >= 'a' && targetStr.charAt(curPosition) <= 'z' || targetStr.charAt(curPosition) >= 'A' && targetStr.charAt(curPosition) <= 'Z' || targetStr.charAt(curPosition) == '-'))
			{
				isWord = true;
				thisWord += targetStr.charAt(curPosition);
				curPosition++;
			}
			if (isWord)  // only add the word to words when isWord is true
			{
				words.add(thisWord);
			}
			else
			{
				thisWord = "";
			}
		}

 

 

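     When a non-letter character is encountered, the index is simply advanced by one to skip it. To split out a word, an empty string thisWord is prepared first; while the character at the index is a letter or a hyphen, that character is appended to thisWord and the index advances by one. As soon as the index reaches a character that is neither a letter nor a hyphen, the word is complete and the contents of thisWord are added to the words list; thisWord starts out empty again on the next pass. Repeating this splits the whole text into words.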
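The scan described above can also be written with the character test pulled out into a helper, which makes the two inner loops easier to read. This is only a minimal sketch: the class name `WordSplitSketch` and the methods `isWordChar` and `splitWords` are hypothetical names chosen for illustration and are not part of wcPro.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class for illustration; not taken from the wcPro source.
final class WordSplitSketch {

    // A character belongs to a word if it is an ASCII letter or a hyphen,
    // the same rule the original loop conditions spell out inline.
    static boolean isWordChar(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '-';
    }

    // Splits the text into maximal runs of word characters.
    static List<String> splitWords(String text) {
        List<String> words = new ArrayList<>();
        int pos = 0;
        while (pos < text.length()) {
            // Skip separators (anything that is not a letter or '-').
            while (pos < text.length() && !isWordChar(text.charAt(pos))) {
                pos++;
            }
            // Consume one word and remember where it started.
            int start = pos;
            while (pos < text.length() && isWordChar(text.charAt(pos))) {
                pos++;
            }
            if (pos > start) {
                words.add(text.substring(start, pos));
            }
        }
        return words;
    }

    public static void main(String[] args) {
        // prints [-a, that, s, is-]
        System.out.println(splitWords("-a, that's 2 is-"));
    }
}
```

Taking a substring once per word also avoids building each word through repeated string concatenation.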

    Implementation of the second part:

		int size = words.size();
		// wordNum holds [word, count, word, count, ...]
		ArrayList<String> wordNum = new ArrayList<String>();
		for (int i = 0; i < size; i++) {
			String aa = words.get(i);
			wordNum.add(2 * i, aa);       // even index: the word
			wordNum.add(2 * i + 1, "1");  // odd index: its count, initially 1
		}
		for (int i = 0; i < size; i++) {
			for (int j = i + 1; j < size; j++)
				if (wordNum.get(2 * i).equals(wordNum.get(2 * j)))
				{
					// Duplicate found: increment the count of the first occurrence
					int a = Integer.parseInt(wordNum.get(2 * i + 1)) + 1;
					wordNum.set(2 * i + 1, Integer.toString(a));
					// Remove the duplicate's count, then the duplicate word itself
					wordNum.remove(2 * j + 1);
					wordNum.remove(2 * j);
					j--;        // re-check the pair that moved into position j
					size -= 1;  // one fewer word/count pair remains
				}
		}
		return wordNum;
	}
         

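     How this code works: every word in words is stored at an even index of wordNum, and the odd index right after it holds that word's count, initially 1. The loop starts from the word at the first even index and compares it with every later word in turn. Whenever a later word is identical, the count after the first occurrence is incremented by 1 and the later word and its count are deleted, so the list's size() drops by 2. The loop continues until every word in wordNum has been scanned once.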
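The pairwise comparison above takes time quadratic in the number of words. As a point of comparison, the same flat [word, count, word, count, ...] list can be produced in one pass with a `LinkedHashMap`, which remembers insertion order. This is a swapped-in technique, not the approach used in wcPro, and `WordCountSketch` and `countWords` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical alternative for illustration; wcPro itself uses the
// nested-loop approach shown in the section above.
final class WordCountSketch {

    // Returns [word1, count1, word2, count2, ...] with words in the order
    // they first appear, mirroring the layout of wordNum.
    static List<String> countWords(List<String> words) {
        // LinkedHashMap keeps keys in first-insertion order.
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String word : words) {
            // Insert 1 for a new word, otherwise add 1 to the stored count.
            counts.merge(word, 1, Integer::sum);
        }
        List<String> flat = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            flat.add(entry.getKey());
            flat.add(String.valueOf(entry.getValue()));
        }
        return flat;
    }
}
```

For the input words that, that, a, this, the sketch returns the list [that, 2, a, 1, this, 1].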

4.2 Word frequency statistics:

    The code is as follows:

// Requires java.util.ArrayList, Collections, Comparator, HashMap, Map and TreeMap
public static ArrayList<String> sort(ArrayList<String> str){
		// Step 1: read the flat [word, count, word, count, ...] list into a map
		HashMap<String, Integer> map = new HashMap<>();
		int i = 0;
		while (i < str.size() - 1) {
			map.put(str.get(i), Integer.parseInt(str.get(++i)));
			i++;
		}
		// Step 2: sort
		// Order by key first
		TreeMap<String, Integer> treemap = new TreeMap<>(map);
		// Then order by value
		ArrayList<Map.Entry<String, Integer>> list = new ArrayList<Map.Entry<String, Integer>>(
				treemap.entrySet());
		Collections.sort(list, new Comparator<Map.Entry<String, Integer>>() {
			public int compare(Map.Entry<String, Integer> o1,
					Map.Entry<String, Integer> o2) {
				return Integer.compare(o2.getValue(), o1.getValue()); // descending
				// ascending: Integer.compare(o1.getValue(), o2.getValue())
			}
		});
		i = 0;
		for (Map.Entry<String, Integer> entry : list) {
			// Skip the empty string and a bare "-"
			if (!(entry.getKey().equals(""))
					&& !(entry.getKey().equals("-"))) {
				str.set(i, entry.getKey());
				str.set(++i, entry.getValue().toString());
				i++;
			}
		}
		// Drop stale trailing elements left behind when an entry was skipped
		str.subList(i, str.size()).clear();
		return str;
	}

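   How it works: the ArrayList is first converted to a HashMap, which is copied into a TreeMap so that the words are ordered by key. The entries are then sorted by count in descending order with Collections.sort, and the result is written back into the ArrayList for output. Because Collections.sort is stable, words with equal counts keep their key order.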
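The same ordering (count descending, ties in key order) can be sketched more compactly by reading the pairs straight into a `TreeMap` and building a new list instead of overwriting the input. This is a sketch under assumptions rather than the project's code: `FrequencySortSketch` and `sortByFrequency` are hypothetical names, and the input is assumed to be a well-formed [word, count, ...] list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch for illustration; not taken from the wcPro source.
final class FrequencySortSketch {

    // Takes [word, count, word, count, ...] and returns a new list in the
    // same layout, ordered by count descending; ties keep key order.
    static List<String> sortByFrequency(List<String> flat) {
        // TreeMap iterates its entries in ascending key order.
        Map<String, Integer> counts = new TreeMap<>();
        for (int i = 0; i + 1 < flat.size(); i += 2) {
            counts.put(flat.get(i), Integer.parseInt(flat.get(i + 1)));
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        // List.sort is stable, so equal counts stay in key order.
        entries.sort((a, b) -> Integer.compare(b.getValue(), a.getValue()));

        List<String> sorted = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : entries) {
            // Skip the empty string and a bare hyphen, as the original does.
            if (entry.getKey().isEmpty() || entry.getKey().equals("-")) {
                continue;
            }
            sorted.add(entry.getKey());
            sorted.add(entry.getValue().toString());
        }
        return sorted;
    }
}
```

Returning a fresh list means skipped entries cannot leave leftover elements at the end of the result.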

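 5. Test cases:

  The tests cover the word splitting and counting feature and the word frequency feature, considering the following aspects:

  a. When the input contains non-letter symbols, does word counting strip them?

  b. Is the count for each word correct?

  c. Is the word frequency ordering correct?

  d. Is the hyphen '-' handled when it appears at the very start?

  Twenty test cases were designed to cover these four aspects.

  Test code (each call prints the actual return value followed by the expected value, for manual comparison):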

public class wordFreTest {

	@Test
	public void testDiviWord() {
		//fail("Not yet implemented");
		System.out.println(wordFre.diviWord("that,that,a,this")+"\nthat,2,a,1,this,1\n");
		System.out.println(wordFre.diviWord("that,that,a,this,this")+"\nthat,2,a,1,this,2\n");
		System.out.println(wordFre.diviWord("that,that,a,this,this,a,this")+"\nthat,2,a,2,this,3\n");
		System.out.println(wordFre.diviWord("that,that,a,-this,this,a,this")+"\nthat,2,a,2,-this,1,this,2\n");
		System.out.println(wordFre.diviWord("a,aa,aaa,aaa,aa,aaa")+"\na,1,aa,2,aaa,3\n");
		System.out.println(wordFre.diviWord("a,-aa,-aaa,aaa,aa,aaa")+"\na,1,-aa,1,-aaa,1,aaa,2,aa,1\n");
		System.out.println(wordFre.diviWord("-a,that,a,this")+"\n-a,1,that,1,a,1,this,1\n");
		System.out.println(wordFre.diviWord("that,that,a,this,this,is-")+"\nthat,2,a,1,this,2,is-,1\n");
		System.out.println(wordFre.diviWord("that,that,that,a,this,this,a,this")+"\nthat,3,a,2,this,3\n");
		System.out.println(wordFre.diviWord("that,that,a,-this,this,a,-this")+"\nthat,2,a,2,-this,2,this,1\n");
	}

	@Test
	public void testSort() {
		//fail("Not yet implemented");
		System.out.println(wordFre.diviWord("that,2,a,1,this,1")+"\nthat,2,a,1,this,1\n");
		System.out.println(wordFre.diviWord("that,2,a,1,this,2")+"\nthat,2,this,2,a,1\n");
		System.out.println(wordFre.diviWord("that,2,a,2,this,3")+"\nthis,3,a,2,that,2\n");
		System.out.println(wordFre.diviWord("that,2,a,2,-this,1,this,2")+"\na,2,that,2,this,2,-this,1\n");
		System.out.println(wordFre.diviWord("a,1,aa,2,aaa,3")+"\naaa,3,aa,2,a,1\n");
		System.out.println(wordFre.diviWord("na,1,-aa,1,-aaa,1,aaa,2,aa,1")+"\naaa,2,a,1,aa,1,-aa,1,-aaa,1\n");
		System.out.println(wordFre.diviWord("-a,1,that,1,a,1,this,1")+"\na,1,that,1,this,1,-a,1\n");
		System.out.println(wordFre.diviWord("that,2,a,1,this,2,is-,1")+"\nthat,2,this,2,a,1,is-,1\n");
		System.out.println(wordFre.diviWord("that,3,a,2,this,3")+"\nthat,3,this,3,a,2\n");
		System.out.println(wordFre.diviWord("that,2,a,2,-this,2,this,1")+"\na,2,that,2,-this,2,this,1\n");
	}

}

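   Test run output (actual return value on the left, expected value on the right). The first ten lines come from testSort, which as written calls wordFre.diviWord rather than sort, so the digits are dropped as non-letters and every count is 1: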

    

[that, 1, a, 1, this, 1]    that,2,a,1,this,1

[that, 1, a, 1, this, 1]    that,2,this,2,a,1

[that, 1, a, 1, this, 1]    this,3,a,2,that,2

[that, 1, a, 1, -this, 1, this, 1]    a,2,that,2,this,2,-this,1

[a, 1, aa, 1, aaa, 1]    aaa,3,aa,2,a,1

[na, 1, -aa, 1, -aaa, 1, aaa, 1, aa, 1]    aaa,2,a,1,aa,1,-aa,1,-aaa,1

[-a, 1, that, 1, a, 1, this, 1]    a,1,that,1,this,1,-a,1

[that, 1, a, 1, this, 1, is-, 1]    that,2,this,2,a,1,is-,1

[that, 1, a, 1, this, 1]    that,3,this,3,a,2

[that, 1, a, 1, -this, 1, this, 1]    a,2,that,2,-this,2,this,1

[that, 2, a, 1, this, 1]    that,2,a,1,this,1

[that, 2, a, 1, this, 2]    that,2,a,1,this,2

[that, 2, a, 2, this, 3]    that,2,a,2,this,3

[that, 2, a, 2, -this, 1, this, 2]    that,2,a,2,-this,1,this,2

[a, 1, aa, 2, aaa, 3]    a,1,aa,2,aaa,3

[a, 1, -aa, 1, -aaa, 1, aaa, 2, aa, 1]    a,1,-aa,1,-aaa,1,aaa,2,aa,1

[-a, 1, that, 1, a, 1, this, 1]    -a,1,that,1,a,1,this,1

[that, 2, a, 1, this, 2, is-, 1]    that,2,a,1,this,2,is-,1

[that, 3, a, 2, this, 3]    that,3,a,2,this,3

[that, 2, a, 2, -this, 2, this, 1]    that,2,a,2,-this,2,this,1

 

posted @ 2018-04-08 12:49  星夜墨痕