MapReduce Example: Finding the Missing Poker Cards

Problem:

A deck of poker cards, stored one card per line, is missing some of its face cards (J, Q, K); find which suits the missing cards belong to.

Solution:

The solution has two phases. The Map phase drops every card whose value is <= 10 and classifies only the cards greater than 10, emitting each one keyed by its suit. The Reduce phase counts the key-value pairs received from the Map phase for each suit and outputs the suits that have fewer than 3 such cards, i.e. the suits missing at least one of J, Q, K.
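As an illustration of the record format the Map code below splits on, each input line is assumed to look like Suit-Number; the suit names and the particular missing card in this fragment are purely hypothetical:

Spade-9
Spade-10
Spade-11
Spade-13
Heart-11
Heart-12
Heart-13

In this fragment Spade-12 (the queen) is absent, so after the cards <= 10 are filtered out the Spade key reaches the Reducer with only 2 values and Spade is reported as incomplete.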

 

1. Code

1) Map code

String line = value.toString();
String[] strs = line.split("-");
if (strs.length == 2) {
    int number = Integer.valueOf(strs[1]);
    if (number > 10) {
        context.write(new Text(strs[0]), value);
    }
}
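For completeness, here is a minimal sketch of the Mapper class these lines would sit in, assuming the pakerMapper class name referenced in the Runner code below and the standard org.apache.hadoop.mapreduce API:

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Sketch only: the class name follows the pakerMapper reference in the Runner code below.
public class pakerMapper extends Mapper<LongWritable, Text, Text, Text> {

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Each input line is expected to look like "Suit-Number".
        String line = value.toString();
        String[] strs = line.split("-");
        if (strs.length == 2) {
            int number = Integer.valueOf(strs[1]);
            // Only cards above 10 (J, Q, K) are of interest.
            if (number > 10) {
                // Emit the suit as the key and the whole record as the value.
                context.write(new Text(strs[0]), value);
            }
        }
    }
}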

 

 

2) Reduce code

Iterator<Text> iter = values.iterator();
int count = 0;
while (iter.hasNext()) {
    iter.next();
    count++;
}
if (count < 3) {
    context.write(key, NullWritable.get());
}
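Similarly, a minimal sketch of the Reducer class around these lines, assuming the pakerRedue class name used by the Runner:

import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Sketch only: the class name follows the pakerRedue reference in the Runner code below.
public class pakerRedue extends Reducer<Text, Text, Text, NullWritable> {

    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        // Count how many cards above 10 arrived for this suit.
        Iterator<Text> iter = values.iterator();
        int count = 0;
        while (iter.hasNext()) {
            iter.next();
            count++;
        }
        // A complete suit contributes exactly 3 cards above 10 (J, Q, K);
        // fewer means at least one of them is missing.
        if (count < 3) {
            context.write(key, NullWritable.get());
        }
    }
}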

 

 

3) Runner code

Configuration conf = new Configuration();
Job job = Job.getInstance(conf);
job.setJobName("poker mr");
job.setJarByClass(pokerRunner.class);

job.setMapperClass(pakerMapper.class);
job.setReducerClass(pakerRedue.class);

job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(Text.class);

job.setOutputKeyClass(Text.class);
job.setOutputValueClass(NullWritable.class);

FileInputFormat.addInputPath(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));

job.waitForCompletion(true);
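Putting these lines into a runnable driver class gives roughly the following sketch; the main() wrapper and exit handling are assumptions, while the class names follow the setJarByClass, setMapperClass, and setReducerClass calls above:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Sketch only: wraps the job configuration shown above in a main() entry point.
public class pokerRunner {

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf);
        job.setJobName("poker mr");
        job.setJarByClass(pokerRunner.class);

        job.setMapperClass(pakerMapper.class);
        job.setReducerClass(pakerRedue.class);

        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(Text.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(NullWritable.class);

        // args[0] = input path, args[1] = output path
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}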

 

2. Run results

File System Counters

      FILE: Number of bytes read=87

      FILE: Number of bytes written=211167

      FILE: Number of read operations=0

      FILE: Number of large read operations=0

      FILE: Number of write operations=0

      HDFS: Number of bytes read=366

      HDFS: Number of bytes written=6

      HDFS: Number of read operations=6

      HDFS: Number of large read operations=0

      HDFS: Number of write operations=2

   Job Counters

      Launched map tasks=1

      Launched reduce tasks=1

      Data-local map tasks=1

      Total time spent by all maps in occupied slots (ms)=109577

      Total time spent by all reduces in occupied slots (ms)=42668

      Total time spent by all map tasks (ms)=109577

      Total time spent by all reduce tasks (ms)=42668

      Total vcore-seconds taken by all map tasks=109577

      Total vcore-seconds taken by all reduce tasks=42668

      Total megabyte-seconds taken by all map tasks=112206848

      Total megabyte-seconds taken by all reduce tasks=43692032

   Map-Reduce Framework

      Map input records=49

      Map output records=9

      Map output bytes=63

      Map output materialized bytes=87

      Input split bytes=110

      Combine input records=0

      Combine output records=0

      Reduce input groups=4

      Reduce shuffle bytes=87

      Reduce input records=9

      Reduce output records=3

      Spilled Records=18

      Shuffled Maps =1

      Failed Shuffles=0

      Merged Map outputs=1

      GC time elapsed (ms)=992

      CPU time spent (ms)=3150

      Physical memory (bytes) snapshot=210063360

      Virtual memory (bytes) snapshot=652480512

      Total committed heap usage (bytes)=129871872

   Shuffle Errors

      BAD_ID=0

      CONNECTION=0

      IO_ERROR=0

      WRONG_LENGTH=0

      WRONG_MAP=0

      WRONG_REDUCE=0

   File Input Format Counters

      Bytes Read=256

   File Output Format Counters

      Bytes Written=6

3. How to run

Build the project in Eclipse and export it as a jar, then upload the jar to the Linux machine and run it on the cluster.

Run command: bin/hadoop jar **.jar <fully qualified class name> <arguments>

For example: bin/hadoop jar **.jar com.test.mr /
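Since the Runner reads the input path from args[0] and the output path from args[1], a complete invocation would look roughly like the following, where the jar name, class name, and paths are only placeholders:

bin/hadoop jar poker.jar com.test.mr.pokerRunner /poker/input /poker/output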

 
