Writing a MapReduce Program in Eclipse to Find the Maximum Temperature
First, I wrote a Java program that generates a txt file of temperature data.
import java.io.*;
import java.util.Random;

// Generates a sample data file: one "sensorId temperature" pair per line.
public class CreateFile {
    public static void main(String[] args) {
        Random ran = new Random();
        long count = 10000; // number of records to generate
        BufferedWriter bw = null;
        try {
            bw = new BufferedWriter(new FileWriter(new File("sensor.txt")));
            for (long i = 0; i < count; i++) {
                int id = ran.nextInt(1000);           // sensor id in [0, 999]
                double temp = ran.nextDouble() * 100; // temperature in [0, 100)
                bw.write(id + " " + temp + "\n");
            }
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            // Close in finally so the file handle is released even on error.
            if (bw != null) {
                try { bw.close(); } catch (IOException e) { e.printStackTrace(); }
            }
        }
    }
}
A file generated this way is small, which is enough for a demonstration. Earlier I had generated a 2 GB txt file, and the machine froze when I tried to open it. To verify that the generated file contains the right number of records, I also wrote a program that counts its lines.
import java.io.*;

// Counts the lines in sensor.txt to verify the number of generated records.
public class CalcLineNum {
    public static void main(String[] args) {
        long count = 0;
        BufferedReader br = null;
        try {
            br = new BufferedReader(new FileReader(new File("sensor.txt")));
            while (br.readLine() != null) {
                count++;
            }
            System.out.println(count); // print the total once, after the whole file is read
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            if (br != null) {
                try { br.close(); } catch (IOException e) { e.printStackTrace(); }
            }
        }
    }
}
With that, the sample file should be fine. Next comes the MapReduce program, written in Eclipse.
package com.xioyaozi;

import java.io.IOException;
import java.util.*;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;
import org.apache.hadoop.util.*;

public class MaxTemp {

    // Mapper: parses "id temperature" lines and emits (id, temperature) pairs.
    public static class Map extends MapReduceBase
            implements Mapper<LongWritable, Text, IntWritable, DoubleWritable> {
        public void map(LongWritable key, Text value,
                        OutputCollector<IntWritable, DoubleWritable> output,
                        Reporter reporter) throws IOException {
            String line = value.toString();
            String[] str = line.split(" ");
            int id = Integer.parseInt(str[0]);
            double temp = Double.parseDouble(str[1]);
            // Only emit records within the ranges the generator produces:
            // ids in [0, 1000) and temperatures in [0, 100).
            if (id >= 0 && id < 1000 && temp >= 0 && temp < 100)
                output.collect(new IntWritable(id), new DoubleWritable(temp));
        }
    }

    // Reducer: keeps the maximum temperature seen for each sensor id.
    public static class Reduce extends MapReduceBase
            implements Reducer<IntWritable, DoubleWritable, IntWritable, DoubleWritable> {
        public void reduce(IntWritable key, Iterator<DoubleWritable> values,
                           OutputCollector<IntWritable, DoubleWritable> output,
                           Reporter reporter) throws IOException {
            // Start from negative infinity so the result stays correct even
            // if negative temperatures ever appear in the input.
            double maxTemp = Double.NEGATIVE_INFINITY;
            while (values.hasNext()) {
                maxTemp = Math.max(maxTemp, values.next().get());
            }
            output.collect(key, new DoubleWritable(maxTemp));
        }
    }

    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(MaxTemp.class);
        conf.setJobName("maxTemp");

        conf.setOutputKeyClass(IntWritable.class);
        conf.setOutputValueClass(DoubleWritable.class);

        conf.setMapperClass(Map.class);
        // conf.setCombinerClass(Reduce.class); // max is associative and commutative,
        //                                      // so the reducer could double as a combiner
        conf.setReducerClass(Reduce.class);

        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);

        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        JobClient.runJob(conf);
    }
}
The output file is correct: TextOutputFormat writes each sensor id and its maximum temperature as a tab-separated pair. At this point, the basics of MapReduce are in hand.
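For extra confidence, the job's result can be cross-checked against a single-machine computation over sensor.txt. Here is a minimal plain-Java sketch in the same spirit as the line counter above (the class name LocalMaxTemp is mine, for illustration only):

import java.io.*;
import java.util.HashMap;
import java.util.Map;

// Single-machine equivalent of the MapReduce job: read "id temperature"
// lines and keep the maximum temperature seen for each sensor id.
public class LocalMaxTemp {
    public static void main(String[] args) throws IOException {
        HashMap<Integer, Double> max = new HashMap<Integer, Double>();
        BufferedReader br = new BufferedReader(new FileReader("sensor.txt"));
        String line;
        while ((line = br.readLine()) != null) {
            String[] parts = line.split(" ");
            int id = Integer.parseInt(parts[0]);
            double temp = Double.parseDouble(parts[1]);
            Double cur = max.get(id);
            if (cur == null || temp > cur) {
                max.put(id, temp); // new maximum for this sensor id
            }
        }
        br.close();
        // Print in the same "id<TAB>maxTemp" shape as TextOutputFormat.
        for (Map.Entry<Integer, Double> e : max.entrySet()) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}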
Note: when running in Eclipse, remember to set the run arguments (the input and output paths), and delete the output directory before running the job a second time, because Hadoop refuses to write to an output path that already exists.
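Rather than deleting the output directory by hand before each rerun, the driver can remove it programmatically. A minimal sketch, assuming the same old mapred API as above; deleteIfExists is a hypothetical helper that main() would call before JobClient.runJob(conf):

import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.JobConf;

// Sketch: remove a stale output directory before submitting the job, so
// repeated runs from Eclipse don't fail with "output path already exists".
public class OutputCleaner {
    public static void deleteIfExists(JobConf conf, String dir) throws IOException {
        Path out = new Path(dir);
        FileSystem fs = FileSystem.get(conf); // file system the job output goes to
        if (fs.exists(out)) {
            fs.delete(out, true); // true = delete recursively
        }
    }
}

Calling OutputCleaner.deleteIfExists(conf, args[1]) at the top of main() makes the job rerunnable without manual cleanup.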