map matching数据库操作系列

说明:这是在进行map matching时,对数据库操作

需求:1)目前设计了图1~2 所示的TABLE结构,需要将point_cfm中的各字段填满,并且trace_cfm的trace_id, point_id[]需要填上

   

                      图1

        

                            图2

步骤:

  1、数据预处理:gps数据原始格式如下:

     

        为了达到把gps点分成一条条轨迹的目的,即trace_cfm需要知道哪些点是属于某条轨迹的,现在处理一下这些数据。

  生成一列数据作为trace id,trace id 的生成包含了3部分内容:a、taxi id b、一年的第多少天 c、一天的哪一个时段。python代码如下:

 1 import time
 2 import string
 3 
 4 def segtime( strtime ):
 5     """
 6         param: a time string
 7         ret  : a integer composed by two parts, the first is the day of in
 8                that year(1~366) and the last is the 0~47
 9         author: shaohui zhu
10     """
11     struct_time = time.strptime( strtime,"%Y-%m-%d %H:%M:%S")
12     hour = struct_time.tm_hour
13     minute = struct_time.tm_min
14     dayofyear = struct_time.tm_yday
15     if minute >= 30:
16         ret = hour*2 + 1
17     else:
18         ret = hour*2
19     ret = dayofyear*100 + ret
20     return ret 
21 
22 #segtime("2015-07-08 23:30:59")
23 
24 f = open('366.txt','r')
25 fd = open('366ok.txt','w')
26 lines = f.readlines()
27 for line in lines:
28     comma_err = 0
29     comma1 = line.find(',')
30     if -1 != comma1 :
31         comma2 = line.find(',', comma1+1)
32         if -1 != comma2 :
33             comma3 = line.find(',', comma2+1)
34             if -1 != comma3 :
35                 comma4 = line.find(',', comma3+1)
36                 if -1 != comma4:
37                     comma_err = 1
38             else:
39                 comma_err = 1
40         else:
41             comma_err = 1
42     else:
43         comma_err = 1
44 
45     if comma_err == 0:
46         seg = segtime( line[4:23] )
47         itrace = 366 * 100000 + seg
48         enterindex = line.find('\n')
49         if enterindex != -1:
50             line = line[0:enterindex-1]
51         newline = line + ',' + "%d"%itrace +'\n'
52         print newline
53         fd.write( newline )

  即完成了数据预处理过程,结果如下:

  

 

  2、要在导入gps数据到point_cfm时,在trace_cfm中填写trace_id字段,创建一个触发器来完成,代码如下:

create or replace function get_point() returns trigger as 
$BODY$
	DECLARE
		rec record;
		oldrec record;
	BEGIN
		FOR rec IN (SELECT trace_id FROM point_cfm) LOOP
			SELECT INTO oldrec * FROM trace_cfm WHERE trace_id = rec.trace_id;
			IF NOT FOUND THEN
				INSERT INTO trace_cfm VALUES(rec.trace_id);
			END IF;
		END LOOP;
		return NEW;
	END;
$BODY$ LANGUAGE plpgsql;



create trigger tri_get_point 
AFTER insert or update or delete
on point_cfm 
for each row
execute procedure get_point()

  按照另一篇文章中的方法导入gps数据到point_cfm中

  

  3、填写trace_cfm中的point_id[],pl代码如下:

CREATE OR REPLACE FUNCTION write_point() RETURNS void
AS $BODY$
DECLARE
	id bigint;
BEGIN
	FOR id IN (SELECT trace_id FROM trace_cfm) LOOP

		DECLARE
			points bigint[];
			p bigint;
			i integer;
		BEGIN
			i := 0;
			FOR p IN (SELECT point_pk FROM point_cfm WHERE trace_id = id) LOOP
				points[i] := p;
				i := i+1;
			END LOOP;
			--SELECT sort_asc(points);
			UPDATE trace_cfm SET point_id = points WHERE trace_id = id;
		END;

	END LOOP;
END;
$BODY$ LANGUAGE plpgsql;

SELECT write_point();

  

  上述步骤即完成。

  4、注意:

    1)导入的point必须是按照时间有序的,一般来说应该是的,但要注意

    2)本来我计划在把points填入point_id[]时,首先对points进行排序(利用intarray的sort_asc函数),但是该函数只对integer数据有效,所以就没有进行。

    但是,trace一定是严格时间序的,在每次操作时,注意检查一下。

posted @ 2015-07-08 20:40  iken  阅读(632)  评论(0)    收藏  举报