Abstract
Storing and retrieving massive traffic trajectory data that carry spatio-temporal attributes is of great practical significance. To address the spatio-temporal nature, lack of ordering, and high sampling rate of traffic trajectory data, a method is proposed for constructing data items by clustering on spatio-temporal distance. In addition, because the data are spatio-temporal and the high node overlap rate of the traditional R-tree slows retrieval, an R-tree construction method is proposed that adds a time dimension and is based on an improved hierarchical clustering algorithm. This addresses the retrieval inefficiency that traditional methods suffer from because of excessive tree height and a high rate of node overlap. Experimental results show that the R-tree built with this construction method outperforms the traditional method in retrieval efficiency.
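The two ideas named in the abstract, a spatio-temporal distance for clustering and an R-tree node extended with a time dimension, can be illustrated with a minimal sketch. The abstract does not give the actual distance formula or node layout, so everything here is an assumption made for illustration: the weighted Euclidean form of `st_distance`, the `w_time` scale factor, and the `Point` and `Box3D` types are all hypothetical names, not the authors' definitions.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class Point:
    """A trajectory sample: planar position (x, y) and timestamp t."""
    x: float
    y: float
    t: float


def st_distance(a: Point, b: Point, w_time: float = 1.0) -> float:
    """Assumed spatio-temporal distance: Euclidean distance in (x, y, t),
    with the time axis scaled by the hypothetical weight w_time."""
    return sqrt((a.x - b.x) ** 2 + (a.y - b.y) ** 2
                + (w_time * (a.t - b.t)) ** 2)


@dataclass(frozen=True)
class Box3D:
    """Minimum bounding box over (x, y, t), i.e. an R-tree node
    rectangle with an added time dimension."""
    lo: tuple
    hi: tuple

    @staticmethod
    def of(points) -> "Box3D":
        """Build the tightest box enclosing the given points."""
        pts = list(points)
        lo = (min(p.x for p in pts), min(p.y for p in pts), min(p.t for p in pts))
        hi = (max(p.x for p in pts), max(p.y for p in pts), max(p.t for p in pts))
        return Box3D(lo, hi)

    def overlap_volume(self, other: "Box3D") -> float:
        """Volume shared by two boxes; 0.0 when they are disjoint."""
        volume = 1.0
        for d in range(3):
            side = min(self.hi[d], other.hi[d]) - max(self.lo[d], other.lo[d])
            if side <= 0:
                return 0.0
            volume *= side
        return volume
```

In this sketch, a lower total `overlap_volume` between sibling boxes corresponds to the lower node overlap that the abstract associates with faster retrieval; grouping points that are close under `st_distance` is one plausible way to obtain such boxes.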