A scalable and fast OPTICS for clustering trajectory big data
  • Authors: Ze Deng ; Yangyang Hu ; Mao Zhu ; Xiaohui Huang ; Bo Du
  • Keywords: Trajectory big data ; Clustering ; Big data computing ; GPGPU
  • Journal: Cluster Computing
  • Publication year: 2015
  • Publication date: June 2015
  • Volume: 18
  • Issue: 2
  • Pages: 549-562
  • Full-text size: 1,568 KB
  • Author affiliations: Ze Deng (1)
    Yangyang Hu (1)
    Mao Zhu (1)
    Xiaohui Huang (1)
    Bo Du (1)

    1. School of Computer Science, China University of Geosciences (Wuhan), Wuhan, Hubei, China
  • Journal category: Computer Science
  • Journal subjects: Processor Architectures
    Operating Systems
    Computer Communication Networks
  • Publisher: Springer Netherlands
  • ISSN:1573-7543
Abstract
Clustering trajectory data is an important way to mine the information hidden in sampled moving-object data, such as trends in movement patterns and popular locations in geographic information. In the era of 'big data', however, current approaches to clustering trajectory data generally do not apply, because their costs in both scalability and computing performance are excessive for trajectory big data. To address these problems, this study first proposes a new clustering algorithm for trajectory big data, Tra-POPTICS, obtained by modifying POPTICS, a scalable clustering algorithm for point data. Tra-POPTICS employs a spatiotemporal distance function and trajectory indexing to support trajectory data, and it can process trajectory big data in a distributed manner to achieve high scalability. To provide a fast solution for clustering trajectory big data, this study also explores the feasibility of using general-purpose computing on graphics processing units (GPGPU). The GPGPU-aided clustering approach, G-Tra-POPTICS, parallelizes Tra-POPTICS using the Hyper-Q feature of Kepler GPUs and massive GPU threads. The experimental results indicate that (1) Tra-POPTICS has a clustering quality comparable to that of T-OPTICS (the state-of-the-art work on clustering trajectories in a centralized fashion) and outperforms T-OPTICS by a factor of four on average in terms of scalability, and (2) G-Tra-POPTICS has a clustering quality comparable to that of T-POPTICS as well and further gains a speedup of about 30 times on average for clustering trajectories compared with Tra-POPTICS running with eight threads. The proposed algorithms exhibit high scalability and computing performance in clustering trajectory big data.
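
As a rough illustration of two ingredients the abstract names, the sketch below pairs a spatiotemporal distance between trajectories with the core-distance test used by OPTICS-style algorithms. The distance shown (mean Euclidean distance over the common time span, with linear interpolation between samples) is an assumption chosen for simplicity and is not necessarily the distance function defined in the paper. The names `interpolate`, `st_distance` and `core_distance` are hypothetical, and no indexing, distribution or GPU parallelism is attempted.

```python
import math


def interpolate(traj, t):
    """Position at time t on traj, a list of (t, x, y) with strictly increasing t."""
    if t <= traj[0][0]:
        return traj[0][1], traj[0][2]
    if t >= traj[-1][0]:
        return traj[-1][1], traj[-1][2]
    for (t0, x0, y0), (t1, x1, y1) in zip(traj, traj[1:]):
        if t0 <= t <= t1:
            w = (t - t0) / (t1 - t0)
            return x0 + w * (x1 - x0), y0 + w * (y1 - y0)


def st_distance(a, b, steps=50):
    """Assumed distance: mean Euclidean gap sampled over the shared time span."""
    start = max(a[0][0], b[0][0])
    end = min(a[-1][0], b[-1][0])
    if start > end:
        return math.inf  # the trajectories never coexist in time
    total = 0.0
    for i in range(steps + 1):
        t = start + (end - start) * i / steps
        ax, ay = interpolate(a, t)
        bx, by = interpolate(b, t)
        total += math.hypot(ax - bx, ay - by)
    return total / (steps + 1)


def core_distance(idx, trajs, eps, min_pts):
    """Distance to the min_pts-th nearest trajectory within eps, or None.

    The trajectory itself (distance 0) counts as one of its own neighbours.
    """
    dists = sorted(st_distance(trajs[idx], other) for other in trajs)
    within = [d for d in dists if d <= eps]
    if len(within) < min_pts:
        return None  # not a core object for these parameters
    return within[min_pts - 1]


# Two trajectories moving in parallel 3 units apart, plus one far away.
a = [(0, 0.0, 0.0), (10, 10.0, 0.0)]
b = [(0, 0.0, 3.0), (10, 10.0, 3.0)]
c = [(0, 0.0, 100.0), (10, 10.0, 100.0)]
print(st_distance(a, b))
print(core_distance(0, [a, b, c], eps=5.0, min_pts=2))
print(core_distance(2, [a, b, c], eps=5.0, min_pts=2))
```

In this toy data, `a` and `b` are dense enough to be core objects for `eps=5.0`, while `c` has no neighbour within range. The all-pairs distance computation inside `core_distance` is the part that grows quadratically with the number of trajectories, which is the kind of cost that indexing and parallel execution are meant to reduce.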
