Ensemble anomaly detection from multi-resolution trajectory features
  • Authors: Shin Ando (1)
    Theerasak Thanomphongphan (2)
    Yoichi Seki (1)
    Einoshin Suzuki (3)

    1. Division of Electronics and Informatics, Gunma University, 1-5-1 Tenjincho, Kiryu, Gunma, Japan
    2. Panasonic AVC Networks (Thailand) Co., Ltd., Bangsaothong, Samutprakarn, Thailand
    3. Department of Informatics, ISEE, Kyushu University, Fukuoka, Japan
  • Keywords: Behavioral data mining; Trajectory data mining; Multi-resolution features; Ensemble anomaly detection
  • Journal: Data Mining and Knowledge Discovery
  • Publication year: 2015
  • Publication date: January 2015
  • Volume: 29
  • Issue: 1
  • Pages: 39-83
  • Full-text size: 2,095 KB
  • Journal category: Computer Science
  • Journal subjects: Data Mining and Knowledge Discovery
    Computing Methodologies
    Artificial Intelligence and Robotics
    Statistics
    Statistics for Engineering, Physics, Computer Science, Chemistry and Geosciences
    Information Storage and Retrieval
  • Publisher: Springer Netherlands
  • ISSN: 1573-756X
Abstract
Numerical, sequential observations of behaviors, such as trajectories, have become an important subject for data mining and knowledge discovery research. Processing the raw observations into representative features of the behaviors involves an implicit choice of time-scale and resolution, which critically affects the final output of the mining techniques. The choice is associated with the parameters of data processing, e.g., smoothing and segmentation, which unintuitively yet strongly influence the intrinsic structure of the numerical data. Data mining techniques generally require users to provide an appropriately processed input, but selecting a resolution is an arduous task that may require an expensive, manual comparison of outputs across different settings. In this paper, we propose a novel ensemble framework for aggregating outcomes obtained under different settings of scale and resolution parameters for an anomaly detection task. Such a task is difficult for existing ensemble approaches based on weighted combination because (a) evaluating and weighting an output requires training samples of anomalies, which are generally unavailable, and (b) the detectability of anomalies can depend on the resolution, i.e., the distinction from normal instances may only be apparent within a small, selective range of parameters. In the proposed framework, predictions based on different resolutions are aggregated to construct meta-feature representations of the behavior instances. The meta-features provide the discriminative information for conducting clustering-based anomaly detection. The framework addresses two interrelated tasks of behavior analysis jointly, processing the numerical data and discovering anomalous patterns, providing an intuitive alternative to knowledge-intensive parameter selection. We also design an efficient clustering-based anomaly detection algorithm that reduces the computational burden of mining at multiple resolutions.
We conduct an empirical study of the proposed framework using real-world trajectory data. It shows that the proposed framework achieves a significant improvement over the conventional ensemble approach.
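
The idea described in the abstract (score every instance at several resolutions, stack the scores into a meta-feature vector, then cluster in that space) can be illustrated with a small sketch. This is not the authors' algorithm: the smoothing windows, the roughness feature, the median/MAD score, the deterministic 2-means step, and all function names below are assumptions chosen only to keep the example short and runnable on synthetic data.

```python
import math
from statistics import median


def moving_average(series, window):
    """Boxcar smoothing; `window` plays the role of the resolution parameter."""
    half = window // 2
    out = []
    for i in range(len(series)):
        lo, hi = max(0, i - half), min(len(series), i + half + 1)
        out.append(sum(series[lo:hi]) / (hi - lo))
    return out


def roughness(series):
    """Mean absolute step between consecutive points (illustrative feature)."""
    return sum(abs(b - a) for a, b in zip(series, series[1:])) / (len(series) - 1)


def robust_scores(values):
    """Per-resolution score: distance from the median in MAD units."""
    med = median(values)
    mad = median([abs(v - med) for v in values]) or 1e-9  # guard against MAD == 0
    return [abs(v - med) / mad for v in values]


def meta_features(trajectories, windows):
    """One row per trajectory, one column per resolution."""
    per_resolution = [
        robust_scores([roughness(moving_average(t, w)) for t in trajectories])
        for w in windows
    ]
    return [list(row) for row in zip(*per_resolution)]


def two_means(points, iterations=20):
    """Deterministic 2-means (assumed stand-in for the clustering step).
    Seeds: the componentwise median and the point farthest from it."""
    def dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    c0 = [median(col) for col in zip(*points)]
    centers = [c0, list(max(points, key=lambda p: dist(p, c0)))]
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [0 if dist(p, centers[0]) <= dist(p, centers[1]) else 1
                  for p in points]
        for k in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def detect(trajectories, windows=(1, 5, 15)):
    """Return indices of trajectories that fall in the minority cluster."""
    labels = two_means(meta_features(trajectories, windows))
    minority = min((0, 1), key=labels.count)
    return [i for i, lab in enumerate(labels) if lab == minority]


def make_trajectory(phase, jitter=0.0, n=200):
    """Synthetic 1-D trajectory: a sine wave plus optional alternating jitter."""
    return [math.sin(0.1 * i + phase) + jitter * (-1) ** i for i in range(n)]


normal = [make_trajectory(0.2 * k) for k in range(9)]
odd = make_trajectory(0.5, jitter=0.3)  # jitter is strongest at fine resolution
print(detect(normal + [odd]))
```

In this toy setting the jittered trajectory stands out most at the finest window and is largely smoothed away at coarser ones, which mirrors point (b) of the abstract: no single resolution needs to be selected in advance because the clustering operates on the stacked scores.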
