Classification-driven temporal discretization of multivariate time series
  • Authors: Robert Moskovitch; Yuval Shahar
  • Keywords: Temporal knowledge discovery; Temporal data mining; Temporal abstraction; Time intervals mining; Frequent pattern mining; Classification; Discretization
  • Journal: Data Mining and Knowledge Discovery
  • Publication year: 2015
  • Publication date: July 2015
  • Volume: 29
  • Issue: 4
  • Pages: 871-913
  • Full-text size: 2,391 KB
  • Affiliations: Robert Moskovitch (1) (2)
    Yuval Shahar (1)

    1. Department of Information Systems Engineering, Ben Gurion University, Beer Sheva, Israel
    2. Department of Biomedical Informatics, Systems Biology, and Medicine, Columbia University, New York, NY, USA
  • Journal category: Computer Science
  • Journal subjects: Data Mining and Knowledge Discovery
    Computing Methodologies
    Artificial Intelligence and Robotics
    Statistics
    Statistics for Engineering, Physics, Computer Science, Chemistry and Geosciences
    Information Storage and Retrieval
  • Publisher: Springer Netherlands
  • ISSN: 1573-756X
Abstract
Biomedical data, in particular electronic medical record data, include a large number of variables sampled irregularly, often comprising both time points and time intervals, which poses several challenges for analysis and data mining. Classification of multivariate time series data is a challenging task, but is often necessary for medical care or research. Increasingly, temporal abstraction, in which a series of raw-data time points is abstracted into a set of symbolic time intervals, is being used for classification of multivariate time series. In this paper, we introduce a novel supervised discretization method, geared towards enhancing classification accuracy, which determines the cutoffs that best discriminate among classes through the distribution of their states. We present a framework for classification of multivariate time series, which implements three phases: (1) application of a temporal-abstraction process that transforms a series of raw time-stamped data points into a series of symbolic time intervals (based on either unsupervised or supervised temporal abstraction); (2) mining these time intervals to discover frequent temporal-interval relation patterns (TIRPs), using versions of Allen’s 13 temporal relations; (3) using the patterns as features to induce a classifier. We evaluated the framework by comparing three versions of the new supervised method, temporal discretization for classification (TD4C), each relying on a different symbolic-state distribution-distance measure among outcome classes, with several commonly used unsupervised methods, on real datasets in the domains of diabetes, intensive care, and infectious hepatitis. Using only three abstract temporal relations resulted in better classification performance than using Allen’s seven relations, especially when using three symbolic states per variable. Performance was similarly better when using the horizontal support and mean duration as the TIRP feature representation, rather than a binary (existence) representation. The classification performance when using the three versions of TD4C was superior to the performance when using the unsupervised discretization methods (EWD, SAX, and KB).
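
The idea of choosing cutoffs by how differently the resulting symbolic states are distributed across outcome classes, followed by abstraction into symbolic time intervals (phase 1), can be illustrated with a minimal Python sketch. This is not the authors' TD4C implementation: the function names, the exhaustive search over a small candidate list, the use of a symmetric Kullback-Leibler divergence as the distance measure, and the toy data are all assumptions made for illustration.

```python
from itertools import combinations
from math import log

def state_of(value, cutoffs):
    """Symbolic state index: the number of cutoffs at or below the value."""
    return sum(value >= c for c in cutoffs)

def state_distribution(values, cutoffs, eps=1e-9):
    """Smoothed relative frequency of each state (eps avoids log of zero)."""
    counts = [0] * (len(cutoffs) + 1)
    for v in values:
        counts[state_of(v, cutoffs)] += 1
    total = len(values) + eps * len(counts)
    return [(c + eps) / total for c in counts]

def symmetric_kl(p, q):
    """Symmetric Kullback-Leibler divergence (one possible distance measure)."""
    return sum(pi * log(pi / qi) + qi * log(qi / pi) for pi, qi in zip(p, q))

def choose_cutoffs(values_by_class, candidates, n_states=3):
    """Pick n_states - 1 cutoffs maximizing the summed pairwise class distance."""
    best, best_score = None, float("-inf")
    for cutoffs in combinations(sorted(candidates), n_states - 1):
        dists = [state_distribution(v, cutoffs) for v in values_by_class.values()]
        score = sum(symmetric_kl(p, q) for p, q in combinations(dists, 2))
        if score > best_score:
            best, best_score = cutoffs, score
    return best

def to_intervals(series, cutoffs):
    """Merge consecutive same-state samples into (start, end, state) intervals."""
    intervals = []
    for t, v in series:
        s = state_of(v, cutoffs)
        if intervals and intervals[-1][2] == s:
            intervals[-1] = (intervals[-1][0], t, s)
        else:
            intervals.append((t, t, s))
    return intervals

# Toy data (hypothetical): raw values of one variable, grouped by outcome class.
values_by_class = {
    "case": [4.1, 4.5, 9.2, 9.8, 10.1],
    "control": [5.0, 5.5, 6.1, 6.4, 7.0],
}
cutoffs = choose_cutoffs(values_by_class, candidates=[5, 6, 7, 8, 9])

# One time-stamped series (time, value) abstracted with the chosen cutoffs.
series = [(0, 4.2), (1, 4.8), (2, 6.0), (3, 9.5), (4, 9.9)]
intervals = to_intervals(series, cutoffs)
print(cutoffs, intervals)
```

In this sketch the cutoffs are scored on pooled raw values per class; a real pipeline would also need to handle irregular sampling gaps before merging samples into intervals, which the sketch ignores.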
