Dissimilarity measures and divisive clustering for symbolic multimodal-valued data

Authors: Jaejik Kim (jaekim@georgiahealth.edu; jaejik@gmail.com); L. Billard
Keywords: Multimodal-valued data; Gowda–Diday dissimilarity measure; Ichino–Yaguchi dissimilarity measure; Divisive clustering
Journal: Computational Statistics and Data Analysis
Publication year: 2012
Journal code: 19_01679473
Category: cp
Publication date: September 2012
Volume: 56
Issue: 9
Pages: 2795-2808
File size: 390 K

Abstract

Most government agencies and local authorities routinely collect large amounts of data from censuses and surveys and publish them officially for public use. These data are most often published as statistical tables, and the underlying raw data are usually inaccessible because of privacy concerns. In such situations, the analysis must rely on the aggregated tables alone. These tables are typically summarized by ordinal or nominal items: tables for quantitative variables have histogram-valued formats, and those for qualitative variables are represented by multimodal-valued types. Both are classes of so-called symbolic data. In this study, we propose dissimilarity measures and a divisive clustering algorithm for symbolic multimodal-valued data. To split a partition efficiently at each stage, the algorithm extends the monothetic method for binary data. The proposed method is verified by simulation studies and applied to a dataset of work-related nonfatal injuries and illnesses.
