Cartogram visualization for nonlinear manifold learning models

Authors: Alfredo Vellido (1)
David L. García (1)
Àngela Nebot (1)
Keywords: Cartogram; Data visualization; Generative topographic mapping; Manifold learning; Nonlinear mapping distortion; Magnification factor
Journal: Data Mining and Knowledge Discovery
Publication year: 2013
Publication date: July 2013
Volume: 27
Issue: 1
Pages: 22-54
Full text size: 1716 KB
Author affiliations: Alfredo Vellido (1)
David L. García (1)
Àngela Nebot (1)

1. Departament de Llenguatges i Sistemes Informàtics, Universitat Politècnica de Catalunya, C. Jordi Girona, 1-3, 08034, Barcelona, Spain
ISSN: 1573-756X

Abstract

Real-world applications of multivariate data analysis often stumble upon the barrier of interpretability. Simple data analysis methods are usually easy to interpret, but they risk providing poor data models. More involved methods may instead yield faithful data models, but limited interpretability. This is the case of linear and nonlinear methods for multivariate data visualization through dimensionality reduction. Even though the latter have provided some of the most exciting visualization developments, their practicality is hindered by the difficulty of explaining them in an intuitive manner. The interpretability, and therefore the practical applicability, of data visualization through nonlinear dimensionality reduction (NLDR) methods would improve if, first, we could accurately calculate the distortion introduced by these methods in the visual representation and, second, if we could faithfully reintroduce this distortion into such representation. In this paper, we describe a technique for the reintroduction of the distortion into the visualization space of NLDR models. It is based on the concept of density-equalizing maps, or cartograms, recently developed for the representation of geographic information. We illustrate it using Generative Topographic Mapping (GTM), a nonlinear manifold learning method that can provide both multivariate data visualization and a measure of the local distortion that the model generates. Although illustrated here with GTM, it could easily be extended to other NLDR visualization methods, provided a local distortion measure could be calculated. It could also serve as a guiding tool for interactive data visualization.
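
As a rough illustration of what a "measure of the local distortion" can look like, the sketch below computes a magnification factor for a toy GTM-style mapping y(u) = W φ(u) from a 2-D latent space into a 3-D data space. It uses the standard differential-geometry result that a smooth mapping with Jacobian J scales local area by sqrt(det(JᵀJ)). Everything concrete here is an assumption made for the example and is not taken from the paper: the Gaussian basis functions, the 3x3 grid of centres, the random weight matrix `W` (a fitted GTM would learn it by EM), and the helper names `rbf_basis`, `jacobian` and `magnification_factor`.

```python
import numpy as np


def rbf_basis(u, centers, sigma):
    """Gaussian RBF activations phi_m(u) for a single 2-D latent point u."""
    d2 = np.sum((centers - u) ** 2, axis=1)
    return np.exp(-d2 / (2.0 * sigma ** 2))


def jacobian(u, centers, sigma, W):
    """Jacobian J (D x 2) of the mapping y(u) = W @ phi(u) at latent point u."""
    phi = rbf_basis(u, centers, sigma)                 # shape (M,)
    # d phi_m / d u = phi_m * (c_m - u) / sigma^2
    dphi = phi[:, None] * (centers - u) / sigma ** 2   # shape (M, 2)
    return W @ dphi                                    # shape (D, 2)


def magnification_factor(u, centers, sigma, W):
    """Local area scaling sqrt(det(J^T J)) of the mapping at latent point u."""
    J = jacobian(u, centers, sigma, W)
    # Clamp at zero to guard against tiny negative determinants from round-off.
    return np.sqrt(max(np.linalg.det(J.T @ J), 0.0))


# Hypothetical toy model: 9 basis functions on a 3x3 grid, 3-D data space.
rng = np.random.default_rng(0)
grid = np.linspace(-1.0, 1.0, 3)
centers = np.array([[a, b] for a in grid for b in grid])   # shape (9, 2)
sigma = 1.0
W = rng.normal(size=(3, centers.shape[0]))                 # random, not fitted

# Evaluate the distortion on a 5x5 lattice of latent points.
lattice = np.linspace(-1.0, 1.0, 5)
latent_points = np.array([[a, b] for a in lattice for b in lattice])
mf = np.array([magnification_factor(u, centers, sigma, W) for u in latent_points])

print(mf.reshape(5, 5).round(3))
```

Values of `mf` that vary across the lattice indicate that equal-sized cells of the latent grid correspond to unequal areas on the manifold in data space. A cartogram algorithm would then resize each latent cell in proportion to such a quantity; that step is not implemented in this sketch.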