Abstract
Many real-world datasets can be modeled as graphs, where each node corresponds to a data instance and each edge represents the relation or similarity between two nodes. To partition the nodes into clusters, spectral clustering finds the normalized minimum cut of the graph (in the relaxed sense). Although it is one of the most popular clustering schemes, spectral clustering is limited to a single graph. In practice, however, we often need to collectively consider rich information generated from multiple heterogeneous sources, e.g., scientific data (fMRI scans of different individuals), social data (different types of relationships among different people), and web data (multi-type content). Such complex datasets demand complex graph models. In this dissertation, we explore novel formulations that extend spectral clustering to a variety of complex graph models and study how to apply them to real-world problems. We start by incorporating pairwise constraints into spectral clustering, which extends it from the unsupervised setting to the semi-supervised setting. We then extend our constrained spectral clustering formulation from passive learning to active learning. We justify the effectiveness of our approach by exploring its link to a classic graph-based semi-supervised learning technique, namely label propagation. Finally, we study how to extend spectral clustering to the multi-view learning setting. Our proposed algorithms were not only tested on benchmark datasets but also successfully applied to real-world applications, such as machine-translation-aided document clustering and resting-state fMRI analysis.
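For readers unfamiliar with the single-graph baseline that the dissertation extends, the following is a minimal sketch of standard two-way spectral clustering via the relaxed normalized cut. It is a textbook illustration, not the dissertation's own algorithm; the function name `spectral_bipartition` and the toy adjacency matrix are assumptions made here for illustration only.

```python
import numpy as np

def spectral_bipartition(A):
    """Split a graph into two clusters using the relaxed normalized cut.

    A is a symmetric, non-negative affinity matrix. This sketch assumes
    the graph is connected and has no isolated nodes (every degree > 0).
    """
    A = np.asarray(A, dtype=float)
    d = A.sum(axis=1)                      # node degrees
    d_inv_sqrt = 1.0 / np.sqrt(d)

    # Symmetric normalized Laplacian: I - D^{-1/2} A D^{-1/2}
    L_sym = np.eye(len(A)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]

    # eigh returns eigenvalues in ascending order; the eigenvector of the
    # second-smallest eigenvalue is the relaxed cluster indicator.
    _, eigvecs = np.linalg.eigh(L_sym)
    f = d_inv_sqrt * eigvecs[:, 1]         # map back to the D-weighted solution

    # Round the relaxed solution to a discrete partition by its sign.
    return (f > 0).astype(int)

# Hypothetical toy graph: two triangles joined by one weak edge (weight 0.1).
A = np.array([
    [0, 1, 1,   0,   0, 0],
    [1, 0, 1,   0,   0, 0],
    [1, 1, 0,   0.1, 0, 0],
    [0, 0, 0.1, 0,   1, 1],
    [0, 0, 0,   1,   0, 1],
    [0, 0, 0,   1,   1, 0],
])
labels = spectral_bipartition(A)
```

On this toy graph the two triangles end up in different clusters. Which triangle receives label 0 and which receives label 1 depends on the arbitrary sign of the eigenvector.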