A branch-and-bound approach for maximum quasi-cliques

详细信息查看全文

作者：Foad Mahdavi Pajouh (1)
Zhuqi Miao (1)
Balabhaskar Balasundaram (1)
关键词：Clique ; Quasi ; clique ; Cluster detection ; Graph ; based data mining
刊名：Annals of Operations Research
出版年：2014
出版时间：May 2014
年：2014
卷：216
期：1
页码：145-161
全文大小：591 KB
参考文献：1. Abello, J., Pardalos, P. M., & Resende, M. G. C. (1999). On maximum clique problems in very large graphs. In J. Abello & J. Vitter (Eds.), / DIMACS series on discrete mathematics and theoretical computer science: Vol.聽 / 50. / External memory algorithms and visualization (pp. 119鈥?30). Providence: American Mathematical Society.
2. Abello, J., Resende, M. G. C., & Sudarsky, S. (2002). Massive quasi-clique detection. In S. Rajsbaum (Ed.), / LATIN 2002: proceedings of the 5th Latin American symposium on theoretical informatics (pp. 598鈥?12). London: Springer. CrossRef
3. Adamic, L., & Huberman, B. (2000). Power-law distribution of the World Wide Web. / Science, / 287, 2115a. CrossRef
4. Alba, R. D. (1973). A graph-theoretic definition of a sociometric clique. / The Journal of Mathematical Sociology, / 3(1), 113鈥?26. CrossRef
5. Almaas, E., & Barab谩si, A. L. (2006). Power laws in biological networks. In E. Koonin, Y. I. Wolf, & G. P. Karev (Eds.), / Power laws, scale-free networks and genome biology (pp. 1鈥?1). New York: Springer. CrossRef
6. Balasundaram, B., Butenko, S., & Trukhanov, S. (2005). Novel approaches for analyzing biological networks. / Journal of Combinatorial Optimization, / 10(1), 23鈥?9. CrossRef
7. Balasundaram, B., Butenko, S., & Hicks, I. V. (2011). Clique relaxations in social network analysis: the maximum / k-plex problem. / Operations Research, / 59(1), 133鈥?42. CrossRef
8. Barab谩si, A. L., & Albert, R. (1999). Emergence of scaling in random networks. / Science, / 286(5439), 509鈥?12. CrossRef
9. Barab谩si, A. L., Albert, R., & Jeong, H. (2000). Scale-free characteristics of random networks: the topology of the World Wide Web. / Physica. A, / 281(1鈥?), 69鈥?7. CrossRef
10. Batagelj, V., & Mrvar, A. (2006). Pajek datasets: Reuters terror news network. Online: http://vlado.fmf.uni-lj.si/pub/networks/data/CRA/terror.htm. Accessed March 2008.
11. Boginski, V., Butenko, S., & Pardalos, P. M. (2003). On structural properties of the market graph. In A.聽Nagurney (Ed.), / Innovation in financial and economic networks. London: Edward Elgar.
12. Boginski, V., Butenko, S., & Pardalos, P. (2006). Mining market data: a network approach. / Computers & Operations Research, / 33, 3171鈥?184. CrossRef
13. Bomze, I. M., Budinich, M., Pardalos, P. M., & Pelillo, M. (1999). The maximum clique problem. In D. Z. Du & P. M. Pardalos (Eds.), / Handbook of combinatorial optimization (pp. 1鈥?4). Dordrecht: Kluwer Academic. CrossRef
14. Broido, A., & Claffy, K. C. (2001). Internet topology: connectivity of IP graphs. In S. Fahmy & K. Park (Eds.), / Scalability and traffic control in IP networks (pp. 172鈥?87). Bellingham: SPIE. CrossRef
15. Brunato, M., Hoos, H., & Battiti, R. (2008). On effectively finding maximal quasi-cliques in graphs. In V.聽Maniezzo, R. Battiti, & J. P. Watson (Eds.), / Lecture notes in computer science: Vol.聽 / 5313. / Learning and intelligent optimization (pp. 41鈥?5). Berlin: Springer. CrossRef
16. Bu, D., Zhao, Y., Cai, L., Xue, H., Zhu, X., Lu, H., Zhang, J., Sun, S., Ling, L., Zhang, N., Li, G., & Chen, R. (2003). Topological structure analysis of the protein-protein interaction network in budding yeast. / Nucleic Acids Research, / 31(9), 2443鈥?450. CrossRef
17. Carlson, J. M., & Doyle, J. (1999). Highly optimized tolerance: a mechanism for power laws in designed systems. / Physical Review E, / 60(2), 1412鈥?427. CrossRef
18. Carraghan, R., & Pardalos, P. (1990). An exact algorithm for the maximum clique problem. / Operations Research Letters, / 9, 375鈥?82. CrossRef
19. Chung, F., & Lu, L. (2006). / CBMS lecture series. / Complex graphs and networks. Providence: American Mathematical Society.
20. Cook, D. J., & Holder, L. B. (2000). Graph-based data mining. / IEEE Intelligent Systems, / 15(2), 32鈥?1. CrossRef
21. Corman, S., Kuhn, T., McPhee, R., & Dooley, K. (2002). Studying complex discursive systems: centering resonance analysis of organizational communication. / Human Communication Research, / 28(2), 157鈥?06.
22. Corneil, D., & Perl, Y. (1984). Clustering and domination in perfect graphs. / Discrete Applied Mathematics, / 9, 27鈥?9. CrossRef
23. Dimacs (1995). Cliques, coloring, and satisfiability: second Dimacs implementation challenge. Online: http://dimacs.rutgers.edu/Challenges/. Accessed March 2007.
24. Erd枚s, P., & R茅nyi, A. (1959). On random graphs. / Publicationes Mathematicae, / 6, 290鈥?97.
25. Faloutsos, M., Faloutsos, P., & Faloutsos, C. (1999). On power-law relationships of the Internet topology. In / Proceedings of the ACM-SIGCOMM conference on applications, technologies, architectures, and protocols for computer communication, Cambridge (pp. 251鈥?62). CrossRef
26. Feige, U., Kortsarz, G., & Peleg, D. (2001). The dense / k-subgraph problem. / Algorithmica, / 29, 410鈥?21. CrossRef
27. Gagneur, J., Krause, R., Bouwmeester, T., & Casari, G. (2004). Modular decomposition of protein-protein interaction networks. / Genome Biology, / 5(8), R57. CrossRef
28. Girvan, M., & Newman, M. E. J. (2002). Community structure in social and biological networks. / Proceedings of the National Academy of Sciences of the United States of America, / 99(12), 7821鈥?826. CrossRef
29. Grossman, J., Ion, P., & Castro, R. D. (1995). The Erd枚s number project. Online: http://www.oakland.edu/enp/. Accessed March 2007.
30. IBM Corporation (2010). IBM ILOG CPLEX Optimizer 12.2. http://www.ibm.com/software/integration/optimization/cplex-optimizer/. IBM Academic Initiative. Accessed June 2011.
31. Ito, T., Chiba, T., Ozawa, R., Yoshida, M., Hattori, M., & Sakaki, Y. (2001). A comprehensive two-hybrid analysis to explore the yeast protein interactome. / Proceedings of the National Academy of Sciences of the United States of America, / 98(8), 4569鈥?574. CrossRef
32. Jiang, D., & Pei, J. (2009). Mining frequent cross-graph quasi-cliques. / ACM Transactions on Knowledge Discovery from Data, / 2(4), 16. CrossRef
33. Kortsarz, G., & Peleg, D. (1993). On choosing a dense subgraph. In / Proceedings of the 34th annual IEEE symposium on foundations of computer science (pp. 692鈥?01). Piscataway: IEEE Comput. Soc.
34. Kreher, D. L., & Stinson, D. R. (1998). / Combinatorial algorithms: generation, enumeration, and search (1st ed.). Boca Raton: CRC Press.
35. Leskovec, J., & Horvitz, E. (2008). Planetary-scale views on a large instant-messaging network. In / Proceeding of the 17th international conference on World Wide Web. WWW 鈥?8 (pp. 915鈥?24). New York: ACM. CrossRef
36. Lu, H., Zhu, X., Liu, H., Skogerb, G., Zhang, J., Zhang, Y., Cai, L., Zhao, Y., Sun, S., Xu, J., Bu, D., & Chen, R. (2004). The interactome as a tree鈥攁n attempt to visualize the protein-protein interaction network in yeast. / Nucleic Acids Research, / 32(16), 4804鈥?811. CrossRef
37. Luce, R. D. (1950). Connectivity and generalized cliques in sociometric group structure. / Psychometrika, / 15(2), 169鈥?90. CrossRef
38. Mokken, R. J. (1979). Cliques, clubs and clans. / Quality and Quantity, / 13(2), 161鈥?73. CrossRef
39. 脰sterg氓rd, P. R. J. (2002). A fast algorithm for the maximum clique problem. / Discrete Applied Mathematics, / 120, 197鈥?07. CrossRef
40. Patillo, J., Veremyev, A., Butenko, S., & Boginski, V. (2012). On the maximum quasi-clique problem. / Discrete Applied Mathematics. doi:10.1016/j.dam.2012.07.019 .
41. Pei, J., Jiang, D., & Zhang, A. (2005a). Mining cross-graph quasi-cliques in gene expression and protein interaction data. In / Proceedings of the 21st international conference on data engineering. ICDE 2005 (pp. 353鈥?56).
42. Pei, J., Jiang, D., & Zhang, A. (2005b). On mining cross-graph quasi-cliques. In / Proceedings of the eleventh ACM SIGKDD international conference on knowledge discovery in data mining. KDD 鈥?5 (pp.聽228鈥?38). New York: ACM. CrossRef
43. Peng, X., Langston, M. A., Saxton, A. M., Baldwin, N. E., & Snoddy, J. R. (2007). Detecting network motifs in gene co-expression networks through integration of protein domain information. In P. McConnell, S. M. Lin, & P. Hurban (Eds.), / Methods of microarray data analysis V (pp. 89鈥?02). New York: Springer. CrossRef
44. Seidman, S. B., & Foster, B. L. (1978). A graph theoretic generalization of the clique concept. / The Journal of Mathematical Sociology, / 6, 139鈥?54. CrossRef
45. Simonite, T. (2011). Bracing for the data deluge. http://www.technologyreview.com/business/37506/. Accessed May 2011.
46. Spirin, V., & Mirny, L. A. (2003). Protein complexes and functional modules in molecular networks. / Proceedings of the National Academy of Sciences of the United States of America, / 100(21), 12123鈥?2128. CrossRef
47. Washio, T., & Motoda, H. (2003). State of the art of graph-based data mining. / ACM SIGKDD Explorations Newsletter, / 5(1), 59鈥?8. CrossRef
48. West, D. (2001). / Introduction to graph theory. Upper Saddle River: Prentice-Hall.
49. Zeng, Z., Wang, J., Zhou, L., & Karypis, G. (2006). Coherent closed quasi-clique discovery from large dense graph databases. In / Proceedings of the 12th ACM SIGKDD international conference on knowledge discovery and data mining. KDD 鈥?6 (pp. 797鈥?02). New York: ACM. CrossRef
50. Zeng, Z., Wang, J., Zhou, L., & Karypis, G. (2007). Out-of-core coherent closed quasi-clique mining from large dense graph databases. / ACM Transactions on Database Systems, / 32, 13. CrossRef
作者单位：Foad Mahdavi Pajouh (1)
Zhuqi Miao (1)
Balabhaskar Balasundaram (1)

1. School of Industrial Engineering & Management, Oklahoma State University, Stillwater, OK, 74078, USA
ISSN：1572-9338

文摘

Detecting quasi-cliques in graphs is a useful tool for detecting dense clusters in graph-based data mining. Particularly in large-scale data sets that are error-prone, cliques are overly restrictive and impractical. Quasi-clique detection has been accomplished using heuristic approaches in various applications of graph-based data mining in protein interaction networks, gene co-expression networks, and telecommunication networks. Quasi-cliques are not hereditary, in the sense that every subset of a quasi-clique need not be a quasi-clique. This lack of heredity introduces interesting challenges in the development of exact algorithms to detect maximum cardinality quasi-cliques. The only exact approaches for this problem are limited to two mixed integer programming formulations that were recently proposed in the literature. The main contribution of this article is a new combinatorial branch-and-bound algorithm for the maximum quasi-clique problem.

地址：北京市海淀区学院路29号邮编：100083

电话：办公室：(+86 10)66554848；文献借阅、咨询服务、科技查新：66554700