Author affiliations: Manju Sardana (21), Baljeet Kaur (22), R. K. Agrawal (21)
21. School of Computer and Systems Sciences, Jawaharlal Nehru University, New Delhi, India, 110067 22. Hansraj College, Delhi University, Delhi, India, 110007
ISSN:1611-3349
Abstract
Microarray data are often characterized by high dimensionality and small sample size. Gene ranking is one of the most widely explored techniques for reducing the dimension because of its simplicity and computational efficiency. Many ranking methods have been proposed, and their effectiveness depends on the problem at hand. We investigate the performance of six ranking methods on eleven cancer microarray datasets. Performance is evaluated in terms of classification accuracy and the number of genes selected. Experimental results on all datasets show significant variation in classification accuracy, depending on the choice of ranking method and classifier. Empirical results show that the Brown-Forsythe test statistic and the Mutual Information method achieve high accuracy with few genes, whereas the Gini Index and the Pearson Coefficient perform poorly in most cases.
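As an illustration of what gene ranking with a test statistic can look like, the following is a minimal sketch and not the authors' implementation. It assumes the usual Brown-Forsythe statistic for equality of class means, F* = sum_k n_k (mean_k - grand mean)^2 / sum_k (1 - n_k/N) s_k^2, computed independently for each gene. The function names `brown_forsythe_scores` and `rank_genes`, the samples-by-genes layout of `X`, and the toy data are assumptions made for this example only.

```python
import numpy as np


def brown_forsythe_scores(X, y):
    """Return one Brown-Forsythe statistic per gene (column of X).

    X: array of shape (n_samples, n_genes) with expression values (assumed layout).
    y: array of shape (n_samples,) with class labels.
    Each class needs at least two samples so the sample variance is defined.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n_total = X.shape[0]
    grand_mean = X.mean(axis=0)

    numerator = np.zeros(X.shape[1])
    denominator = np.zeros(X.shape[1])
    for label in np.unique(y):
        group = X[y == label]
        n_k = group.shape[0]
        # Between-class spread, weighted by class size.
        numerator += n_k * (group.mean(axis=0) - grand_mean) ** 2
        # Within-class variance, weighted by (1 - n_k / N).
        denominator += (1.0 - n_k / n_total) * group.var(axis=0, ddof=1)

    # Genes that are constant within every class give a zero denominator;
    # this sketch does not handle that case specially.
    return numerator / denominator


def rank_genes(X, y, top_k):
    """Return indices of the top_k highest-scoring genes and all scores."""
    scores = brown_forsythe_scores(X, y)
    order = np.argsort(scores)[::-1]  # descending by score
    return order[:top_k], scores


# Toy data (hypothetical): 6 samples, 3 genes, 2 classes.
X_demo = np.array([
    [1.0, 1.0, 1.0],
    [2.0, 2.0, 5.0],
    [3.0, 3.0, 9.0],
    [11.0, 2.0, 4.0],
    [12.0, 1.0, 8.0],
    [13.0, 3.0, 12.0],
])
y_demo = np.array([0, 0, 0, 1, 1, 1])

top_genes, demo_scores = rank_genes(X_demo, y_demo, top_k=2)
print(top_genes, demo_scores)
```

In a filter-style pipeline, the selected column indices would then be used to subset the expression matrix before training a classifier; which classifier and how many genes to keep are choices this sketch leaves open.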