Performance Evaluation of Ranking Methods for Relevant Gene Selection in Cancer Microarray Datasets
  • Authors: Manju Sardana (21)
    Baljeet Kaur (22)
    R. K. Agrawal (21)
  • Keywords: Microarrays; Ranking method; Brown-Forsythe test; Gini Index; Mutual Information; Pearson Coefficient; Cochran test; Adjusted Welch test
  • Journal: Lecture Notes in Computer Science
  • Publication year: 2013
  • Volume: 7629
  • Issue: 1
  • Pages: 419-431
  • Full-text size: 292KB
  • Author affiliations: Manju Sardana (21)
    Baljeet Kaur (22)
    R. K. Agrawal (21)

    21. School of Computer and Systems Sciences, Jawaharlal Nehru University, New Delhi, India, 110067
    22. Hansraj College, Delhi University, Delhi, India, 110007
  • ISSN: 1611-3349
Abstract
Microarray data are often characterized by high dimensionality and small sample size. Gene ranking is one of the most widely explored dimensionality-reduction techniques because of its simplicity and computational efficiency. Many ranking methods have been proposed, and their effectiveness depends on the problem at hand. We investigate the performance of six ranking methods on eleven cancer microarray datasets. Performance is evaluated in terms of classification accuracy and the number of genes selected. Experimental results on all datasets show significant variation in classification accuracy, depending on the choice of ranking method and classifier. Empirical results show that the Brown-Forsythe test statistic and the Mutual Information method achieve high accuracy with few genes, whereas the Gini Index and Pearson Coefficient perform poorly in most cases.
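
The gene-ranking idea summarized in the abstract can be illustrated with a minimal sketch. This record does not show the authors' implementation, so the code below is only an assumption-laden illustration: it scores each gene with the standard Brown-Forsythe F* statistic (between-class spread of means divided by a weighted sum of within-class variances) and keeps the highest-scoring genes. The function names `brown_forsythe_score` and `rank_genes`, and the toy expression matrix, are hypothetical and invented for this example.

```python
from statistics import mean, variance


def brown_forsythe_score(values, labels):
    """Brown-Forsythe F* for one gene: larger means better class separation.

    F* = sum_i n_i (mean_i - grand_mean)^2 / sum_i (1 - n_i / N) s_i^2
    Assumes every class has at least two samples (needed for the variance).
    """
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    n_total = len(values)
    grand_mean = mean(values)
    between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
    within = sum((1 - len(g) / n_total) * variance(g) for g in groups.values())
    return between / within if within > 0 else float("inf")


def rank_genes(expression, labels, top_k):
    """Return the column indices of the top_k genes, best score first.

    expression: list of samples, each a list of per-gene expression values.
    """
    n_genes = len(expression[0])
    scores = [
        brown_forsythe_score([sample[j] for sample in expression], labels)
        for j in range(n_genes)
    ]
    order = sorted(range(n_genes), key=lambda j: scores[j], reverse=True)
    return order[:top_k]


# Toy data (hypothetical): 6 samples x 3 genes, two classes.
# Only gene 0 differs between the classes.
expression = [
    [1.0, 5.0, 0.2],
    [1.2, 4.0, 0.1],
    [0.9, 6.0, 0.3],
    [3.0, 5.5, 0.2],
    [3.1, 4.5, 0.1],
    [2.9, 5.0, 0.3],
]
labels = [0, 0, 0, 1, 1, 1]

top_genes = rank_genes(expression, labels, top_k=1)
print(top_genes)  # gene 0 is ranked first
```

In a study of the kind the abstract describes, the selected gene subset would then be passed to a classifier and accuracy recorded for increasing values of `top_k`; that evaluation loop is omitted here because the record does not specify the classifiers or the validation protocol.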
