Semi-supervised SVM-based Feature Selection for Cancer Classification using Microarray Gene Expression Data
  • Keywords: Support vector machines; Semi-supervised; Feature selection; Cancer; Gene expression
  • Journal: Lecture Notes in Computer Science
  • Publication year: 2015
  • Volume: 9101
  • Issue: 1
  • Pages: 468-477
  • Full-text size: 620 KB
  • Author affiliations: Jun Chin Ang (9)
    Habibollah Haron (9)
    Haza Nuzly Abdull Hamed (9)

    9. Department of Computer Science, Faculty of Computing, Universiti Teknologi Malaysia, Skudai, Johor, Malaysia
  • Book series: Current Approaches in Applied Artificial Intelligence
  • ISBN: 978-3-319-19066-2
  • Publication category: Computer Science
  • Publication subjects: Artificial Intelligence and Robotics
    Computer Communication Networks
    Software Engineering
    Data Encryption
    Database Management
    Computation by Abstract Devices
    Algorithm Analysis and Problem Complexity
  • Publisher: Springer Berlin / Heidelberg
  • ISSN: 1611-3349
Abstract
Gene expression data suffer from high dimensionality, so feature selection is a fundamental tool in cancer classification. Unlabelled data can be collected easily and are useful for improving classification accuracy, whereas label information is usually difficult to obtain because labelling is tedious, costly and error-prone. Previous studies of gene selection are mostly dedicated to supervised and unsupervised approaches, and the support vector machine (SVM) is a common supervised technique for gene selection and cancer classification. This paper therefore proposes a semi-supervised SVM-based feature selection method (S\(^3\)VM-FS) that simultaneously exploits knowledge from unlabelled and labelled data. Experimental results on lung cancer gene expression data show that S\(^3\)VM-FS achieves higher accuracy and requires shorter processing time than the well-known supervised method, SVM-based recursive feature elimination (SVM-RFE), and the improved method, S\(^3\)VM-RFE.
