Diagnosis of Breast Cancer Based on Tumor Parameters and Visualization of the Attribute Partial Order Structure Diagram
  • English title: Diagnosis of Breast Cancer Based on Tumor Parameters and Visualization of the Attribute Partial Order Structure Diagram
  • Authors: Liang Huaixin (梁怀新); Song Jialin (宋佳霖); Zheng Cunfang (郑存芳); Hong Wenxue (洪文学)
  • Keywords: Lasso; incremental learning; attribute partial order structure diagram; visualization; breast cancer diagnosis
  • Chinese journal title (code): ZSWY
  • English journal title: Chinese Journal of Biomedical Engineering
  • Affiliations: Institute of Electrical Engineering, Yanshan University; LiRen College, Yanshan University
  • Publication date: 2018-08-20
  • Publisher: Chinese Journal of Biomedical Engineering (中国生物医学工程学报)
  • Year: 2018
  • Issue: v.37; No.179
  • Funding: National Natural Science Foundation of China (61273019, 81373767, 61501397, 61201111); Key Project of the Natural Science Foundation of Hebei Province (F2016203443)
  • Language: Chinese
  • Page: ZSWY201804003
  • Number of pages: 10
  • CN: 04
  • ISSN: 11-2057/R
  • Classification number: 23-32
Abstract
To visualize diagnostic rules in breast cancer data, a rule-extraction method for breast cancer diagnosis is proposed that combines Lasso with incremental learning and uses an improved attribute partial order structure diagram as the visualization tool. Using breast cancer data as the source, the algorithm has four steps. First, Lasso feature selection reduces the dimensionality by choosing the 4 most strongly associated features out of 9. Second, the continuous data are granulated based on the Gini index, and the formal context is generated dynamically by incremental learning. Third, a second Lasso screening reduces the number of dimensions from 17 to 3. Finally, a new row-and-column optimization method based on the Gini index and covered objects generates the attribute partial order structure diagram that visualizes the rules, from which 7 rules are extracted. Compared with mainstream classifiers, rule extraction with this algorithm achieved a diagnostic accuracy of 96.52%, higher than random forest (94.25%), Adaboost (90.00%), 1NN (91.33%), 3NN (90.67%), and support vector machine (95.00%). Lastly, data at different incremental proportions (10%-90%) were used to verify the incremental learning algorithm: the pattern was already complete once the sequentially learned data reached 30%, and at 20% the accuracy was already close to that of the support vector machine. This shows that the method is an effective means of rule discovery for diagnostic visualization.
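The granulation step can be illustrated with a minimal sketch. The abstract only states that continuous data are granulated based on the Gini index, so everything below is an assumption for illustration rather than the authors' procedure: a single binary cut chosen to minimize the weighted Gini impurity, the function names `gini`, `best_gini_split`, and `granulate`, the feature name `x`, and the toy values and labels.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k ** 2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_gini_split(values, labels):
    """Return (threshold, weighted_gini) for the best single binary cut.

    Candidate thresholds are midpoints between consecutive distinct
    sorted values. If no cut improves on the unsplit impurity, the
    threshold is None.
    """
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    n = len(pairs)
    best = (None, gini(labels))
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no threshold can separate equal values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = ((lo + hi) / 2, score)
    return best


def granulate(values, threshold, name):
    """Map a continuous feature to two binary attributes, i.e. two
    candidate columns of a formal context (hypothetical naming)."""
    return [
        {f"{name}<={threshold}": v <= threshold, f"{name}>{threshold}": v > threshold}
        for v in values
    ]


# Toy data, invented for illustration only.
values = [1, 2, 3, 8, 9, 10]
labels = ["benign"] * 3 + ["malignant"] * 3
threshold, score = best_gini_split(values, labels)
rows = granulate(values, threshold, "x")
```

In this toy case the classes separate cleanly, so the chosen cut falls midway between 3 and 8 and the weighted impurity drops to zero. Each resulting binary attribute would become one column of the formal context; a real granulation may use several cuts per feature.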
