Study on Building a Clinical Text Natural Language Processing System: Taking cTAKES as an Example
  • English title: Study on the Building of a Clinical Text Natural Language Processing System: Taking cTAKES as an Example
  • Authors: YANG Chenliu; HU Jiahui; FANG An; WANG Lei; REN Huiling
  • Keywords: clinical text; cTAKES; modularization; corpus; natural language processing (NLP)
  • Chinese journal name: YXQB
  • English journal name: Journal of Medical Informatics
  • Institution: Institute of Medical Information, Chinese Academy of Medical Sciences
  • Publication date: 2018-12-25
  • Publisher: Journal of Medical Informatics (医学信息学杂志)
  • Year: 2018
  • Issue: v.39; No.289
  • Funding: Chinese Academy of Medical Sciences Medical and Health Science and Technology Innovation Project (No. 2017-I2M-3-014); Chinese Academy of Medical Sciences Basic Research Funds for Central Public Welfare Research Institutes (No. 2018PT33005)
  • Language: Chinese
  • Page: YXQB201812015
  • Page count: 6
  • CN: 12
  • ISSN: 11-5447/R
  • Classification number: 52-57
Abstract
This paper describes how cTAKES, a clinical text natural language processing system, was built, covering three aspects: system architecture, corpus construction, and application results. It then offers recommendations for building a Chinese clinical text natural language processing system in five areas: designing the system architecture on an open source framework, developing modular components, building clinical corpora, emphasizing innovation, and tailoring the system to the characteristics of the Chinese language.
