Analysis of API Discussion Topics in Software Development Q&A Websites Based on Entity Recognition
  • English title: ANALYSIS OF API DISCUSSION TOPICS IN SOFTWARE DEVELOPMENT Q&A WEBSITE BASED ON ENTITY RECOGNITION
  • Authors: He Xiaojian (和晓健); Peng Xin (彭鑫); Zhao Wenyun (赵文耘)
  • Keywords: API; entity recognition; software development Q&A website
  • Journal code: JYRJ
  • Journal: Computer Applications and Software (计算机应用与软件)
  • Institutions: Software School, Fudan University; Shanghai Key Laboratory of Data Science
  • Publication date: 2019-07-12
  • Publisher: Computer Applications and Software (计算机应用与软件)
  • Year: 2019
  • Issue: v.36
  • Funding: Ministry of Science and Technology Key R&D Program project (2016YFB1000801)
  • Language: Chinese
  • Page: JYRJ201907037
  • Page count: 6
  • CN: 07
  • ISSN: 31-1260/TP
  • Classification number: 219-223+229
Abstract
Software development Q&A websites are widely used in the software field, but few studies have examined how their users discuss APIs. Studying the sentence patterns and semantics of these API discussions can help later researchers build better natural language processing programs, automatically extract the core content of user discussions, and carry out other research. We collected a set of Java and Android APIs, defined rules to generate an API alias library, and used text matching to recognize API entities in Stack Overflow posts. After manually analyzing user discussions of 10 commonly used APIs on Stack Overflow, we reached four conclusions: in irregular sentences, APIs often appear in import statements and assignment expressions; in regular sentences, they act as subjects and objects; users tend to discuss program errors, principles and usage, and comparisons of similar APIs; and users habitually omit method parameters or overly long fully qualified names.
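The alias-library and text-matching steps named in the abstract could be sketched roughly as follows. The record does not give the paper's actual rules, so the alias rules here (drop the parameter list, drop the package prefix, keep the bare method name) are assumptions chosen to mirror the reported habit of omitting parameters and long fully qualified names. The function names `generate_aliases`, `build_alias_index`, and `find_api_mentions` are hypothetical, as are the sample signatures and post text.

```python
import re
from collections import defaultdict


def generate_aliases(signature):
    """Return assumed aliases for one fully qualified API signature.

    These rules are illustrative guesses, not the paper's rule set.
    """
    aliases = {signature}
    name = signature.split("(", 1)[0]      # drop the parameter list
    aliases.add(name)
    parts = name.split(".")
    # Java convention: the first capitalized segment is taken as the class name.
    class_idx = next((i for i, p in enumerate(parts) if p[:1].isupper()), None)
    if class_idx is not None:
        short = ".".join(parts[class_idx:])  # drop the package prefix
        aliases.add(short)
        if "(" in signature:                 # the signature is a method
            aliases.add(short + "()")
            aliases.add(parts[-1] + "()")    # bare method name, often ambiguous
    return aliases


def build_alias_index(signatures):
    """Map every alias to the set of full signatures it may refer to."""
    index = defaultdict(set)
    for sig in signatures:
        for alias in generate_aliases(sig):
            index[alias].add(sig)
    return dict(index)


def find_api_mentions(text, index):
    """Return (matched text, candidate signatures) pairs found in a post."""
    if not index:
        return []
    # Longest aliases first, so a qualified name wins over its short form.
    alternatives = sorted(index, key=len, reverse=True)
    # The lookbehind rejects matches preceded by a word character or a dot,
    # which avoids partial hits but also skips calls such as "list.add()".
    pattern = re.compile(
        r"(?<![\w.])(?:" + "|".join(re.escape(a) for a in alternatives) + r")(?!\w)"
    )
    return [(m.group(0), sorted(index[m.group(0)])) for m in pattern.finditer(text)]


if __name__ == "__main__":
    sample_index = build_alias_index(
        ["java.util.ArrayList", "java.util.ArrayList.add(E)"]
    )
    sample_post = "I import java.util.ArrayList and call ArrayList.add() in a loop."
    for mention, candidates in find_api_mentions(sample_post, sample_index):
        print(mention, "->", candidates)
```

An alias such as `add()` can map to many signatures, so the sketch returns every candidate rather than picking one; resolving that ambiguity would need context the abstract does not describe.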
