Research and Analysis on Key Technologies of Text Geocoding (文本地理编码关键技术研究与分析)
  • English title: Research and analysis on key technologies of text geocoding
  • Authors: YAN Mengyu; ZHONG Zhinong; JING Ning; WU Ye
  • Affiliation: College of Electronic Science, National University of Defense Technology
  • Keywords: geographic information retrieval; geocoding; geographic entity recognition; geographic entity disambiguation; text location focus; language model
  • Journal code: CHTB
  • Journal: Bulletin of Surveying and Mapping (测绘通报)
  • Publication date: 2019-05-25
  • Publisher: Bulletin of Surveying and Mapping (测绘通报)
  • Year: 2019
  • Issue: No. 506
  • Language: Chinese
  • Article ID: CHTB201905015
  • Page count: 5
  • Issue within year: 05
  • CN: 11-2246/P
  • Pages: 80-84
Abstract
With the growth of Internet applications, most of the unstructured text they generate is associated with geographic locations, and geographic information retrieval (GIR) has therefore become a research hotspot in both the GIS and IR fields. Text geocoding is the process of establishing a correspondence between text and geographic coordinates, and it is the foundation of GIR. This paper classifies and summarizes the key technologies involved in text geocoding, namely geographic entity recognition, geographic entity disambiguation, text location focus, and regional language modeling. It then outlines future research directions and the challenges facing the field, offering new ideas for further research on text geocoding.
