Using Statistical Search to Discover Semantic Relations of Political Lexica – Evidences from Bulgarian-Slovak EUROPARL 7 Corpus

Keywords: Data mining; Combinatorics on words; Machine translation
Journal: Lecture Notes in Computer Science
Year: 2016
Volume: 9582
Issue: 1
Pages: 335-339
Full-text size: 2,346 KB
Author affiliation: Velislava Stoykova (16)

16. Institute for Bulgarian Language - BAS, 52, Shipchensky Proh. Str., Bl. 17, 1113, Sofia, Bulgaria
Series title: Mathematical Aspects of Computer and Information Sciences
ISBN: 978-3-319-32859-1
Category: Computer Science
Subjects: Artificial Intelligence and Robotics
Computer Communication Networks
Software Engineering
Data Encryption
Database Management
Computation by Abstract Devices
Algorithm Analysis and Problem Complexity
Publisher: Springer Berlin / Heidelberg
ISSN: 1611-3349
Volume order: 9582

Abstract

The paper presents a statistical approach to discovering semantic relations of political lexica using the parallel Bulgarian-Slovak EUROPARL 7 Corpus. It employs the statistical properties incorporated in the Sketch Engine software to generate concordances, co-occurrences and collocations. It offers a comparative analysis of the semantic structure of political lexica, investigating synonymic, attributive and reciprocal semantic relations of the most frequent keywords in the two parallel corpora, for both Bulgarian and Slovak. The paper addresses some issues related to the correct discovery of terms, their translation and their use in political speech. Finally, more general conclusions about the semantic properties of political lexica are presented.
