Automatic parallelization of XQuery programs on multi-core systems
详细信息    查看全文
  • 作者:Rongxin Chen ; Husheng Liao ; Zongyue Wang ; Hang Su
  • 关键词:XQuery ; Automatic parallelization ; Intermediate language ; Cost model ; Multi ; core computing
  • 刊名:The Journal of Supercomputing
  • 出版年:2016
  • 出版时间:April 2016
  • 年:2016
  • 卷:72
  • 期:4
  • 页码:1517-1548
  • 全文大小:1,632 KB
  • 参考文献:1.Alexandrov A, Ewen S, Heimel M, Hueske F, Kao O, Markl V, Nijkamp E, Warneke D (2011) MapReduce and PACT-comparing data parallel programming models. In: Proceedings of the 14th conference on database systems for business, technology, and web (BTW), pp 25–44
    2.Bamford R, Borkar V, Brantner M, Fischer PM, Florescu D, Graf D, Kossmann D, Kraska T, Muresan D, Nasoi S (2009) XQuery reloaded. Proc VLDB Endow 2(2):1342–1353CrossRef
    3.Berglund A, Boag S, Chamberlin D, Fernandez MF, Kay M, Robie J, Siméon J (2015) XML Path Language (XPath) 2.0 (Second Edition). http://​www.​w3.​org/​TR/​xpath20/​
    4.Berglund A, Fernández M, Malhotra A, Marsh J, Nagy M, Walsh N (2015) XQuery 1.0 and XPath 2.0 Data Model (XDM) (Second Edition). http://​www.​w3.​org/​TR/​xpath-datamodel/​
    5.Bidoit N, Colazzo D, Sartiani C, Solimando A, Ulliana F (2015) Andromeda: a system for processing queries and updates on big XML documents. In: Morzy T, Valduriez P, Bellatreche L (eds) New trends in databases and information systems: ADBIS 2015 short papers and workshops, BigDap, DCSA, GID, MEBIS, OAIS, SW4CH, WISARD. Springer International Publishing, Cham, pp 218–228. doi:10.​1007/​978-3-319-23201-0_​24
    6.Boag S, Chamberlin D, Fernández MF, Florescu D, Robie J, Siméon J, Stefanescu M (2015) XQuery 1.0: An XML Query Language (Second Edition). http://​www.​w3.​org/​TR/​xquery/​
    7.Bordawekar R, Lim L, Kementsietsidis A, Kok BWL (2010) Statistics-based parallelization of XPath queries in shared memory systems. In: Proceedings of the 13th International Conference on Extending Database Technology (EDBT2010), pp 159–170
    8.Camachorodríguez J, Colazzo D, Manolescu I (2014) PAXQuery: a massively parallel XQuery processor. In: Proceedings of workshop on data analytics in the cloud (Danac’14), pp 1–4. doi:10.​1145/​2627770.​2627772
    9.Carman EP Jr, Vinayak TW, Borkar R, Michael J, Carey, Tsotras VJ (2015) Apache VXQuery: a scalable XQuery implementation. arXiv:​1504.​00331 (CoRR)
    10.Chamberlin D, Fankhauser P, Florescu D, Marchiori M, Robie J (2007) XML query use cases. http://​www.​w3.​org/​TR/​xquery-use-cases/​
    11.Chen R, Liao H, Chen W (2011) XML parsing schema based on parallel sub-tree construction. Comput Sci 38(3):191–194
    12.Damigos M, Gergatsoulis M, Plitsos S (2014) Distributed processing of XPath queries using mapReduce. In: Catania B, Cerquitelli T, Chiusano S et al. (eds) New trends in databases and information systems: 17th east european conference on advances in databases and information systems. Springer International Publishing, Cham, pp 69–77. doi:10.​1007/​978-3-319-01863-8_​8
    13.Dongarra J (2007) The promise and perils of the coming multicore revolution and its impact. CTWatch Q 3(1):1–33MathSciNet
    14.Draper D, Fankhauser P, Fernández MF, Malhotra A, Rose K, Rys M, Siméon J, Wadler P (2015) XQuery 1.0 and XPath 2.0 Formal Semantics (Second Edition). http://​www.​w3.​org/​TR/​xquery-semantics/​
    15.Fegaras L, Dash R, Wang YH (2006) A fully pipelined XQuery processor. In: XQuery Implementation, Experience and Perspectives (XIME-P’06) Workshop
    16.Fernández M, Jim T, Morton K, Onose N, Simeon J (2007) DXQ: a distributed XQuery scripting language. In: Proceedings of the 4th international workshop on XQuery implementation, experience and perspectives, pp 1–6
    17.Fluet M, Rainey M, Reppy J, Shaw A (2008) Implicitly-threaded parallelism in Manticore. ACM Sigplan Not 9:119–130CrossRef MATH
    18.Frasincar F, Houben G-J, Pau C (2002) XAL: an algebra for XML query optimization. Aust Comput Sci Commun 24 (2):49–56. doi:10.​1145/​563932.​563912
    19.Hammond K (1994) Parallel functional programming: an introduction. In: International Symposium on Parallel Symbolic Computation, Citeseer
    20.Herlihy M, Shavit N (2008) The art of multiprocessor programming. Morgan Kaufmann Publisher, Burlington
    21.Hidaka S, Kato H, Yoshikawa M (2007) A relative cost model for XQuery. In: Proceedings of the 2007 ACM symposium on Applied computing, pp 1332–1333
    22.Jones SP, Leshchinskiy R, Keller G, Chakravarty MMT (2008) Harnessing the multicores: nested data parallelism in Haskell. In: IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2008) Dagstuhl, Citeseer, Germany
    23.Kahn G (1987) Natural semantics. Springer, HeidelbergCrossRef MATH
    24.Kepser S (2004) A simple proof for the Turing-completeness of XSLT and XQuery. In: Extreme Markup Languages
    25.Khatchadourian S, Consens MP, Siméon J (2011) Having a ChuQL at XML on the Cloud. In: Mendelzon Int’l A. Workshop
    26.Kling P, Ozsu MT, Daudjee K (2010) Distributed XML query processing: Fragmentation, localization and pruning. University of Waterloo, Technical Report CS-2010-02
    27.Li X (2006) Efficient and parallel evaluation of XQuery. The Ohio State University
    28.Liao H (2013) Principles and implementation techniques of XQuery. Science Press, Beijing
    29.Liao H, Shan W, Gao H (2013) Automatic parallelization of XQuery programs. J Softw 8(4):842–851CrossRef
    30.Liao H, Tang L (2009) A decorrelation method based on XQA Query algebra. J Beijing Univ Technol 35(8):1108–1114
    31.Loogen R, Ortega-Mallén Y, Peña-Marí R (2005) Parallel functional programming in Eden. J Funct Program 15(03):431–475CrossRef MATH
    32.Lu W, Gannon D (2007) Parallel XML processing by work stealing. In: Proceedings of the 2007 workshop on Service-oriented computing performance: aspects, issues, and approaches, pp 31–38
    33.Machdi I, Amagasa T, Kitagawa H (2010) Parallel holistic twig joins on a multi-core system. Int J Web Inform Syst 6(2):149–177CrossRef
    34.Miao H, Nie T, Yue D, Zhang T, Liu J (2012) Algebra for parallel XQuery processing. In: Bao Z, Gao Y, Gu Y et al. (eds) Web-Age information management: WAIM 2012 international workshops: GDMM, IWSN, MDSP, USDM, and XMLDM. Springer, Berlin, Heidelberg, pp 1–10. doi:10.​1007/​978-3-642-33050-6_​1
    35.Quan Y, Liao H (2014) A Task Scheduling Algorithm for Automated Parallel Processing of XQuery. In: the Sixth International Symposium on Parallel Architectures, Algorithms and Programming (PAAP2014) 2014. pp 250–254
    36.Re C, Brinkley J, Hinshaw K, Suciu D (2004) Distributed xquery. In: Workshop on Information Integration on the Web, Citeseer, pp 116–121
    37.Re C, Siméon J, Fernandez M (2006) A complete and efficient algebraic compiler for XQuery. In: Proceedings of the 22nd International Conference on Data Engineering (ICDE’06), pp 14–25
    38.Schmidt A, Waas F, Kersten M, Carey MJ, Manolescu I, Busse R (2002) XMark: a benchmark for XML data management. In: Proceedings of the 28th international conference on Very Large Data Bases, 2002. VLDB Endowment, pp 974–985
    39.Seshadri P, Pirahesh H, Leung TYC (1996) Complex query decorrelation. In: Proceedings of the Twelfth International Conference on Data Engineering, pp 450–458
    40.Shnaiderman L, Shmueli O (2015) Multi-Core Processing of XML Twig Patterns. IEEE Trans Knowl Data Eng 27(4):1057–1070CrossRef
    41.Sivaramakrishnan KC, Ziarek L, Prasad R, Jagannathan S (2010) Lightweight asynchrony using parasitic threads. In: Proceedings of the 5th ACM SIGPLAN workshop on Declarative aspects of multicore programming (DAMP2010), pp 63–72
    42.Spoonhower D, Blelloch GE, Harper R (2007) A Semantic Framework for Scheduling Parallel Programs. Computer Science Department 886
    43.Sutter H (2005) The free lunch is over: a fundamental turn toward concurrency in software. http://​www.​gotw.​ca/​publications/​concurrency-ddj.​htm
    44.Wilkinson B, Allen M (2005) Parallel programming: techniques and applications using networked workstations and parallel computers, 2nd edn. Prentice hall, New Jersey
    45.Yui M, Miyazaki J, Uemura S, Kato H (2008) XBird/D: distributed and parallel XQuery processing using remote proxy. In: Proceedings of the 2008 ACM symposium on Applied computing, pp 1003–1007
    46.Zhang X, Liao H (2010) A framework for XQuery system with XML algebra and tree pattern query. J Front Comput Sci Technol 4(11):996–1004MathSciNet
    47.Zhang Y, Tang N, Boncz P (2009) Efficient distribution of full-fledged XQuery. In: IEEE 25th International Conference on Data Engineering (ICDE’09) IEEE, pp 565–576
  • 作者单位:Rongxin Chen (1)
    Husheng Liao (2)
    Zongyue Wang (1)
    Hang Su (3)

    1. Computer Engineering College, Jimei University, Xiamen, China
    2. School of Software Engineering, Beijing University of Technology, Beijing, China
    3. College of Computer Science, Beijing University of Technology, Beijing, China
  • 刊物类别:Computer Science
  • 刊物主题:Programming Languages, Compilers and Interpreters
    Processor Architectures
    Computer Science, general
  • 出版者:Springer Netherlands
  • ISSN:1573-0484
文摘
The popularity of multi-core systems makes software parallelization become an important way to improve performance. As a mainstream XML query language, XQuery is the core of XML processing. It is critical to take full advantage of multi-core computing to improve XML processing performance through parallelization of XQuery. However, usually it is difficult to parallelize XQuery programs because of the nested style of XQuery expressions. Moreover, implicit parallelism is necessary to simplify the development of parallel XML application. In this paper, we propose an automatic parallelization approach, which can automatically select proper types of parallelism for a specific XQuery query. Specifically, we propose a functional intermediate language called pFXQL (parallel Functional XML Query Language) to describe parallel query plans. pFXQL has parallel semantics and is well complied with XQuery. We propose a cost model to effectively support plan generation and the selection of a preferred plan. The model estimates both computational cost and parallel cost. We implement our approach in XQuery engine and conduct experiments on various multi-core systems. Experimental results verify the effectiveness of our approach.

© 2004-2018 中国地质图书馆版权所有 京ICP备05064691号 京公网安备11010802017129号

地址:北京市海淀区学院路29号 邮编:100083

电话:办公室:(+86 10)66554848;文献借阅、咨询服务、科技查新:66554700