MapReduce based location selection algorithm for utility maximization with capacity constraints
详细信息    查看全文
  • 作者:Yu Sun (1)
    Jianzhong Qi (1)
    Rui Zhang (1)
    Yueguo Chen (2)
    Xiaoyong Du (2)

    1. Department of Computing and Information Systems
    ; University of Melbourne ; Melbourne ; Australia
    2. Key Laboratory of Data Engineering and Knowledge Engineering
    ; Renmin University of China ; Beijing ; China
  • 关键词:Location selection ; Capacity constraints ; MapReduce ; 68W15
  • 刊名:Computing
  • 出版年:2015
  • 出版时间:April 2015
  • 年:2015
  • 卷:97
  • 期:4
  • 页码:403-423
  • 全文大小:973 KB
  • 参考文献:1. Al-Khateeb A, Rashid NA, Abdullah R (2012) An enhanced meta-scheduling system for grid computing that considers the job type and priority. Computing, pp 389鈥?10
    2. Dean J, Ghemawat S (2004) Mapreduce: Simplified data processing on large clusters. OSDI, pp 137鈥?50
    3. Gufler B, Augsten N, Reiser A, Kemper A (2011) Handling data skew in mapreduce. In: The first international conference on cloud computing and services, science, pp 574鈥?83
    4. Gufler B, Augsten N, Reiser A, Kemper A (2012) Load balancing in mapreduce based on scalable cardinality estimates. ICDE, pp 522鈥?33
    5. Hale, TS, Moberg, CR (2003) Location science research: a review. Ann Oper Res 123: pp. 21-35 CrossRef
    6. Huang J, Wen Z, Pathan M, Taylor K, Xue Y, Zhang R (2011) Ranking locations for facility selection based on potential influences. In: The 37th annual conference on IEEE industrial electronics society, pp 2411鈥?416
    7. Huang J, Wen Z, Qi J, Zhang R, Chen J, He Z (2011) Top-k most influential locations selection. CIKM, pp 2377鈥?380
    8. Huang J, Zhang R, Buyya R, Chen J (2014) Melody-join: efficient earth mover鈥檚 distance similarity join using mapreduce. ICDE
    9. Kahraman, C, Ruan, D, Doan, I (2003) Fuzzy group decision-making for facility location selection. Inf Sci 157: pp. 135-153 CrossRef
    10. Klose, A, Drexl, A (2005) Facility location models for distribution system design. Eur J Oper Res 162: pp. 4-29 CrossRef
    11. Kolb L, Thor A, Rahm E (2012) Load balancing for mapreduce-based entity resolution. ICDE, pp 618鈥?29
    12. Korn F, Muthukrishnan S (2000) Influence sets based on reverse nearest neighbor queries. SIGMOD, pp 201鈥?12
    13. Kwon Y, Balazinska M, Howe B, Rolia J (2012) Skewtune: mitigating skew in mapreduce applications. SIGMOD, pp 25鈥?6
    14. Lu, W, Shen, Y, Chen, S, Ooi, BC (2012) Efficient processing of k nearest neighbor joins using mapreduce. Proc. VLDB Endow. 5: pp. 1016-1027 CrossRef
    15. Melkote, S, Daskin, MS (2001) Capacitated facility location/network design problems. Eur J Oper Res 129: pp. 481-495 CrossRef
    16. Melo, M, Nickel, S, Saldanha da Gama, F (2006) Dynamic multi-commodity capacitated facility location: a mathematical modeling framework for strategic supply chain planning. Comput Oper Res 33: pp. 181-208 CrossRef
    17. Melo, MT, Nickel, S, Saldanha-Da-Gama, F (2009) Facility location and supply chain management-a review. Eur J Oper Res 196: pp. 401-412 CrossRef
    18. Nutanong, S, Tanin, E, Zhang, R (2010) Incremental evaluation of visible nearest neighbor queries. TKDE 22: pp. 665-681
    19. Nutanong S, Zhang R, Tanin E, Kulik L (2010) Analysis and evaluation of v*- / knn: an efficient algorithm for moving knn queries. VLDB J 19(3):307鈥?32
    20. Qi J, Zhang R, Kulik L, Lin D, Xue Y (2012) The min-dist location selection query. ICDE, pp 366鈥?77
    21. Qiao Y, von Bochmann G (2012) Load balancing in peer-to-peer systems using a diffusive approach. Computing, pp 649鈥?78
    22. Quan, X, Wenyin, L, Dou, W, Xiong, H, Ge, Y (2012) Link graph analysis for business site selection. Computer 45: pp. 64-69 CrossRef
    23. Revelle, CS, Eiselt, HA, Daskin, MS (2008) A bibliography for some fundamental problem categories in discrete location science. Eur J Oper Res 184: pp. 817-848 CrossRef
    24. Sun Y, Huang J, Chen Y, Du X, Zhang R (2012) Top-k most incremental location selection with capacity constraint. WAIM, pp 165鈥?71
    25. Sun Y, Huang J, Chen Y, Zhang R, Du X (2012) Location selection for utility maximization with capacity constraints. CIKM, pp 2154鈥?158
    26. Tao Y, Lin W, Xiao X (2013) Minimal mapreduce algorithms. SIGMOD
    27. Mouratidis, LHUK, Yiu, ML, Mamoulis, N (2010) Optimal matching between spatial datasets under capacity constraints. TODS 35: pp. 9:1-9:44
    28. Wong, RC-W, 脰zsu, MT, Fu, AW-C, Yu, PS, Liu, L, Liu, Y (2011) Maximizing bichromatic reverse nearest neighbor for l p -norm in two- and three-dimensional spaces. VLDB J 20: pp. 893-919 CrossRef
    29. Wong RC-W, Tao Y, Fu AW-C, Xiao X (2007) On efficient spatial matching. VLDB, pp 579鈥?90
    30. Xia T, Zhang D, Kanoulas E, Du Y (2005) On computing top-t most influential spatial sites. VLDB, pp 946鈥?57
    31. Yan D, Wong RC-W, Ng W (2011) Efficient methods for finding influential locations with adaptive grids. CIKM, pp 1475鈥?484
    32. Yu, C, Zhang, R, Huang, Y, Xiong, H (2010) High-dimensional knn joins with incremental updates. GeoInformatica 14: pp. 55-82 CrossRef
    33. Yuan J, Zheng Y, Xie X (2012) Discovering regions of different functions in a city using human mobility and pois. KDD, pp 186鈥?94
    34. Zhan L, Zhang Y, Zhang W, Lin X (2012) Finding top k most influential spatial facilities over uncertain objects. CIKM, pp 922鈥?31
    35. Zhang D, Du Y, Xia T, Tao Y (2006) Progressive computation of the min-dist optimal-location query. VLDB, pp 643鈥?54
    36. Zhang, R, Jagadish, HV, Dai, BT, Ramamohanarao, K (2010) Optimized algorithms for predictive range and knn queries on moving objects. Inf Syst 35: pp. 911-932 CrossRef
    37. Zheng, K, Huang, Z, zhou, A, Zhou, X (2012) Discovering the most influential sites over uncertain data: a rank-based approach. TKDE 24: pp. 2156-2169
    38. Zhou Z, Wu W, Li X, Lee ML, Hsu W (2011) Maxfirst for maxbrknn. ICDE, pp 828鈥?39
  • 刊物类别:Mathematics and Statistics
  • 刊物主题:Mathematics
    Computational Mathematics and Numerical Analysis
  • 出版者:Springer Wien
  • ISSN:1436-5057
文摘
Given a set of facility objects and a set of client objects, where each client is served by her nearest facility and each facility is constrained by a service capacity, we study how to find all the locations on which if a new facility with a given capacity is established, the number of served clients is maximized (in other words, the utility of the facilities is maximized). This problem is intrinsically difficult. An existing algorithm with an exponential complexity is not scalable and cannot handle this problem on large data sets. Therefore, we propose to solve the problem through parallel computing, in particular using MapReduce. We propose an arc-based method to divide the search space into disjoint partitions. For load balancing, we propose a dynamic strategy to assign partitions to reducers so that the estimated load difference is within a threshold. We conduct extensive experiments using both real and synthetic data sets of large sizes. The results demonstrate the efficiency and scalability of the algorithm.

© 2004-2018 中国地质图书馆版权所有 京ICP备05064691号 京公网安备11010802017129号

地址:北京市海淀区学院路29号 邮编:100083

电话:办公室:(+86 10)66554848;文献借阅、咨询服务、科技查新:66554700