The rapid development of web technology has greatly enriched the information resources available online. However, these resources have inherent shortcomings, such as disorder and a large share of junk content, which make it difficult for users to find the information they need. The Web Information Acquisition Service (WIAS) aims to provide users with Web information products and services that meet their personal information needs through modern information technology; pull and push are its two main strategies. Adaptive techniques for WIAS dynamically adjust service behavior to users' information needs, the characteristics of information sources, system load, and other factors, so that high-quality information is delivered efficiently and in a user-friendly way.
     An accurate and complete understanding of users' information needs is the foundation of WIAS. Web users are both consumers and producers of Web information, so their needs can be inferred by analyzing the content they browse, their browsing behavior, and the information they publish. Once the information needs are known, the keys to a successful WIAS are retrieving relevant results from the vast body of Web resources and presenting them in a user-friendly style. In addition, because users usually expect information to be timely, ensuring the performance of WIAS is also a vital part of research on information acquisition.
     To address these issues, we first propose an adaptive information pull technique based on measuring the ambiguity of user requests. The presentation style of the pulled results is chosen adaptively according to the quantified ambiguity of each request. For result filtering and result presentation, we propose a ranking algorithm and a clustering algorithm, respectively, each combining multiple features. Both algorithms are validated on two representative kinds of emerging Internet resources: multimedia resources (images, in this paper) and frequently updated dynamic resources (blogs, in this paper).
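     The abstract does not say how request ambiguity is quantified, so the following is only an illustrative sketch under stated assumptions: ambiguity is taken to be the normalized Shannon entropy of topic labels among the top results, and a hypothetical threshold of 0.6 switches between a ranked list and a clustered view. The function names, the labels, and the threshold are all assumptions, not the method proposed in this paper.

```python
import math
from collections import Counter


def ambiguity(topic_labels):
    """Normalized Shannon entropy of the topic labels, in [0, 1].

    Assumption: each top result already carries one topic label.
    0.0 means every result shares one topic; 1.0 means the observed
    topics are evenly spread.
    """
    counts = Counter(topic_labels)
    total = sum(counts.values())
    if total == 0 or len(counts) < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return entropy / math.log2(len(counts))


def choose_presentation(topic_labels, threshold=0.6):
    """Pick a presentation style; the threshold is a hypothetical tuning knob."""
    return "clustered" if ambiguity(topic_labels) > threshold else "ranked"
```

     Under these assumptions, a request whose top results split evenly between two topics would be shown as clusters, while a request dominated by a single topic would be shown as a plain ranked list.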
     Secondly, we propose an adaptive information push technique based on user modeling for both information publishers and information browsers. For publishers, blogs, a popular personal publishing platform, serve as the research environment: we propose a modeling approach that uses blog posts, on the basis of which communities of bloggers with similar preferences are identified in the blogspace and recommended as friends. For browsers, the content currently being browsed is treated as evidence of the user's profile, and we propose a contextual advertising method based on sentiment and topic analysis. This ensures that the promoted advertisements are not only topically relevant but also consistent with the user's underlying attitudes, which makes them better targeted.
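     As a rough illustration of combining topic relevance with sentiment, the sketch below scores advertisements by cosine similarity between bag-of-words vectors and skips any advertisement whose topic the page treats negatively. The data layout (`topic_polarity`, the `(ad_id, topic, keywords)` tuples) and the filtering rule are assumptions made for this example; the abstract does not specify the actual method.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-count mappings."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def pick_ads(page_tokens, topic_polarity, ads, k=1):
    """Return up to k ad ids that are topically relevant and not
    attached to a topic the page is negative about.

    page_tokens: list of tokens from the page being browsed.
    topic_polarity: hypothetical dict mapping topic to -1, 0, or +1.
    ads: list of (ad_id, topic, keywords) tuples (assumed layout).
    """
    page_vec = Counter(page_tokens)
    scored = []
    for ad_id, topic, keywords in ads:
        if topic_polarity.get(topic, 0) < 0:
            continue  # page expresses a negative attitude toward this topic
        score = cosine(page_vec, Counter(keywords))
        if score > 0:
            scored.append((score, ad_id))
    scored.sort(reverse=True)
    return [ad_id for _, ad_id in scored[:k]]
```

     A production system would presumably derive the topic and polarity signals from the sentiment and topic analysis described above rather than take them as inputs.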
     We then propose Loc-Glob, a hybrid strategy for organizing the distributed index of a search engine (a typical information pull application). Loc-Glob offers both high performance and scalability. We further propose several optimizations on top of it. To smooth the workload across index servers, the index is redistributed and replicated based on an analysis of the workload of index terms and of user query streams. The query path across index servers is also optimized according to real-time workload to improve load balancing.
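     The load-aware query path can be pictured with a minimal sketch. It is not the Loc-Glob design itself, whose details the abstract does not give; it only assumes that each term's posting list is replicated on several index servers and that a current load counter is available per server. The names `replicas` and `load` are hypothetical.

```python
def route_query(terms, replicas, load):
    """Assign each query term to the least-loaded server holding its postings.

    terms: list of query terms.
    replicas: hypothetical dict mapping term to the servers that hold a copy.
    load: hypothetical dict mapping server to its current load; it is
          updated in place so later terms see the added work.
    """
    plan = {}
    for term in terms:
        servers = replicas.get(term, [])
        if not servers:
            continue  # term is not indexed on any server
        best = min(servers, key=lambda s: load[s])
        plan[term] = best
        load[best] += 1
    return plan
```

     Replicating the posting lists of frequently queried terms gives this kind of router more choices exactly where the load concentrates, which is the intuition behind redistributing and replicating the index by term workload.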
     Based on the above work, we design and implement a prototype blog information acquisition system that adopts these adaptive techniques. The system provides novel applications such as a blog search engine, blog friend recommendation, and advertisement promotion, which validate the feasibility of the adaptive techniques proposed in this paper for both types of information acquisition service.
     Finally, conclusions and future work are presented.
