A better approach for web page classification using Multi Conditional Markov Random Walks and graph partitioning algorithm.
Details
  • Author: Betheli, Avinash Reddy
  • Degree: Master
  • Year: 2011
  • Advisor: Hao, Wei-Da
  • Institution: Texas A&M University
  • ISBN:9781267241931
  • CBH:1507876
  • Country:USA
  • Language: English
  • FileSize:2766280
  • Pages:63
Abstract
A search engine expresses the importance of a web page or document as a rank. Important and relevant pages should receive higher ranks for a given query. Different webpage classification techniques have their own web crawling preferences, and user preferences for high-quality, diverse results are usually given the least importance. Many search engines use hyperlink-based relevancy criteria and link analysis algorithms such as Google's PageRank, HITS, and others. Because of the search algorithms these engines use, pages of little importance to the user are often listed for the queried keyword. Based on a thorough literature review, we propose a Multi Conditional Markov Random Walk Model, which adds a weighting factor to the existing Markov model when ranking web pages and improves both the overall quality and the diversity of the results. A random walk is applied to the results to study the user's random-walk behavior and the probability of visiting each page. The experimental evaluation is carried out with a text-based search engine, to which we apply our algorithm and compare the result set with that of Google's PageRank.
Keywords: Webpage classification, Webpage ranking, PageRank, random surfer, Markov Random Walk.
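
The abstract does not define the weighting factor or the Multi Conditional model itself, so the following is only a minimal sketch of the baseline it builds on: the standard random-surfer (PageRank-style) walk, computed by power iteration. The function name `random_walk_rank`, the `weights` argument, and the toy link graph are assumptions made for illustration; the per-page weights simply bias the teleport step and are a stand-in, not the thesis's method.

```python
def random_walk_rank(links, weights=None, damping=0.85, tol=1e-12, max_iter=1000):
    """Stationary visit probabilities of a random surfer on a link graph.

    links   : dict mapping each page to the list of pages it links to
              (every link target must also appear as a key).
    weights : optional dict of non-negative per-page weights that bias the
              teleport step (hypothetical stand-in for a weighting factor).
    """
    pages = sorted(links)
    n = len(pages)
    if weights is None:
        weights = {p: 1.0 for p in pages}
    total = sum(weights[p] for p in pages)
    teleport = {p: weights[p] / total for p in pages}

    rank = {p: 1.0 / n for p in pages}  # start from the uniform distribution
    for _ in range(max_iter):
        # Probability mass on pages with no out-links is redistributed
        # according to the teleport distribution.
        dangling = sum(rank[p] for p in pages if not links[p])
        new_rank = {
            p: (1.0 - damping) * teleport[p] + damping * dangling * teleport[p]
            for p in pages
        }
        # Each page passes its rank evenly along its out-links.
        for p in pages:
            out = links[p]
            if out:
                share = damping * rank[p] / len(out)
                for q in out:
                    new_rank[q] += share
        delta = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Toy graph (illustrative only): page "d" has no incoming links.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
base = random_walk_rank(links)
boosted = random_walk_rank(links, weights={"a": 1, "b": 1, "c": 1, "d": 5})
for page in sorted(base, key=base.get, reverse=True):
    print(page, round(base[page], 4), round(boosted[page], 4))
```

With uniform weights this reduces to ordinary PageRank with damping 0.85; raising a page's weight increases the probability that the surfer jumps to it, which is one simple way a weighting factor could shift the ranking.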
