Abstract
Bayesian models are widely used in text classification. Bayesian filtering models are simple in principle and computationally efficient, and they classify text accurately, yet they also introduce a certain amount of deviation. Using a Bayesian model [1], this paper analyzes how Bayesian filtering rules vary as the test sample set changes, in order to provide a theoretical basis for the design of a filtering scheme.
References
[1] Kushmerick, N. Learning to Remove Internet Advertisements. Proceedings of the 3rd International Conference on Autonomous Agents, pp. 175-181, Seattle, Washington, 1999
[2] Cohen, W. W. Learning Rules that Classify E-Mail. Proceedings of the AAAI Spring Symposium on Machine Learning in Information Access, Stanford, California, 1996
[3] Hall, R. J. How to Avoid Unwanted Email. Communications of the ACM, 1998, 41(3): 88-95
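[4] Wang Ning, Zhang Jianzhong. A Chinese e-mail classification algorithm based on an improved Bayesian model [J]. Computer Engineering and Applications, 2006, 31(8): 75-78 (in Chinese)
[5] Duan Hongbin, Zhang Jian. Application of improved Naive Bayes techniques in anti-spam systems [J]. Journal of Northwest University: Natural Science Edition, 2006, 36(5): 737-740 (in Chinese)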
[6] Graham, P. Better Bayesian Filtering [OL]. http://paulgraham.com/better.html, 2005
[7] Duda, R. O. Bayesian Decision Theory, Pattern Classification [M]. Beijing: China Machine Press, 2003: 80-95 (in Chinese)