Research on Fast Information Encryption and Content Security Management Technologies in the E-Commerce Environment
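Abstract
With e-commerce applications developing rapidly, research on security problems in the e-commerce environment has practical economic and social significance.
     This thesis surveys and analyzes security threats in the e-commerce environment at the computer-system level, the e-commerce application level, and the content-security level, and focuses on the security problems that arise in real deployments. These include the encryption and decryption speed problem created by the real-time requirements of information exchange in the B2B model and, because e-commerce is a virtual community for commercial transactions, content-security problems such as analysis of user consumption behavior (of concern to enterprises), control of malicious user comments (of concern to third-party operators), and anti-money laundering (of concern to government). The research focus is thereby fixed on two topics: fast implementation of encryption and decryption in the e-commerce environment, and algorithms for mining and analyzing feedback from web users.
     The main work of this thesis is as follows:
     1. A fast implementation method for finite-field multiplication is proposed. B2B e-commerce requires both strong real-time performance and high security for information exchanged between users. A hybrid scheme that combines the symmetric cipher AES with the asymmetric cipher ECC, in which ECC encrypts the session key, is commonly used, but practice shows that when real-time requirements are high, the speed of ECC modular multiplication remains an important factor limiting efficiency. To speed up finite-field multiplication in elliptic curve cryptosystems, this thesis proposes an improved base-field multiplication algorithm built on the type II optimal normal basis. One base-field multiplication requires only 2m+1 cyclic shifts, 1.5m vector XOR operations, and m+1 vector AND operations. Software simulation and an FPGA implementation show that the algorithm significantly improves modular multiplication efficiency compared with existing algorithms. The algorithm has been used successfully in a production B2B e-commerce website;
     2. A random-walk-based text sentiment classification method (SCG) is proposed. Text sentiment classification is valuable for managing feedback from web users in the e-commerce environment, particularly malicious opinion. The thesis proposes an algorithm that automatically labels the sentiment orientation of words. The algorithm first builds a word dependency graph from a training corpus, such as product reviews; each node corresponds to a word, and an edge between two nodes means that the two words occur in the same sentence. A random walk algorithm then computes the sentiment orientation values of all words in the graph in a single pass. Finally, these word-level values are used to compute the sentiment class of real text collections. Experiments on real data sets show that SCG outperforms the traditional SVM and SO-PMI algorithms;
     3. A new algorithm for clustering mixed data (E-ROCK) is proposed. Customer retention and personalized product recommendation in the B2C model are usually addressed by clustering user behavior data. Existing clustering research, however, concentrates on either numeric data or categorical data and cannot accurately handle mixed data that includes user IDs, access times, URLs of visited pages, transaction records, product types, and purchase quantities. A survey and comparison of existing clustering algorithms shows that ROCK, although limited to categorical data, has advantages in efficiency and adaptability. The thesis extends ROCK into a new algorithm, E-ROCK, that handles numeric and categorical data together, and experiments show that it clusters real user data well. Finally, the thesis describes the architecture of a user feedback analysis system and a product recommendation system, based on SCG and E-ROCK respectively, as deployed in a real B2C e-commerce platform.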
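For the first contribution, the abstract gives only the operation counts of the multiplier, not the algorithm itself. The sketch below therefore does not reproduce the proposed multiplier. It checks whether a field GF(2^m) admits a type II optimal normal basis at all, using the standard existence criterion (2m+1 is prime, and either 2 is a primitive root modulo 2m+1, or 2m+1 ≡ 3 (mod 4) and 2 has order m), and evaluates the operation counts quoted in the abstract. The function names are illustrative assumptions.

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test; adequate for field degrees used in ECC."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def order_of_two(p: int) -> int:
    """Multiplicative order of 2 modulo an odd prime p."""
    k, x = 1, 2 % p
    while x != 1:
        x = (x * 2) % p
        k += 1
    return k


def has_type2_onb(m: int) -> bool:
    """True if GF(2^m) has a type II optimal normal basis (standard criterion)."""
    p = 2 * m + 1
    if not is_prime(p):
        return False
    k = order_of_two(p)
    # Either 2 is a primitive root mod p, or p = 3 (mod 4) and 2 has order m.
    return k == 2 * m or (p % 4 == 3 and k == m)


def op_counts(m: int) -> dict:
    """Per-multiplication operation counts as stated in the abstract."""
    return {
        "cyclic_shifts": 2 * m + 1,
        "vector_xors": 1.5 * m,
        "vector_ands": m + 1,
    }


if __name__ == "__main__":
    for m in (163, 233):
        print(m, has_type2_onb(m), op_counts(m))
```

Under this criterion m = 233 qualifies, while m = 163 does not because 2·163+1 = 327 is not prime.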
With the rapid development of electronic commerce, research on security technology in the E-Commerce environment is both necessary and significant.
     Security threats in the E-Commerce environment are reviewed at the computer system level, the E-Commerce application level, and the content security level. The security issues that arise in real E-Commerce applications are analyzed in detail. They include the fast information encryption issue raised by the high real-time requirements of the B2B E-Commerce model; the analysis of customer behavior, which concerns enterprises using the B2C E-Commerce model; and other content security issues, such as the management of malicious comments from web users, which concerns third-party operators of E-Commerce platforms, and anti-money laundering, which concerns government. Accordingly, fast information encryption and content security management in the E-Commerce environment are identified as the focus of this study.
     The contributions of the paper are:
     (1) A fast implementation method for modular multiplication in finite fields is proposed. Information exchange in the B2B E-Commerce model requires both high real-time performance and high security, so a hybrid encryption scheme combining AES and ECC is usually adopted, with ECC used to protect the session key. When real-time performance is especially important, practice shows that the efficiency of ECC processing is the key limiting factor, and improving it requires faster modular multiplication in finite fields. An improved algorithm based on the type II optimal normal basis (ONB) is proposed in this paper. The proposed multiplier requires only (2m+1) cyclic shift operations, 1.5m vector XOR operations, and (m+1) vector AND operations. Results from software simulation and a hardware (FPGA) implementation show that the proposed method substantially improves modular multiplication efficiency compared with existing methods. The proposed method has been used successfully in a real B2B E-Commerce application;
     (2) A random walk method for sentiment classification (SCG) is presented. Sentiment classification is very useful for managing feedback from web users in the E-Commerce environment, especially malicious comments. In this paper, a novel method for tagging word sentiment automatically is proposed. In this method, a word association graph is first constructed from a text corpus, e.g. product reviews, in which each node is a word and an edge between two words means that they co-occur in the same sentence. A random walk algorithm then calculates the sentiment scores of all the words in the graph at once. To show the effectiveness of the method, the sentiment tagging results are used for sentiment classification on a real data set. The experimental results show that SCG gives better sentiment classification results than the compared methods, such as SVM and SO-PMI;
     (3) A clustering method for mixed data (E-ROCK) is presented for discovering customer behavior patterns. For customer retention and product recommendation in the B2C E-Commerce model, clustering is a reliable and efficient technique for discovering customer behavior patterns and improving the personalization of E-Commerce systems. However, current research on clustering algorithms is usually based on either numeric data or categorical data and is not suitable for mixed data sets that include both, such as user IDs, access times, URLs of visited pages, transaction records, commodity types, and consumption quantities. Based on an analysis of current mainstream clustering methods, ROCK is chosen as the prototype algorithm for this research. ROCK is suitable only for categorical data, whereas analyzing customer behavior requires handling mixed data sets. By extending the ROCK algorithm, a novel method (E-ROCK) for mixed data sets is proposed in this paper. Experiments with real application data show that the E-ROCK algorithm clusters such data well. Finally, the framework of an existing B2C E-Commerce platform is introduced. The platform includes two back-end subsystems, a feedback information analysis system and a product recommendation system, in which the SCG and E-ROCK algorithms are applied respectively.
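The graph construction in contribution (2) is described precisely enough to sketch: words are nodes, and an edge links two words that co-occur in a sentence. The abstract does not specify the random walk itself, so the propagation rule below (a walk with restart toward a few hand-labeled seed words) is an assumption for illustration, not the SCG algorithm. The names `build_graph`, `random_walk_scores`, `classify`, and the `restart` parameter are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_graph(sentences):
    """Word co-occurrence graph: edge weight = number of shared sentences."""
    graph = defaultdict(lambda: defaultdict(float))
    for sentence in sentences:
        words = sorted(set(sentence.lower().split()))
        for a, b in combinations(words, 2):
            graph[a][b] += 1.0
            graph[b][a] += 1.0
    return graph


def random_walk_scores(graph, seeds, restart=0.15, iters=50):
    """Propagate seed polarities (+1 / -1) over the graph.

    Assumed update rule: each word's score is a mix of its own seed value
    (weight `restart`) and the weighted mean score of its neighbors.
    """
    scores = {w: seeds.get(w, 0.0) for w in graph}
    for _ in range(iters):
        updated = {}
        for word, nbrs in graph.items():
            total = sum(nbrs.values())
            walk = sum(wt * scores[n] for n, wt in nbrs.items()) / total
            updated[word] = restart * seeds.get(word, 0.0) + (1 - restart) * walk
        scores = updated
    return scores


def classify(text, scores):
    """Label a text by the sign of the summed word scores."""
    total = sum(scores.get(w, 0.0) for w in text.lower().split())
    return "positive" if total >= 0 else "negative"


if __name__ == "__main__":
    reviews = ["great phone", "great battery", "terrible screen", "terrible speaker"]
    graph = build_graph(reviews)
    scores = random_walk_scores(graph, {"great": 1.0, "terrible": -1.0})
    print(classify("great battery", scores), classify("terrible screen", scores))
```

All word scores come out of one propagation over the whole graph, which matches the "at one time" property the abstract emphasizes; the seed list and restart weight would need tuning on real review data.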
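For contribution (3), ROCK groups points by links: two points are neighbors when their similarity reaches a threshold θ, and the link count of a pair is the number of neighbors they share. The abstract does not say how E-ROCK folds numeric attributes into this scheme, so the weighted blend of a range-normalized numeric similarity with Jaccard similarity below is purely an assumed stand-in, and `mixed_similarity`, `link_counts`, and `w_num` are hypothetical names. The sketch also excludes a point from its own neighbor set, which is a simplification.

```python
def mixed_similarity(a, b, num_ranges, w_num=0.5):
    """Similarity in [0, 1] between two records.

    Each record is (numeric_tuple, categorical_frozenset).
    num_ranges holds the (nonzero) value range of each numeric attribute.
    """
    num_a, cat_a = a
    num_b, cat_b = b
    # Numeric part: 1 minus the mean range-normalized absolute difference.
    diffs = [abs(x - y) / r for x, y, r in zip(num_a, num_b, num_ranges)]
    num_sim = 1.0 - sum(diffs) / len(diffs)
    # Categorical part: Jaccard coefficient, as in the original ROCK.
    union = cat_a | cat_b
    cat_sim = len(cat_a & cat_b) / len(union) if union else 1.0
    return w_num * num_sim + (1 - w_num) * cat_sim


def link_counts(points, theta, num_ranges):
    """Return (neighbors, links) where links[(i, j)] = shared neighbor count."""
    n = len(points)
    neighbors = [
        {
            j
            for j in range(n)
            if j != i and mixed_similarity(points[i], points[j], num_ranges) >= theta
        }
        for i in range(n)
    ]
    links = {
        (i, j): len(neighbors[i] & neighbors[j])
        for i in range(n)
        for j in range(i + 1, n)
    }
    return neighbors, links


if __name__ == "__main__":
    # (purchase quantity,), {commodity type, visited page}
    records = [
        ((10.0,), frozenset({"books", "/home"})),
        ((12.0,), frozenset({"books", "/home"})),
        ((11.0,), frozenset({"books", "/cart"})),
        ((90.0,), frozenset({"toys", "/sale"})),
    ]
    neighbors, links = link_counts(records, theta=0.6, num_ranges=(100.0,))
    print(neighbors, links)
```

A full ROCK-style clusterer would then merge the pair of clusters with the highest normalized link count until the target number of clusters is reached; that merge loop is omitted here.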
