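Abstract
Current network computing platforms fall into two groups: special-purpose platforms and immature open platforms. To address their shortcomings, "An Open Network Computing Platform" builds a high-performance open network computing platform from three elements: a Linux virtual server architecture, a mechanism that separates the application system from the core software, and a user-friendly volunteer software design. Clients submit large distributed applications to the server system, which decomposes each application into a large number of independent tasks and assigns them to the many volunteer machines on the Internet for computation.
The server system separates the application system from the core software, so multiple applications can use the platform transparently, which overcomes the limitation of special-purpose platforms. Because the server system is built on the Linux virtual server architecture, it inherits high availability, scalability, and a high performance-to-price ratio. An adaptive fault-tolerant backup system effectively improves the fault tolerance of the system. Two algorithms, one in which idle servers request applications and one in which overloaded servers request resources, achieve good load balance among the servers.
The user-friendly volunteer software design helps attract more volunteers to contribute idle resources. Specifically: the "people-oriented" volunteer interface system lets volunteers control their machines easily and respects their final control over the volunteer machine; the computing process starts at the lowest priority and does not interfere with the volunteer's normal work at all; the volunteer software runs on Windows and Linux, which makes most computers on the Internet potential volunteer machines; and the volunteer software itself contains no application binary code but requests it from the server as needed, so a volunteer machine can transparently run different applications for the server.
A sample run shows that, within three days, the platform used idle volunteer resources to complete a workload equivalent to an ordinary PC running at full speed for 96 days. As more volunteer machines join, the platform can obtain computing power equivalent to 1500 PCs.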
There are currently two kinds of network computing platforms: specialized platforms
and immature open platforms. To address their shortcomings, an open network
computing platform achieves openness and high performance by adopting an improved
Linux virtual server architecture, a mechanism that separates the application system
from the core software, and a user-friendly design of the volunteer software. After
receiving large distributed applications submitted by clients, the server system
divides them into a large number of independent tasks and dispatches them to
volunteers on the Internet for computing.
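The decomposition and dispatch step described above can be pictured with a minimal sketch. It assumes an application can be represented as a list of work items and that tasks are handed out in round-robin order. The source does not state the actual policy, and the names `split_into_tasks` and `dispatch` are hypothetical.

```python
from itertools import cycle


def split_into_tasks(work_items, task_size):
    """Cut a list of work items into independent, fixed-size tasks."""
    if task_size <= 0:
        raise ValueError("task_size must be positive")
    return [work_items[i:i + task_size]
            for i in range(0, len(work_items), task_size)]


def dispatch(tasks, volunteers):
    """Assign tasks to volunteers in round-robin order (an assumed policy)."""
    if not volunteers:
        raise ValueError("at least one volunteer is required")
    assignment = {v: [] for v in volunteers}
    for task, volunteer in zip(tasks, cycle(volunteers)):
        assignment[volunteer].append(task)
    return assignment


# Ten work items, three per task, shared between two volunteers.
tasks = split_into_tasks(list(range(10)), task_size=3)
plan = dispatch(tasks, ["volunteer-a", "volunteer-b"])
```

Because the tasks are independent, any assignment order gives the same final result; round-robin is used here only because it is the simplest policy to show.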
By separating the application system from the core software, the server system
allows multiple applications to run on the network computing platform. Built on the
Linux virtual server architecture, the server system naturally inherits traits such as
high availability, scalability, and a high performance-to-cost ratio. A self-adaptive
fault-tolerant backup system based on the Linux virtual server effectively improves
the fault tolerance of the platform. Two algorithms, one in which idle servers apply
for applications and one in which overloaded servers apply for resources, balance the
load between servers and use computing resources efficiently.
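The abstract names the two load-balancing algorithms without giving their details, so the following is only an illustrative sketch of the idea. The `Server` class, the load metric (queued applications per volunteer), the threshold, and both function names are assumptions made for this example, not the platform's actual design.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    name: str
    applications: list = field(default_factory=list)  # queued applications
    volunteers: int = 0                                # attached volunteers

    def load(self):
        # Hypothetical load metric: queued applications per volunteer.
        return len(self.applications) / max(self.volunteers, 1)


def idle_server_requests_application(idle, servers):
    """An idle server takes one application from the most loaded peer."""
    donors = [s for s in servers if s is not idle and len(s.applications) > 1]
    if idle.applications or not donors:
        return None
    donor = max(donors, key=Server.load)
    app = donor.applications.pop()
    idle.applications.append(app)
    return app


def overloaded_server_requests_resources(busy, servers, threshold=2.0):
    """An overloaded server borrows one volunteer from the least loaded peer."""
    if busy.load() <= threshold:
        return None
    lenders = [s for s in servers if s is not busy and s.volunteers > 1]
    if not lenders:
        return None
    lender = min(lenders, key=Server.load)
    lender.volunteers -= 1
    busy.volunteers += 1
    return lender.name
```

In this sketch both algorithms are initiated by the server that needs something (work or resources), so no central scheduler is required.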
The user-friendly design of the volunteer software helps attract more volunteers to
join the platform. Through the volunteer interface system, volunteers can conveniently
control the volunteer software, which respects their right to control their own
computers. Tasks always start at the lowest priority in the background and do not
affect the volunteer at all. The volunteer software runs on Windows and Linux and can
therefore use most computers on the Internet. The volunteer software obtains the
binary code of an application only when needed, which ensures that volunteers can
transparently run different applications for the servers.
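The on-demand retrieval of application code can be sketched as follows. The sketch assumes a hypothetical `fetch_code` callable standing in for the request to the server, and it assumes the volunteer keeps a local cache of code it has already received. The source says only that code is requested when needed, so the caching is an addition made here for illustration.

```python
class VolunteerClient:
    """Volunteer software that ships with no application code of its own."""

    def __init__(self, fetch_code):
        # fetch_code(app_id) -> bytes stands in for a request to the server.
        self._fetch_code = fetch_code
        self._cache = {}

    def code_for(self, app_id):
        # Ask the server only the first time an application is seen.
        if app_id not in self._cache:
            self._cache[app_id] = self._fetch_code(app_id)
        return self._cache[app_id]


requests_made = []


def fake_server(app_id):
    # Test double: records each request and returns placeholder bytes.
    requests_made.append(app_id)
    return f"binary-for-{app_id}".encode()


client = VolunteerClient(fake_server)
client.code_for("app-1")
client.code_for("app-1")   # served from the local cache
client.code_for("app-2")
```

Keeping the client free of application code is what lets one installed volunteer program serve any application the server later hosts.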
The test shows that the work completed by the platform in three days equals that of
a personal computer running at full speed for 96 days. As more volunteers join, the
platform could reach computing power equal to that of 1500 personal computers.
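The first figure can be read as an average number of PC equivalents over the test period. This is a derived reading, not a number stated in the source, and the symbol $N_{\text{eq}}$ is introduced here only for illustration:

```latex
\[
N_{\text{eq}} = \frac{96\ \text{PC-days}}{3\ \text{days}} = 32\ \text{PCs}
\]
```

On this reading, the test run delivered on average the throughput of about 32 PCs running at full speed, and the 1500-PC figure is the projected capacity once more volunteers join.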