Adaptive learning support systems have been a focus of research on artificial intelligence in education in recent years, a field at the intersection of education, cognitive science, and computer science. The theory and techniques of Multi-agent systems offer a novel way to analyze, design, and implement distributed open systems, so they have also been applied to learning support systems. With the rapid development of computer and network technology, people rely more and more on networks for communication, which changes the way they learn. As a result, learning support systems must change as well, and this places higher demands on the techniques used to implement them.
     In this paper, starting from an analysis of the change in learning style, we study Multi-agent reinforcement learning systems, including reinforcement learning algorithms for a single Agent and for Multi-agent systems. The paper also focuses on critical application technologies such as user profiles and personalized learning environments. The paper completes the following tasks:
     (1) Based on a literature review, we summarize how computer-aided instruction has become increasingly intelligent, analyze the strengths and weaknesses of intelligent tutoring systems, and argue that the adaptive learning support system is the current trend in e-learning platforms.
     (2) A biased reinforcement learning algorithm is presented, based on a detailed analysis of the TD algorithm and the Q-learning algorithm. Bias information derived from prior knowledge is incorporated into the action selection strategy of reinforcement learning to speed up the learning process. Errors in the prior knowledge are corrected during learning, and the learning speed is increased.
     (3) A Semi-Markov Game (SMG) model is presented that effectively expresses the hierarchical learning tasks of a Multi-Agent system as well as the temporal and sequential characteristics of joint actions. This model can be used for Multi-Agent hierarchical reinforcement learning (MAHRL) on continuous state spaces. The paper then gives a collaborative framework for MAHRL based on the SMG model. This framework describes the collaborative and non-collaborative tasks among agents separately and elaborates the workflow of the MAHRL system. Finally, the paper gives an HRL algorithm based on Pareto optimal solutions, which is the kernel of the collaborative MAHRL framework. Experiments verify the validity and superiority of the model, framework, and algorithm.
     (4) An algorithm based on the Sixteen Personality Factor Questionnaire is presented to obtain the key personality values of a trainee.
     (5) A personalized rendering algorithm for terrorist scenes is also presented. Combined with the reinforcement learning algorithm presented in this paper, it can be used to adjust the difficulty level of the exercises.
     Finally, a prototype system for mine accident rescue training is implemented. The system obtains the user's personality data, retrieves the matching knowledge according to the related rules, and then provides a personalized learning environment to users.
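
     The idea behind task (2), biasing action selection with prior knowledge while letting learning correct errors in that knowledge, can be illustrated with a minimal sketch. It is not the algorithm of this paper: it is plain tabular Q-learning in which a decaying bias term is added to the Q-values only when an action is chosen. The chain environment, the bias table, the function name biased_q_learning, and all parameter values are assumptions made for illustration. The hint for state 2 is deliberately wrong, to show how an erroneous prior can be overridden as the bias weight decays.

```python
import random


def biased_q_learning(n_states=6, episodes=500, alpha=0.5, gamma=0.9,
                      epsilon=0.1, bias_weight=1.0, bias_decay=0.99, seed=0):
    """Tabular Q-learning on a hypothetical 1-D chain.

    The agent starts in state 0; reaching the last state pays reward 1.
    Action 0 moves left, action 1 moves right.
    """
    rng = random.Random(seed)
    actions = (0, 1)
    q = [[0.0, 0.0] for _ in range(n_states)]

    # Assumed prior knowledge: "move right" is good everywhere,
    # except for a deliberately wrong hint in state 2.
    bias = [[0.0, 1.0] for _ in range(n_states)]
    bias[2] = [1.0, 0.0]

    for _ in range(episodes):
        s = 0
        for _ in range(100):  # step cap per episode
            if rng.random() < epsilon:
                a = rng.choice(actions)
            else:
                # The bias affects action selection only, never the Q update.
                a = max(actions,
                        key=lambda x: q[s][x] + bias_weight * bias[s][x])
            s2 = max(0, s - 1) if a == 0 else s + 1
            done = s2 == n_states - 1
            r = 1.0 if done else 0.0
            target = r if done else r + gamma * max(q[s2])
            q[s][a] += alpha * (target - q[s][a])  # standard Q-learning update
            s = s2
            if done:
                break
        # Trust the prior less as experience accumulates.
        bias_weight *= bias_decay
    return q
```

     Because the bias enters only the selection rule, the learned Q-values remain ordinary Q-learning estimates, so a wrong hint slows learning in the affected state but does not distort the final value table.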
