The rapid development of the World Wide Web has brought a dramatic explosion of information. As the amount of information grows day by day, managing it effectively has become essential. Consequently, the use of intelligent computational algorithms to discover new and useful information and knowledge has attracted wide attention, especially in information retrieval and data mining, and has become an active research topic.
     This thesis concentrates on the issues that arise in constructing a personal knowledge network, covering the discovery, searching, ranking and recommendation of knowledge in a specific domain. Knowledge derives mainly from textual, audio and video data; we focus on the textual form. The research issues we address can be regarded as text mining, a branch of information retrieval that includes many interesting and challenging problems and applications. Generally, text mining means discovering useful patterns, structures and other valuable information in unstructured natural language text. For knowledge discovery, we are chiefly concerned with the topic discovery process. Since the emergence of latent semantic analysis, topic analysis has become a popular research area among scholars in computer science and statistics. The simple idea behind topic analysis is to work with collections of topics instead of collections of knowledge units. Each topic is a probability distribution over terms, so a large, high-dimensional collection of terms can be transformed into a lower-dimensional, representative collection of topics. A dynamic topic correlation model is proposed to analyze the topics of knowledge units over time. The model is inspired by the hierarchical Gaussian process latent variable model. It maps the high-dimensional observed space of terms to a lower-dimensional latent space of topics. The underlying assumption is that terms are not exchangeable, and all variables evolve dynamically across time points. This nonparametric model shows a faster convergence rate than other models. The posterior inference of topics and their correlations in the dynamic topic correlation model helps to discover how term frequencies within a given topic change over time and to predict trends in topics and the relations among them, which lowers the dimension of the topic space and improves the classification performance on knowledge units.
     In a personal knowledge network, users can build their own information bases, build relationships with each other and store their personal preferences in individual profiles. Users interact with each other collaboratively in the knowledge network. Basic personal information, behaviour preferences and information about correlated similar users are represented as concepts. The relations among concepts are described by ontologies in order to improve the semantic relations among users. The ontologies are further divided into personal ontologies and knowledge domain ontologies. For knowledge domain ontologies, we mainly refer to existing ontology bases. Personal ontologies are constructed and annotated mainly by knowledge experts and knowledge workers. As a result, rich semantic information about users either already exists or can be derived. This information can be shared collaboratively to improve users’ online experience. We also need a context-aware data management mechanism to support user-centric data analysis. We identify the goals and challenges in the collaborative knowledge network and propose knowledge-based collaborative search, which assigns scores to knowledge units. The scoring method is derived from the collaborative knowledge network. In addition, we describe the top-k processing algorithm and consider how to balance query time against space usage. Our approach applies tags to knowledge units to further improve the efficiency of searching and ranking.
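As a rough illustration of the trade-off between query time and space in top-k processing, the following minimal sketch keeps only k candidates in memory while scanning the knowledge units once. The data layout, the function name `top_k` and the scoring rule (a base score plus one point per tag shared with the query) are assumptions made for this example only; they are not the scoring method of the thesis.

```python
import heapq

def top_k(units, query_tags, k):
    """Return the k highest-scoring (score, unit_id) pairs, best first.

    units: dict mapping unit_id -> (base_score, set_of_tags)  (hypothetical layout)
    query_tags: set of tags attached to the query
    """
    if k <= 0:
        return []
    heap = []  # min-heap holding at most k entries, so extra memory stays O(k)
    for unit_id, (base_score, tags) in units.items():
        # Assumed scoring rule: base score plus one point per matching tag.
        score = base_score + len(tags & query_tags)
        if len(heap) < k:
            heapq.heappush(heap, (score, unit_id))
        elif (score, unit_id) > heap[0]:
            # The new candidate beats the weakest kept entry; swap it in.
            heapq.heapreplace(heap, (score, unit_id))
    return sorted(heap, reverse=True)

units = {
    "u1": (0.9, {"ontology"}),
    "u2": (0.4, {"markov", "mining"}),
    "u3": (0.7, {"mining"}),
    "u4": (0.2, set()),
}
result = top_k(units, {"mining", "markov"}, 2)
ranked_ids = [unit_id for _, unit_id in result]
```

A bounded heap costs O(n log k) time per query and O(k) extra space; sorting all scored units would be simpler but needs O(n) extra space.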
     For the recommendation of knowledge units, we propose a semantic recommendation method that integrates domain ontology and usage mining. It greatly increases the efficiency of the search process and saves time in finding knowledge units in the knowledge network. We model users' latent interests by mining their daily usage logs and calculating the proportion of the knowledge unit collection taken up by the knowledge units they are interested in, and we recommend the next target knowledge units in order to save users' time. Semantic recommendation uses a semantic distance that incorporates semantic information to enrich the usage log. At the same time, the semantic distance matrix works together with the transition probability matrix of a Markov model. Semantic sequential pattern mining is combined with the Markov model in the recommendation process. Finally, we use the vector space model to construct the knowledge unit probability and correlation matrix, which is combined with user-provided tags to produce the top-n recommendation.
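To make the combination of a Markov transition matrix with semantic information more concrete, here is a minimal sketch under stated assumptions. It estimates first-order transition probabilities from usage sessions and blends them with a semantic similarity score (for example, one minus a normalized semantic distance). The function names, the weight `alpha` and the linear blending rule are hypothetical choices for illustration, not the method defined in the thesis.

```python
from collections import defaultdict

def transition_probs(sessions):
    """Estimate first-order Markov transition probabilities from usage sessions."""
    counts = defaultdict(lambda: defaultdict(int))
    for session in sessions:
        for cur, nxt in zip(session, session[1:]):
            counts[cur][nxt] += 1
    probs = {}
    for cur, row in counts.items():
        total = sum(row.values())
        probs[cur] = {nxt: c / total for nxt, c in row.items()}
    return probs

def recommend(current, probs, similarity, alpha=0.5, n=2):
    """Rank next units by an assumed blend:
    alpha * transition probability + (1 - alpha) * semantic similarity."""
    candidates = set(probs.get(current, {})) | set(similarity.get(current, {}))
    candidates.discard(current)
    scored = []
    for unit in candidates:
        p = probs.get(current, {}).get(unit, 0.0)
        s = similarity.get(current, {}).get(unit, 0.0)
        scored.append((alpha * p + (1 - alpha) * s, unit))
    # Highest blended score first; ties broken by unit id for determinism.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [unit for _, unit in scored[:n]]

# Toy usage log: each session is the ordered list of units a user visited.
sessions = [["a", "b", "c"], ["a", "b", "d"], ["a", "c"]]
# Hypothetical semantic similarities between unit "a" and the other units.
similarity = {"a": {"b": 0.2, "c": 0.9, "d": 0.1}}
probs = transition_probs(sessions)
top = recommend("a", probs, similarity)
```

In this toy log, unit "b" follows "a" most often, but "c" is semantically closer to "a", so with `alpha=0.5` the blended ranking places "c" ahead of "b". This is the kind of effect that enriching usage data with semantic information is meant to produce.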
