摘要
Automatic diagnosis is one of the most important parts in the expert system of traditional Chinese medicine (TCM), and in recent years, it has been studied widely. Most of the previous researches are based on well-structured datasets which are manually collected, structured and normalized by TCM experts. However, the obtained results of the former work could not be directly and effectively applied to clinical practice, because the raw free-text clinical records differ a lot from the well-structured datasets. They are unstructured and are denoted by TCM doctors without the support of authoritative editorial board in their routine diagnostic work. Therefore, in this paper, a novel framework of automatic diagnosis of TCM utilizing raw free-text clinical records for clinical practice is proposed and investigated for the first time. A series of appropriate methods are attempted to tackle several challenges in the framework, and the Na茂ve Bayes classifier and the Support Vector Machine classifier are employed for TCM automatic diagnosis. The framework is analyzed carefully. Its feasibility is validated through evaluating the performance of each module of the framework and its effectiveness is demonstrated based on the precision, recall and F-Measure of automatic diagnosis results.