【Objective】Most previous studies of repeated-measures qualitative data on ischemic stroke have focused on cross-sectional analysis. Longitudinal analysis of this kind of data is lacking, as are systematic and comprehensive statistical methods. By studying repeated-measures qualitative data on ischemic stroke, this paper explores how the syndromes of ischemic stroke vary over time and proposes new strategies for data analysis, which can help reveal the mechanism of ischemic stroke and guide clinical intervention. It also aims to summarize analytical methods that can be used to study the variation patterns of repeated-measures qualitative data on other diseases. In addition, it selects and optimizes the items of the stroke syndrome factor evaluation scale. The overall purpose of the study is to provide a statistical basis and support for clinical research and practice.
     【Content】This study performs a large-scale statistical analysis in two areas: the patterns of syndrome variation and item selection for the evaluation scale. Specifically, it explores the variation patterns of six syndromes of ischemic stroke (wind, fire, phlegm, blood stasis, qi deficiency and yin deficiency) and seeks influential factors based on continuous, dynamic data. Patients are classified by their first syndromes and by the six syndromes at each time point, respectively, in order to explore the syndrome variation patterns and influential factors of ischemic stroke in different classes, to help clinicians find the best time for medical intervention, and to explore analytical methods for the variation patterns of repeated-measures qualitative data. In addition, this paper conducts item selection for the stroke syndrome factor evaluation scale, which contains 97 symptoms such as dizziness, upset and fever, and discusses the application of item response theory to item selection for the scale.
     Addressing the shortcomings of existing research on the variation patterns of repeated-measures qualitative data on ischemic stroke, this study analyzes those variation patterns and applies item response theory to item selection for the stroke syndrome factor evaluation scale, with programs written in SAS and Mplus.
     【Methods】The paper draws on several statistical methods, in particular the generalized estimating equation (GEE), latent class analysis (LCA), latent transition analysis (LTA) and item response theory (IRT). The work builds on basic research conducted by Dongzhimen Hospital Affiliated to Beijing University of Chinese Medicine: “Research on the Diagnostic Criteria and the Efficacy Evaluation System of the symptoms of Ischemic Stroke”, supported by the National Basic Research Program of China (Grant No. 2003CB517102), and “Research on the key techniques in clinical efficacy evaluation that displays the therapeutic advantages of traditional Chinese medicine”, supported by the National Science and Technology Major Project “Significant New Drug Creation” (Grant No. 2009ZX09502-028). In studying the syndrome variation patterns of ischemic stroke, we treat observation time both as a classification variable and as a continuous variable, and use GEE to explore the influential factors and syndrome variation patterns of 993 ischemic stroke patients. Then, combining LCA with GEE and LTA with GEE, we again treat observation time both ways to explore, within each class of patients, the factors that influence the different syndromes and the syndrome variation patterns. Item selection for the stroke syndrome factor evaluation scale uses IRT. Items are selected using the item information function, discrimination parameter, item characteristic curve and practical theory of traditional Chinese medicine (TCM), and relatively uninformative items are eliminated. A logistic curve regression equation is then constructed, the predicted probabilities obtained with the two main IRT parameter estimation methods (maximum likelihood estimation and Bayes estimation) are compared with the observed frequencies, and the better estimation method is identified from the residual sum of squares and the correlation coefficient.
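     The two ways of coding observation time described above can be illustrated with a minimal Python sketch. It is not the SAS program used in the study and does not fit a GEE; it only shows how a fitted logit equation would turn into a predicted probability under each coding. The function names and all coefficients are hypothetical placeholders, not estimates from the study.

```python
import math

def inv_logit(x):
    """Map a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def predict_categorical(intercept, time_effects, t):
    """Time as a classification variable: one coefficient per time point,
    each contrasted with the baseline, whose effect is fixed at 0."""
    return inv_logit(intercept + time_effects.get(t, 0.0))

def predict_continuous(intercept, slope, t):
    """Time as a continuous variable: a single slope describes the overall trend."""
    return inv_logit(intercept + slope * t)

# Hypothetical coefficients, for illustration only.
time_effects = {1: -0.2, 2: -0.5, 3: -0.9}
for t in range(4):
    p_cat = predict_categorical(0.4, time_effects, t)
    p_con = predict_continuous(0.4, -0.3, t)
    print(t, round(p_cat, 3), round(p_con, 3))
```

     The categorical coding lets each time point depart freely from the baseline, while the continuous coding constrains the log-odds to change linearly with time.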
     【Results】The paper attempts to overcome the shortcomings of existing analytical methods for repeated-measures qualitative data on ischemic stroke, proposes strategies for research on the syndrome variation patterns of ischemic stroke and for item selection for the stroke syndrome factor evaluation scale, and implements them in SAS and Mplus. The results and major innovations of this paper are summarized as follows.
     (1) Within-subject correlation is taken into account, and the repeated-measures qualitative data on ischemic stroke are analyzed using GEE. Observation time is treated both as a classification variable, which focuses on the effect of each time point relative to the starting point (a local view), and as a continuous variable, which describes how the probability of occurrence varies with time (a global view). GEE is used to explore the influential factors, describe the syndrome variation patterns of ischemic stroke, and predict, through a fitted equation, the probability that a patient develops a specific syndrome at each time point. GEE is thus suitable for analyzing repeated-measures qualitative data on ischemic stroke, in which observations are not independent.
     (2) An analytical strategy for the syndrome variation patterns of ischemic stroke is presented, motivated by the clinical significance of the first syndromes. Combining LCA with GEE, we classify the patients by LCA on the six syndromes at first observation. According to the fit indices, patients are best classified into two groups, with 379 and 614 patients respectively. We then treat observation time both as a classification variable and as a continuous variable, use GEE to explore the influential factors and syndrome variation patterns of ischemic stroke in each class, and predict, through a fitted equation, the probability that a patient in each class develops a given syndrome at each time point. The results show that syndrome incidence and variation patterns differ between the two groups.
     (3) An analytical strategy for the syndrome variation patterns of ischemic stroke is presented based on the six syndromes at each time point. Combining LTA with GEE, we classify the patients by LTA on the six syndromes at each time point. The fit indices indicate that a seven-class solution is best, with 498, 251, 87, 63, 52, 26 and 16 patients in the respective classes. Treating observation time both as a classification variable and as a continuous variable, we then analyze the two largest classes using GEE to explore the influential factors, the syndrome variation patterns and the transition probabilities, and predict, through a fitted equation, the probability that a patient in each class develops a particular syndrome at each time point. The results suggest that syndrome incidence and variation patterns vary across classes.
     (4) IRT is used to obtain the difficulty parameters, discrimination parameters, information functions, test scores for each syndrome and ability parameter estimates for patients, and to draw the item characteristic curves and the test characteristic curve for the stroke syndrome factor evaluation scale. Items are selected using the item information function, discrimination parameter, item characteristic curve and practical theory of TCM. As a result, eight items that provide little information (f6, f13, h24, h25, t10, q18, y11 and y12), accounting for 8.25% of the total, are eliminated. A logistic curve regression equation is constructed from the item parameters, so that the probability of each patient exhibiting each item can be obtained from that patient's ability parameter estimate. The predicted probabilities obtained with the two most frequently used IRT parameter estimation methods (maximum likelihood estimation, MLE, and Bayes estimation) are then compared with the observed frequencies. Based on the residual sum of squares and the correlation coefficient, the two methods give consistent results, with MLE marginally better than the Bayes method.
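     The item-screening and model-comparison steps in (4) can be sketched as follows. This is an illustration in Python, not the Mplus or SAS code used in the study. It assumes a two-parameter logistic model on the plain logistic metric (the abstract names difficulty and discrimination parameters but does not state the exact model), and the item names, parameter values and information threshold are hypothetical.

```python
import math

def icc_2pl(theta, a, b):
    """Two-parameter logistic item characteristic curve:
    P(theta) = 1 / (1 + exp(-a * (theta - b)))."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item: a^2 * P * (1 - P)."""
    p = icc_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

def residual_sum_of_squares(predicted, observed):
    """Sum of squared differences between predicted and observed values."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical item parameters: (name, discrimination a, difficulty b).
items = [("item_A", 1.8, 0.0), ("item_B", 0.3, 1.0)]
thetas = [x / 10.0 for x in range(-30, 31)]  # ability grid from -3 to 3
threshold = 0.1  # hypothetical cut-off for "little information"

# Flag items whose peak information over the grid falls below the cut-off.
low_info = [name for name, a, b in items
            if max(item_information(t, a, b) for t in thetas) < threshold]
print(low_info)
```

     To compare two estimation methods in the way the paragraph describes, one would compute `residual_sum_of_squares` and `pearson_r` between each method's predicted probabilities and the observed frequencies, and prefer the method with the smaller residual sum of squares and the larger correlation.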
     【Conclusions】The paper explores repeated-measures qualitative data on ischemic stroke and obtains useful results by addressing the problem of sample classification for qualitative data and repeatedly measured quantitative data with multiple response variables (the syndromes). Because within-subject correlation is taken into account, the statistical inference is highly reliable. The work also lays a foundation for studying the variation patterns of repeated-measures qualitative data on other diseases. The paper describes the variation patterns of each class after sample clustering, which helps guide clinical intervention at different stages for ischemic stroke patients in different classes and improve the curative effect. In addition, the paper applies item response theory, which at present is used mainly in psychological measurement, to item selection for the stroke syndrome factor evaluation scale. The approach proves feasible, which extends the application of IRT.
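     The sample classification and transition ideas behind LCA and LTA can be sketched in a few lines of Python. This is an illustration only: it assumes the model parameters are already estimated and shows how a patient would be assigned to a latent class and how class membership would be propagated over time. The class prevalences use the reported group sizes (379 and 614 of 993); the item-response probabilities and the two-class transition matrix are hypothetical, and the function names are not from the study.

```python
def posterior_class_probs(prevalences, item_probs, responses):
    """Posterior probability of each latent class given binary responses.

    prevalences: class prevalences (sum to 1)
    item_probs:  item_probs[c][j] = P(syndrome j present | class c)
    responses:   observed 0/1 indicators, one per syndrome
    Assumes local independence of the indicators within each class.
    """
    joint = []
    for prev, probs in zip(prevalences, item_probs):
        like = prev
        for p, y in zip(probs, responses):
            like *= p if y == 1 else (1.0 - p)
        joint.append(like)
    total = sum(joint)
    return [j / total for j in joint]

def next_distribution(current, transition):
    """Propagate class membership probabilities one time step.

    transition[i][j] = P(class j at time t+1 | class i at time t); rows sum to 1.
    """
    n = len(transition)
    return [sum(current[i] * transition[i][j] for i in range(n)) for j in range(n)]

# Prevalences from the reported group sizes; response probabilities are hypothetical.
prevalences = [379 / 993, 614 / 993]
item_probs = [
    [0.9, 0.8, 0.7, 0.6, 0.2, 0.1],
    [0.3, 0.2, 0.4, 0.5, 0.6, 0.4],
]
post = posterior_class_probs(prevalences, item_probs, [1, 1, 1, 0, 0, 0])
assigned = max(range(len(post)), key=post.__getitem__)  # modal class assignment

# Hypothetical two-class transition matrix.
transition = [[0.8, 0.2],
              [0.3, 0.7]]
dist = next_distribution([1.0, 0.0], transition)
print(post, assigned, dist)
```

     Modal assignment (taking the class with the largest posterior probability) is one common way to group patients before fitting a separate model within each class.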
