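Application of Modern Test Theory in Developing the General Module of the Quality of Life Instruments System for Patients with Chronic Diseases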
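Abstract
[Background]
     Developing quality of life (QOL) instruments for chronic diseases has become a focus of health-related QOL research in recent years, and it is basic, essential groundwork for assessing the QOL of patients with chronic diseases. Many such instruments exist, but their development shares several problems: (1) instruments are developed in isolation and lack a systematic framework; (2) instruments developed abroad do not fully reflect the Chinese cultural background, so instruments suited to China are urgently needed; (3) instrument evaluation and item selection rest mostly on classical test theory, and modern test theory has rarely been applied to QOL measurement.
     For these reasons, our research group has studied a system of QOL instruments for chronic diseases since 2003, with support from the National Natural Science Foundation of China (grant 30360092). Drawing on existing chronic disease instruments and combining a general module with disease-specific modules, the group systematically and independently developed the Quality of Life Instruments for Chronic Diseases (QLICD) for Chinese patients. The system comprises a general module (QLICD-GM) that can be used with patients with any chronic disease and, built on it, specific instruments for eight chronic diseases.
     While instrument development has drawn much attention, methods for selecting and evaluating instruments and their items are equally fundamental. Previous evaluation and item selection for chronic disease QOL instruments relied mostly on classical test theory (CTT), which is simple and easy to understand; typical indices are reliability, validity, responsiveness and Cronbach's α coefficient. CTT is a complete body of measurement theory and statistical methods and has long dominated the field. However, its sample dependence, the difficulty of satisfying the parallel-test assumption and the difficulty of generalizing test results are clear shortcomings that limit its further development and use. In response to these shortcomings, researchers have proposed guiding instrument development with modern test theory.
     Item response theory (IRT) and generalizability theory (GT) are two important modern test theories. IRT works at the micro level: it relates a respondent's trait level to the respondent's behavior on each item through a parameterized model, which allows measurement error to be estimated precisely; estimates of the latent trait do not depend on the particular items administered; parameter estimates are independent of the respondent sample; and the test information function replaces the CTT notion of reliability. IRT developed fully after the 1970s and solved many problems that CTT could not. Its application to QOL research began in the late 20th century: Haley and McHorney et al. used IRT to evaluate the unidimensionality of the SF-36 physical functioning scale, and Cella and Chang discussed its use in health status assessment, bringing IRT into QOL research. Many topics at the international QOL conference held in Hong Kong in 2004 concerned IRT applications in QOL. Sun Yat-sen University is currently studying the application of IRT to a QOL instrument for persons with disabilities. Although IRT is advancing rapidly abroad, and some experts have applied it to evaluating QOL instruments, it has seldom been used in QOL research in China.
     GT applies experimental design and the basic principles of analysis of variance (ANOVA), combining CTT with ANOVA. It introduces new indices such as relative error, absolute error, the generalizability coefficient and the dependability index in place of the traditional CTT indices of reliability and validity. It is better suited to studying measurement error, emphasizes the direct link between measurement error and decision needs, and, working at the macro level, can estimate multiple sources of measurement error from different facets and for different measurement situations, thereby improving test quality. GT research in China is still at an early stage, with some applications in interviews and performance assessment but few reports in chronic disease QOL research. No study has yet been reported that combines IRT and GT to evaluate chronic disease QOL instruments. Given the advantages of both theories and their potential in QOL instrument development, this study combines IRT and GT to evaluate QLICD-GM (V1.0) at the micro and macro levels and compares the results with CTT.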
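     [Aims]
     1. To evaluate the general module of the Quality of Life Instruments for Chronic Diseases (QLICD-GM V1.0) with two modern test theories, IRT and GT, at the micro and macro levels, and to make proposals for revising the module's items and improving its structure.
     2. To compare IRT, GT and CTT, identify the strengths and weaknesses of each in chronic disease QOL instrument research, and provide methodological guidance for developing general and disease-specific instruments for other diseases.
     [Contents]
     1. Using IRT, characterize each item of QLICD-GM (V1.0): fit its difficulty parameters, discrimination parameter and information function and, together with the item characteristic curves, select items with high information and remove items whose information is too low.
     2. Using GT, evaluate the module in two stages, a G study and a D study. In the G study, analyze at the macro level (the different domains of the instrument) how variation from each error source contributes to total variation. In the D study, compute the generalizability coefficient, the dependability index and the error variances for different numbers of items, evaluate the module's reliability, give guidance on the number of items in each domain, and provide a theoretical basis for different decisions.
     3. Summarize and compare the strengths and weaknesses of IRT, GT and CTT in QOL research and note points to observe when applying them.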
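     [Methods]
     1. Survey methods. Data were collected mainly at the Affiliated Hospital of Kunming Medical College and Yunnan People's Hospital from patients with eight chronic diseases, including hypertension, coronary heart disease and chronic gastritis. Patients were required to have basic reading and writing ability. Investigators, presenting themselves as doctors, briefly explained the general module and then gave QLICD (V1.0) to the patients to complete; the questionnaires were collected on completion and checked for missing items. Each patient was surveyed twice, once on admission and once before discharge.
     2. Item response theory. Each item of QLICD-GM (V1.0) was analyzed with Samejima's graded response model. The unidimensionality assumption was tested first; then, at the micro level, the information and information function of each item were analyzed, item difficulty and discrimination were computed, and the category probability curves and item characteristic curves were plotted.
     3. Generalizability theory. The overall validity and dependability of the QLICD (V1.0) general module were evaluated at the macro level and analyzed by facet and by domain. Given the data and the design, G and D studies with a random two-facet crossed (nested) design were chosen, with patients as the object of measurement and the general module items as one measurement facet, and the evaluation followed the basic principles of experimental design and ANOVA. The sources of variation in the G study were divided into seven parts: the surveyed patients with different diseases (p), the items in the three domains (i), the measurement occasions (t), and the interactions p×i, p×t, i×t and p×i×t. A two-factor factorial ANOVA procedure was used. In the D study, the relative error, absolute error, generalizability coefficient and dependability index were computed from the estimated variance components separately for the physical, psychological and social function domains.
     4. Points to observe and the strengths and weaknesses of IRT and GT in chronic disease QOL instrument research were set out and compared with CTT, to provide methodological guidance for developing new general and disease-specific instruments.
     5. Statistical methods. Data were entered and managed in Excel and FoxPro and analyzed with SPSS 15.0 and MULTILOG 7.03.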
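     [Results]
     Part I Item Response Theory
     1. Unidimensionality. IRT analysis was carried out separately for the physical, psychological and social function domains. Before treatment, the ratio of the first to the second eigenvalue was 2.6 for physical function (unidimensionality basically satisfied), 5.7 for psychological function (fully satisfied), and 2.3 and 3.0 for the social impact and social support facets of social function (satisfied). After treatment, the ratios were 2.9 for physical function (basically satisfied), 6.0 for psychological function (fully satisfied), and 2.9 and 3.26 for the social impact and social support facets (satisfied). The results from both surveys indicate that the instrument can be analyzed with IRT.
     2. Difficulty and discrimination. The 30 items of the general module were analyzed in three domains (physical, psychological and social function). At time 1, item difficulties ranged from -2.88 to 2.27. At time 2, the lowest difficulty of items SO4 and SO5 was below -3.0 and the highest difficulty of item PH5 was above 3.0; the difficulties of all other items lay between -2.93 and 2.93. The difficulty of the QLICD general module is therefore moderate. Discrimination of the 30 items ranged from 0.63 to 1.88, all above 0.3, which indicates good discrimination. The thresholds of every item increased monotonically from level 1 to level 4, with no reversed thresholds.
     3. Item information. Average item information ranged from 0.37 to 0.99; the domain averages were 0.38 for physical function, 0.80 for psychological function and 0.48 for social function. Physical function had the lowest average and psychological function the highest. Among the 11 social function items, SO1, SO3 and SO6 had low information and could not be selected directly. Based on each item's information and characteristics, 24 good items were selected from the 30: 17 items with information of 0.47 or above were selected directly, and PH2, PH6, PH7, PH8, SO1, SO9 and SO11 were retained to keep each domain of the general module complete.
     4. Item characteristic curves. For the physical function items PH1-PH8, the category probabilities were lower than those of the psychological items, the peaks were generally low, and for a few items the peaks nearly coincided. This indicates that the response options do not discriminate strongly, so the options of the physical function items in the first version of the general module need further revision. For the psychological items PS1-PS11, the peaks were clearly separated and covered a relatively wide range, indicating high selection probabilities, and information was 0.47 or above for all of them; these 11 items can be included in the instrument directly. Among the social function items SO1-SO11, the discrimination of SO1, SO3, SO6 and SO9 was low, and the peaks of the remaining curves were relatively high.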
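     Part II Generalizability Theory
     1. Whole-instrument universe of generalization. In the G study, the person variance component σ²(p) was the largest, 4.82, accounting for 68% of total variance. Patients therefore contributed most of the variation, which matches the expected structure, and the fit is satisfactory. The item component accounted for a small share, indicating high consistency across items. The time component σ²(t) was only 0.01 (0.14%), indicating that the occasion of the two surveys had little effect on the overall result and that patients responded consistently across the two surveys.
     In the D study, for different numbers of items (20, 25, 30, 35, 40), the person-by-item, person-by-time and person-by-item-by-time interaction components, the relative error variance σ²(δ) and the absolute error variance σ²(Δ) were all below 1. The error variance in using the observed sample mean to estimate the mean universe score of the population was also small. The generalizability coefficient Eρ² and the dependability index Φ were both above 0.9, indicating high reliability and validity for QLICD-GM (V1.0). As the number of items in the universe of generalization increased, every variance component except the person component decreased, and both coefficients increased. Even with 20 items the generalizability coefficient was 0.9905 (> 0.9), whereas increasing the number of items from 35 to 40 raised it by only 0.0001. In practice, about 35 items are therefore suggested for the general module to achieve good reliability.
     2. Physical function domain. In the G study, the person variance component was the largest, 14.61, accounting for 81% of total variance. For the eight physical function items, the relative error ranged from 0.2203 to 0.2698 and the absolute error from 0.2313 to 0.2894, all below 0.3, and the generalizability coefficient and dependability index were both above 0.98. The fit is satisfactory and the reliability of the physical function items is good, consistent with the test-retest reliability, split-half reliability and Cronbach's α obtained under CTT.
     3. Psychological function domain. In the G study, the person-by-item-by-time interaction was the largest component, at 48.96% of total variance, while the person component accounted for only 40%, unlike the physical function domain. In the D study, the generalizability coefficient and dependability index rose as the number of items increased: the generalizability coefficient reached 0.9886 at 11 items and 0.9897 at 13 items, with little further gain beyond 13. This suggests that 11 to 13 items suit the psychological domain and that a few items could be added to raise reliability. The dependability index was above 0.95 throughout, indicating good reliability for the psychological items.
     4. Social function domain. The person-by-item interaction was the largest component, at 37.14%, followed by the person-by-item-by-time interaction at 33.9% and the person component at 27.10%. The item fit is acceptable, but the person-by-item interaction is too large; some social function items of the general module need further revision so that patients with different diseases respond to the items more consistently.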
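     [Conclusions]
     1. Both IRT and GT fit well and can be applied to the development of the chronic disease QOL instrument system. They allow a comprehensive evaluation of the general module and have considerable potential and good prospects for application.
     2. CTT analysis shows that the overall reliability, validity and responsiveness of QLICD-GM (V1.0) are good and that its difficulty and discrimination are moderate.
     3. IRT and GT results show that, among the three domains of the general module, the physical function items have relatively poor information and low probability curves, so they cannot enter the next version directly and need appropriate revision, although the generalizability coefficient and dependability index of this domain are high. In the psychological function domain, reliability, validity, information, the generalizability coefficient and the dependability index are all high and the relative and absolute errors are small, so the 11 items can enter the next version directly. In the social function domain, item fit is acceptable, but some items have low information and need adjustment.
     4. Compared with CTT, IRT and GT each have strengths and weaknesses; they can be combined with CTT to develop new versions of the general module and the disease-specific instruments.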
[Background]
     Developing quality of life (QOL) instruments for chronic diseases is an active research area and is basic, essential groundwork for assessing patients. Many health-related QOL instruments exist, but the available disease-specific instruments share several problems: (1) They have been developed by different research groups, which has produced a multitude of assessment tools for the same disease. Many investigators are at a loss as to which to use, which hampers research on chronic diseases and related areas. (2) Developing an instrument for each disease independently is inefficient and limits comparability. Such an approach focuses on individual symptoms rather than on a core structure common to different diseases, and given the number of diseases it is not practical to develop instruments for each one separately. (3) Instruments developed abroad do not sufficiently reflect the Chinese cultural background. (4) Modern test theory has rarely been applied in QOL measurement.
     To address these problems, our team has worked on a system of QOL instruments for chronic diseases since 2003. With support from the National Natural Science Foundation of China, and by combining a general module with disease-specific modules, we developed the Chinese QOL instrument system QLICD (Quality of Life Instruments for Chronic Diseases). The system includes a general module (QLICD-GM), which can be used with patients with any chronic disease, and specific modules, each used only for the relevant disease. The work has been widely cited and commented on.
     While instrument development has drawn much attention, research on methods for selecting and evaluating instruments and items is the groundwork. Most previous research has been based on classical test theory (CTT), which is simple and easy to understand; typical indices are reliability, validity, responsiveness and Cronbach's α coefficient. CTT is a complete body of test theory and statistical methods and holds the dominant position in measurement. However, it has clear shortcomings, such as sample dependence, the difficulty of satisfying the parallel-test assumption and the difficulty of generalizing test results, which limit its further development and application. In response to these shortcomings, researchers have proposed guiding instrument development with modern test theory.
     Item response theory (IRT) and generalizability theory (GT) are two important modern test theories. IRT works at the micro level: it relates the trait level of respondents to their behavior on items through a parameterized model, which allows measurement error to be estimated precisely; estimates of the latent trait are independent of the specific items; parameter estimates are independent of the respondent sample; and the test information function replaces the CTT notion of reliability. Since the 1970s, IRT has developed fully and solved many problems that CTT could not. Its application in QOL research began in the late 20th century, when Haley and McHorney et al. evaluated the unidimensionality of the SF-36 physical functioning scale with IRT, and Cella and Chang discussed the application of IRT in health status assessment, which brought IRT into QOL research. Many topics at the international QOL conference held in Hong Kong in 2004 concerned the application of IRT to QOL. Sun Yat-sen University is currently studying the application of IRT to a QOL instrument for persons with disabilities. Although IRT is developing rapidly abroad and some experts use it to evaluate QOL instruments, it has seldom been applied to QOL research in China.
     Using experimental design and the basic principles of analysis of variance (ANOVA), GT combines CTT with ANOVA and introduces relative error, absolute error, the generalizability coefficient, the dependability index and other new indices in place of the traditional CTT indices of reliability and validity, which gives it advantages in studying measurement error. It focuses on the direct relationship between measurement error and decision needs, and it can estimate multiple sources of measurement error at the macro level, from different facets and for different measurement situations, so as to improve test quality. GT research in China is still in its infancy. It has been applied in interviews and performance assessment, with few reports in chronic disease QOL research, and no reported study combines IRT and GT to evaluate QOL instruments for chronic diseases. Considering the advantages of the two theories and their potential in QOL instrument development, this study combines IRT and GT to evaluate QLICD-GM (V1.0) at the micro and macro levels and compares the results with CTT.
     [Aims]
     1. To evaluate the general module of the Quality of Life Instruments for Chronic Diseases (QLICD-GM V1.0) with two modern test theories, IRT and GT, at the micro and macro levels, so as to make proposals for revising the module's items and improving its structure.
     2. To compare IRT, GT and CTT and identify their respective advantages and disadvantages in research on QOL instruments for chronic diseases, so as to provide methodological guidance for developing general and disease-specific instruments for other diseases.
     [Contents]
     1. Analyze the QLICD-GM (V1.0) items one by one with IRT, fitting the difficulty parameters, discrimination parameter and information function of each; together with the item characteristic curves, select items with high information and remove items whose information is too low.
     2. Evaluate the module with GT in two stages, a G study and a D study. In the G study, analyze at the macro level (the different domains of the instrument) how variation from each error source contributes to total variation. In the D study, compute the generalizability coefficient, the dependability index and the error variances for different numbers of items, evaluate the reliability of the module, give guidance on the number of items in each domain, and provide a theoretical basis for different decisions.
     3. Summarize and compare the advantages and disadvantages of IRT, GT and CTT in QOL research and note points to observe when applying them.
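     As a rough illustration of how difficulty (threshold) and discrimination parameters relate to category probability curves and to the item information function, the sketch below evaluates a graded response model of Samejima's type for a single item. The function names, the five-category item and the parameter values are hypothetical and are not taken from QLICD-GM.

```python
import math

def cumulative_probs(theta, a, thresholds):
    """Probability of responding in category k or higher, for each k."""
    inner = [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    # Pad with 1 (lowest category or higher) and 0 (above the highest category).
    return [1.0] + inner + [0.0]

def category_probs(theta, a, thresholds):
    """Probability of each response category: difference of adjacent cumulative curves."""
    cum = cumulative_probs(theta, a, thresholds)
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]

def item_information(theta, a, thresholds):
    """Item information: sum over categories of (dP_k/dtheta)^2 / P_k."""
    cum = cumulative_probs(theta, a, thresholds)
    slopes = [a * p * (1.0 - p) for p in cum]  # derivative of each cumulative curve
    info = 0.0
    for k in range(len(cum) - 1):
        p_k = cum[k] - cum[k + 1]
        dp_k = slopes[k] - slopes[k + 1]
        info += dp_k ** 2 / p_k
    return info

# Hypothetical item: discrimination 1.2, four ordered thresholds (five categories).
a, thresholds = 1.2, [-1.5, -0.5, 0.5, 1.5]
for theta in (-2.0, 0.0, 2.0):
    probs = [round(p, 3) for p in category_probs(theta, a, thresholds)]
    print(theta, probs, round(item_information(theta, a, thresholds), 3))
```

     Plotting `category_probs` over a grid of trait values gives the kind of category probability curves whose peak heights and separation are compared between domains; items with low discrimination produce flat, overlapping curves and low information.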
     [Methods]
     1. Survey Methods. Data were collected at the Affiliated Hospital of Kunming Medical College and Yunnan People's Hospital from patients with eight chronic diseases, including hypertension, coronary heart disease and chronic gastritis. Patients were required to have basic reading and writing ability. Investigators, presenting themselves as doctors, briefly explained the general module and then distributed QLICD (V1.0) to the patients for completion. Completed questionnaires were collected and checked for omissions. Each patient was surveyed twice, once on admission and once before discharge.
     2. Item Response Theory. Each item of QLICD-GM (V1.0) was analyzed with Samejima's graded response model. The unidimensionality assumption was tested first; then, at the micro level, the information and information function of each item were analyzed, the difficulty and discrimination of each item were calculated, and the category probability curves and item characteristic curves were plotted.
     3. Generalizability Theory. The overall validity and dependability of the QLICD (V1.0) general module were evaluated at the macro level and analyzed by facet and by domain. According to the characteristics of the data and the design, G and D studies with a random two-facet crossed (nested) design were chosen, taking the patients as the object of measurement and the general module items as one measurement facet; the evaluation followed the basic principles of experimental design and ANOVA. The sources of variation in the G study were divided into seven parts: p, the surveyed patients with different diseases; i, the items in the three domains; t, the measurement occasions; and p×i, p×t, i×t and p×i×t, the interactions among patients, items and time. A two-factor factorial ANOVA procedure was used. In the D study, the relative error, absolute error, generalizability coefficient and dependability index were calculated from the estimated variance components separately for the physical, psychological and social function domains.
     4. Points to observe and the advantages and disadvantages of IRT and GT in research on QOL instruments for chronic diseases were set out and compared with CTT, to provide methodological guidance for developing new general and disease-specific instruments.
     5. Statistical Methods. Data were entered and managed in Excel and FoxPro and analyzed with SPSS 15.0 and MULTILOG 7.03.
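     The seven sources of variation in the G study can be estimated from ANOVA mean squares. The sketch below is a minimal illustration that assumes a fully crossed random p×i×t design with one observation per cell; it does not cover the nested variant mentioned above, and the function names and mean squares are hypothetical, not values from this study.

```python
def g_study_components(ms, n_p, n_i, n_t):
    """Estimate variance components for a fully crossed random p x i x t design.

    `ms` maps each effect ("p", "i", "t", "pi", "pt", "it", "pit") to its ANOVA
    mean square; the three-way term is confounded with residual error.
    """
    vc = {
        "pit": ms["pit"],
        "pi": (ms["pi"] - ms["pit"]) / n_t,
        "pt": (ms["pt"] - ms["pit"]) / n_i,
        "it": (ms["it"] - ms["pit"]) / n_p,
        "p": (ms["p"] - ms["pi"] - ms["pt"] + ms["pit"]) / (n_i * n_t),
        "i": (ms["i"] - ms["pi"] - ms["it"] + ms["pit"]) / (n_p * n_t),
        "t": (ms["t"] - ms["pt"] - ms["it"] + ms["pit"]) / (n_p * n_i),
    }
    # Negative estimates can arise from sampling error; a common convention sets them to 0.
    return {effect: max(value, 0.0) for effect, value in vc.items()}

def percent_of_total(vc):
    """Share of total variance attributed to each effect, in percent."""
    total = sum(vc.values())
    return {effect: 100.0 * value / total for effect, value in vc.items()}

# Hypothetical mean squares for 100 patients, 10 items and 2 occasions.
ms = {"p": 89.0, "i": 49.0, "t": 22.0, "pi": 4.0, "pt": 7.0, "it": 7.0, "pit": 2.0}
vc = g_study_components(ms, n_p=100, n_i=10, n_t=2)
print(vc)
print(percent_of_total(vc))
```

     The percentages returned by `percent_of_total` correspond to the "share of total variance" figures reported for each effect in a G study.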
     [Results]
     Part I Item Response Theory
     1. Unidimensionality. IRT analysis was conducted separately in three domains: physical function, psychological function and social function. Before treatment, the ratio of the first to the second eigenvalue was 2.6 for physical function, basically meeting the unidimensionality requirement; 5.7 for psychological function, fully meeting it; and 2.3 and 3.0 for the social impact and social support facets of social function, meeting it. After treatment, the ratio was 2.9 for physical function, basically meeting the requirement; 6.0 for psychological function, fully meeting it; and 2.9 and 3.26 for the social impact and social support facets, meeting it. The results from the two surveys indicate that the instrument can be analyzed with IRT.
     2. Difficulty and Discrimination. The 30 items of the general module were analyzed in three domains (physical, psychological and social function). At time 1, item difficulties ranged from -2.88 to 2.27. At time 2, the minimum difficulty of items SO4 and SO5 was below -3.0 and the maximum difficulty of item PH5 was above 3.0; apart from these three items, the difficulties of all items lay between -2.93 and 2.93. The difficulty of the QLICD general module is therefore moderate. The discrimination of the 30 items ranged from 0.63 to 1.88, all above 0.3, which indicates good discrimination. The thresholds of each item increased monotonically from level 1 to level 4, with no reversed thresholds.
     3. Item Information. Average item information ranged from 0.37 to 0.99; the domain averages were 0.38 for physical function, 0.80 for psychological function and 0.48 for social function. Physical function thus had the lowest average and psychological function the highest. Among the 11 social function items, the information of SO1, SO3 and SO6 was too low for direct selection. According to the information and characteristics of each item, 24 good items were chosen from the 30, of which 17 had information of 0.47 or above and were chosen directly. To keep each domain of the general module complete, PH2, PH6, PH7, PH8, SO1, SO9 and SO11 were retained.
     4. Item Characteristic Curves. For the physical function items PH1-PH8, the category probabilities were lower than those of the psychological items, the peaks were generally low, and the peaks of a few items nearly coincided. This indicates that the response options do not discriminate strongly, so the options of the physical function items in the first version of the general module need further revision. For the psychological items PS1-PS11, the peaks were clearly separated and covered a relatively wide range, indicating high selection probabilities, and information was 0.47 or above for all of them, so these 11 items can be included in the instrument directly. Among the social function items SO1-SO11, the discrimination of SO1, SO3, SO6 and SO9 was somewhat low, and the peaks of the other curves were relatively high.
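     The unidimensionality check in item 1 compares the first and second eigenvalues. The abstract does not say which matrix or software produced these eigenvalues, so the sketch below assumes they come from the inter-item correlation matrix of one domain. It uses plain power iteration with deflation to stay dependency-free; in practice a linear algebra library would normally be used. The function names and the example matrix are hypothetical.

```python
import math

def _dominant_eigenpair(matrix, iterations=500):
    """Power iteration: largest eigenvalue and unit eigenvector of a symmetric PSD matrix."""
    n = len(matrix)
    vec = [float(i + 1) for i in range(n)]  # non-uniform start vector
    for _ in range(iterations):
        nxt = [sum(matrix[r][c] * vec[c] for c in range(n)) for r in range(n)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    mv = [sum(matrix[r][c] * vec[c] for c in range(n)) for r in range(n)]
    eigenvalue = sum(v * m for v, m in zip(vec, mv))  # Rayleigh quotient
    return eigenvalue, vec

def first_to_second_eigenvalue_ratio(corr):
    """Ratio of the two largest eigenvalues of a correlation matrix."""
    n = len(corr)
    lam1, v1 = _dominant_eigenpair(corr)
    # Deflation: remove the first component so the next-largest eigenvalue dominates.
    deflated = [[corr[r][c] - lam1 * v1[r] * v1[c] for c in range(n)] for r in range(n)]
    lam2, _ = _dominant_eigenpair(deflated)
    return lam1 / lam2

# Hypothetical correlation matrix for four items of one domain.
corr = [
    [1.0, 0.6, 0.5, 0.4],
    [0.6, 1.0, 0.5, 0.3],
    [0.5, 0.5, 1.0, 0.4],
    [0.4, 0.3, 0.4, 1.0],
]
print(round(first_to_second_eigenvalue_ratio(corr), 2))
```

     Power iteration converges slowly when the two leading eigenvalues are close, which is the case where unidimensionality is doubtful anyway; a larger ratio means a more dominant first factor.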
     Part II Generalizability Theory
     1. Whole-Instrument Universe of Generalization. In the G study, the person variance component σ²(p) was the largest, 4.82, accounting for 68% of total variance. Patients therefore contributed most of the variation, which matches the expected structure, and the fit is satisfactory. The share of the item component was small, which indicates high consistency across items. The time component σ²(t) was only 0.01, accounting for 0.14%, which indicates that the occasion of the two surveys had little influence on the overall results and that patients responded consistently across the two surveys.
     In the D study, for different numbers of items (20, 25, 30, 35, 40), the person-by-item, person-by-time and person-by-item-by-time interaction components, the relative error variance σ²(δ) and the absolute error variance σ²(Δ) were all below 1. The error variance in using the observed sample mean to estimate the mean universe score of the population was also small. Both the generalizability coefficient Eρ² and the dependability index Φ were above 0.9, which indicates that QLICD-GM (V1.0) has good reliability and validity. As the number of items in the universe of generalization increased, all variance components except the person component decreased, and both coefficients increased. Even with 20 items the generalizability coefficient was 0.9905 (> 0.9), whereas increasing the number of items from 35 to 40 raised it by only 0.0001. In practice, about 35 items are therefore suggested for the general module to reach good reliability.
     2. Physical Function Domain. In the G study, the person variance component was the largest, 14.61, accounting for 81% of total variance. For the eight physical function items, the relative error ranged from 0.2203 to 0.2698 and the absolute error from 0.2313 to 0.2894, all below 0.3. Both the generalizability coefficient and the dependability index were above 0.98, which indicates that the fit is satisfactory and the reliability of the physical function items is good. This result is consistent with the test-retest reliability, split-half reliability and Cronbach's α obtained under CTT.
     3. Psychological Function Domain. In the G study, the person-by-item-by-time interaction was the largest component, accounting for 48.96% of total variance, while the person component accounted for only 40%, unlike the physical function domain. In the D study, the generalizability coefficient and dependability index increased gradually with the number of items. The generalizability coefficient reached 0.9886 at 11 items and 0.9897 at 13 items, with little further gain beyond 13. This suggests that 11 to 13 items suit the psychological domain and that a few items could be added to raise reliability. The dependability index was above 0.95 throughout, which indicates good reliability for the psychological items.
     4. Social Function Domain. The person-by-item interaction was the largest component, accounting for 37.14%, followed by the person-by-item-by-time interaction at 33.9% and the person component at 27.10%. The item fit is acceptable, but the person-by-item interaction is too large; some social function items of the general module need further revision so that patients with different diseases respond to the items more consistently.
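     The D-study quantities reported above (relative error, absolute error, generalizability coefficient and dependability index) follow from the G-study variance components once a number of items and occasions is chosen. The sketch below assumes a fully crossed random p×i×t design; the function name and the variance components are hypothetical and are not the estimates from this study.

```python
def d_study(vc, n_i, n_t):
    """D-study indices for a crossed p x i x t design with n_i items and n_t occasions."""
    # Relative error: only interactions involving persons change the rank order of persons.
    rel_error = vc["pi"] / n_i + vc["pt"] / n_t + vc["pit"] / (n_i * n_t)
    # Absolute error additionally includes the item, time and item-by-time effects.
    abs_error = rel_error + vc["i"] / n_i + vc["t"] / n_t + vc["it"] / (n_i * n_t)
    return {
        "relative_error": rel_error,
        "absolute_error": abs_error,
        "g_coefficient": vc["p"] / (vc["p"] + rel_error),
        "dependability": vc["p"] / (vc["p"] + abs_error),
    }

# Hypothetical variance components (not taken from the study).
vc = {"p": 4.0, "i": 0.2, "t": 0.0, "pi": 1.0, "pt": 0.5, "it": 0.0, "pit": 2.0}
for n_items in (20, 25, 30, 35, 40):
    result = d_study(vc, n_i=n_items, n_t=2)
    print(n_items, round(result["g_coefficient"], 4), round(result["dependability"], 4))
```

     Because every error term is divided by the number of items or occasions, adding items yields diminishing gains in both coefficients, which is the pattern that motivates recommending a number of items beyond which the coefficient barely changes.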
     [Conclusions]
     1. Both IRT and GT fit well and can be applied to the development of QOL instruments for chronic diseases. They allow a comprehensive evaluation of the general module and have considerable potential and good prospects for application.
     2. CTT analysis indicates that the overall reliability, validity and responsiveness of QLICD-GM (V1.0) are good and that its difficulty and discrimination are moderate.
     3. IRT and GT results indicate that, among the three domains of QLICD-GM, the physical function items have relatively poor information and probability curves lower than those of the other two domains, so they cannot enter the next version directly and some should be revised, although both Eρ² and Φ of this domain are high. In the psychological function domain, reliability, validity, information, Eρ² and Φ are all high and both the relative and absolute errors are small, so the 11 items can enter the next version directly. In the social function domain, item fit is acceptable, but some items have low information and need adjustment.
     4. Compared with CTT, IRT and GT each have advantages and disadvantages. They can be combined with CTT to develop new versions of the general module and the disease-specific instruments.
引文
[1]陈娜萦,蒙晓宇,王海涛.我国慢性病防治现状[J].广西预防医学,2000,6(3):184-186.
    [2]李鹏,杨文秀.慢性病现状流行趋势国际比较及应对策略[J].天津医药,2009,37(4):254-257.
    [3]孙梅珍.糖尿病的健康指导[J].临床合理用药杂志,2011,1(54):225-229
    [4]万崇华.《生命质量测定与评价方法》[M].昆明:云南大学出版社,1999
    [5]Johnson JR, Temple R. Food and drug administration requirements for approval of new anticancer drugs[J]. Cancer Treat. Rep,1985,69:1155-1157.
    [6]万崇华,李晓梅,赵旭东等.慢性病患者生命质量测定量表体系研究[J].中国行为医学科学,2005,14(12):1130-1131.
    [7]张广恩,丁元林.糖尿病特异性生存质量量表的研究进展[J].中国慢性病预防与控制,2005,13(6):313-315.
    [8]张妮娅,唐伟,刘超.糖尿病患者健康相关生命质量研究进展[J].医学综述,2009,15(9):3002-3004.
    [9]胡明,孙振球.生活质量测评在糖尿病患者疗效评价中的应用[J].中南大学学报(医学版),2004,29(1):99-101.
    [10]Bradley C, Speight J. Patient perceptions of diabetes and diabetes therapy: assessing quality of life[J]. Diabetes Metab Res Rev,2002,18(3):864-869.
    [11]Tranos PG, Topuzis F, Stanggos NT, et al. Effect of photocoagula-tion treatment for diabeteic macular oedema on patient s vision—related quality oflife[J]. Curr Eye Res,2004,29(1):41-49.
    [12]Kolotkin RL, Crosby RD, Williams GR. Assessing weight related quality of life in obese persons with type 2 diabetes I[J]. Diabetes Res Clin Pract,2003, 61(2):125-132.
    [13]Kamoi K, Miyakoshi M, Maruyama R, et al. A quality of life assessment of intensive insulin therapy using insulin lispro switched from short-acting insulin and measured by an ITR-QOL questionnaire:aprospective comparison of multiple daily insulin inieetions and continuous subcutaneous insulin infusion [J]. Diabetes Res ClinPract,2004,64(1):19-25.
    [14]姜林娣.关节炎的生存质量评价[J].中国临床康复,2002,6(1):13-15.
    [15]孔丹莉,张广恩,潘海燕,胡利人,丁元林.糖尿病特异性生存质量量表的引进及文化调适[J].中国行为医学科学,2007,16(8):758-759.
    [16]Montazeri A, Gillis CR, McEwen J. Quality of life in patients with lung cancer: a review of literature from 1970 to 1995[J]. Chest 1998,113(2):467-481.
    [17]丁元林,孔丹莉,倪宗瓒等.糖尿病特异性生存质量量表的文化调适与修订[J].中国行为医学科学,2004,13(1):102-103.
    [18]孔丹莉,张广恩,潘海燕,胡利人,丁元林.糖尿病特异性生存质量量表的信度与效度初探[J].中国慢性病预防与控制,2007,15(3):202-204.
    [19]Sprangers MA, Cull A, Groenvold M, Bjordal K, Blazeby J, Aaronson NK. The European Organization for Research and Treatment of Cancer approach to developing questionnaire modules:an update and overview[J]. Qual Life Res, 1998,7(4):291-300.
    [20]Cella DF, Tulsky DS, Gray G, Sarafian B, Linn E, Bonomi A, et al. The functional assessment of cancer therapy scale:Development and validation of the general measure[J]. J Clin Oncol,1993,11(3):570-9.
    [21]孔丹莉,张广恩,潘海燕.结构方程模型及其在慢性病患者生存质量研究中的应用[J].中国卫生统计,2007,24(4):280-282.
    [22]丁元林,孔丹莉等.探讨2型糖尿病不同发展阶段影响因素的多状态Markov模型[J].中华预防医学杂志,2002,36(6):369-369.
    [23]潘海燕,丁元林,胡利人,孔丹莉.隐Markov模型及其在慢性病流行病学 研究中的应用[J].中国卫生统计,2009,26(1):38-40.
    [24]万崇华,李晓梅,赵旭东,等.慢性病患者生命质量测定量表体系研究[J].中国行为医学科学,2005,14(12):1130-1131.
    [25]万崇华,高丽,李晓梅,等.慢性病患者生命质量测定量表体系共性模块研制方法:条目筛选及共性模块的形成[J].中国心理卫生,2005,19(11):723-726.
    [26]杨静.三种教育与心理测量理论的比较研究[J].中国考试,2006,6:33-35.
    [27]黄丹媚.项目反应理论与经典测验理论之异同比较[J].考试周刊,2007,33,146-147.
    [28]黎光明,张敏强.基于概化理论的方差分量变异量估计[J].心理学报,2009,41(9):889-901.
    [29]詹沐清,卢荣华.论项目反应模型[J].科技信息,2009,8(17):28/78.
    [30]张蕙芬,迟家敏,王瑞萍.实用糖尿病学.第二版.北京:人民卫生出版社,2001:159-171.
    [31]WHO. The development of the WHO quality of life assessment instrument[J], Geneva, WHO,1993.
    [32]万崇华,高丽,李晓梅等.慢性病患者生命质量测定量表体系共性模块研制方法(一)--条目筛选及共性模块的形成[J].中国心理卫生杂志,2005,19,(11):723-726.
    [33]Oleson M. Subjectively perceived quality of life[J]. Image,1991,2:187-190.
    [34]杨岫岩.如何识别和控制临床研究中的混杂与偏倚[J].中华风湿病学杂志.2000,4,(2):114-115.
    [35]万崇华,杨铮,杨玉萍等.慢性病患者生命质量测定量表体系共性模块的考评[J].中国行为医学科学,2007,16,(6):559-561.
    [36]胡维芳.论项目反应理论[J].高等理科教育,2005,3期:64-66.
    [37]程艳,曾繁建,许维胜.项目反应理论模型及其问题分析[J].井冈山学院学 报(自然科学版),2007,28(6):39-41.
    [38]Bock RD. Estimating item parameters and latent ability when responses are scord in two or more nominal caregories[J]. Psychometrika,1972,37:29-31.
    [39]Lord FM, A theory of test scores[J]. Psycometric Monographs,1952.
    [40]张厚粲.《心理与教育统计学》[M].北京:北京师范大学出版社,1988.
    [41]Orlando M,Reeve BB. Applying irem response theory(IRT) modeling to questionnaire development,evaluation, and refinement[J], Qual Life Res 2007, 16:5-18.
    [42]唐宁玉,戴忠恒,项目反应理论在编制现代性量表中的应用[J].心理科学,1995,18:144-148.
    [43]廖丽.项目反应理论下等值方法-在Samejima模型下用比准则求解等值系数[J].科技广场,2009,3,(11):29-31.
    [44]Laurie Laughlin Davis,Strategies for Controlling Item Exposure in Computerized Adaptive Testing With the Generalized Partial Credit Mode[J]. Applied Psychological Measurement,2004,28:165-185.
    [45]Chang,H.H., Qian,J, & Ying,Z. A-stratified multistage CAT with b-blocking Applied [J].Psychological Measurement,2001,25:333-341.
    [46]DanielO.Segall, Principles of Multidimensional Adaptive Testing, Computerized Adaptive Testing [J].Theory and Praeitce,2000,53-73.
    [47]Hays RD, Morales LS, Reise SP:Item response theory and health outcomes measurement in the 21st century[J]. Med Care,2000,38(Suppl 9):28-42.
    [48]Reise SP, Waller NG:Item response theory and clinical measurement[J], Annu Rev Clin Psychol,2009,5:25-46.
    [49]Birnbaum A. Some latent trait models and their use in inferring an examince's ability. In:Lord FM,Norvick MR,eds. Statistical theories of mental test scores[J]. Reading,Mass:Addison Wesley,1968:397-479.
    [50]Rasch G.Probabilistic models for some intelligence and attainment tests. Gopcahagen, Denmark:Danmarks Pacdogogiske Institut,1960.
    [51]Haley SM,Mchorney CA, Ware JE, Evaluation of the MOS SF-36 Physical Functioning Scale(PH-10). I:Unidimensinonality and reproducibility of the Rasch Irem Scale[J], J Clin Epidemiol,1994,47:671-684.
    [52]Thomas Uttaro,Anthony Lehman. Graded response modeling of the Quality of Life [J]Interview.Evaluation and Program Planning,1999,22:41-52.
    [53]韩耀风,郝元涛,方积乾.项目反应理论及其在生存质量研究中的应用[J].中国卫生统计,2006,23(6):562-565.
    [54]Vander Linden,W.J. Hambleton, R. K. Handbook of Modern Item Response Theory[J], Springer-Verlag,1996:238-248.
    [55]R. Bock & Murray Aitkin. Marginal maximum likelihood estimation of item parameters:Application of an EM algorithm. Psychometrika, Springer,1981, 46(4):443-459.
    [56]Robert Tsutakawa & Hsin Lin. Bayesian estimation of item response curves. Psychometrika, Springer, vol.1986,51(2):251-267.
    [57]余嘉元.项目反应理论及其应用.南京:江苏教育出版社,1992.7.
    [58]Reise SP, Waller NG:Item response theory and clinical measurement[J]. Annu Rev Clin Psychol,2009,5:25-46.
    [59]Cella D, Chang CH. A discussion of item response theory and its application in health status assessment[J].Med-Care,2000,38(9 suppl):1166-1172.
    [60]De Champlain AF.A primer on classical test theory and item response theory for assessments in medical education[J]. Med Educ.2010 Jan,44(1):109-17.
    [61]敖勇前.概化理论研究综述[J].皖西学院学报,2008,24(2):49-52.
    [62]Luis Prieto MD, & Juan-Ramon Malagelada MD. Classical Test Theory versus Item Response Theory to Shorten the Inflammatory Bowel Disease Questionnaire (IBDQ) [J]. The American Journal of Gastroenterology,2004, 99:2068-2069.
    [63]Hays, RD;& Crall, JJ. Classical Test Theory and Item Response Theory Analyses of Multi-Item Scales Assessing Parents'Perceptions of Their Children's Dental Care[J]. Medical care,2006,44(11 suppl 3).
    [64]杜洪飞.经典测量理论与项目反应理论的比较研究[J].社会心理医学,2006,21(6):655-657.
    [65]McHorney CA. Generic health measurement. Past accomplishments and a measurement paradigm for the 21st century[J]. Ann Intern Med,1997,127:743-750.
    [66]Cella D, Chang CH. A discussion of item response theory and its applications in health status assessment[J]. Med Care,2000,38(9 suppl):II66-II72.
    [67]McHorney CA, Cohen AS. Equating health status measures with item response theory. Illustrations with functional status items[J]. Med Care,2000,38(suppl II):II43-II59.
    [68]陈新林.用于量表项目功能差异分析的累积Logistic混合效应模型的研究和应用.硕士毕业论文,导师方积乾.
    [69]谭文艳.残疾人生存质量量表残疾模块反应尺度研究-5点或3点.博士毕业论文,导师方积乾.
    [70]施得宝,孙步宽,基于项目反应理论的人才测评软件优势与开发[J].人力资源管理,2010,(12):88-89.
    [71]Baker F B. Item Response Theory:parameter estimation techniques[M]. New York:Marcel Dekker,2004.
    [72]Sebille Veronique,Falissard Bruno. Methodological issues regarding power of classical test theory (CTT) and item response theory (IRT)-based approaches for the comparison of patient-reported outcomes in two groups of patients-a simulation study[J]. BMC Medical Research Methodology,2010,10(1): 1186-1191.
    [73]李伟明,丁元,庞晓亮.项目反应理论(IRT)模拟研究中的优良设计和混合效应模型[J].心理学报,1998,21(4):297-230.
    [74]郭庆科,房洁.经典测量理论与项目反应理论的对比研究[J].山东师范大学学报(自然科学版),2000,15(3):264-266.
    [75]何立国,周爱堡.“青少年学生生活满意度量表”的概化理论研究[J].心理科学,2006,29(5):1199-1202.
    [76]陈社育,余嘉元.经典真分数理论与概化理论信度观评析[J].心理学动态,2001,9(3):258-263.
    [77]Cronbach L J, Rajaratnam N, Gleser G C. Theory of Generalizability:A liberalization of reliability theory[J]. British Journal of Statistical Psychology,1963,16:137-163.
    [78]熊江玲.经典测量理论、概化理论及项目反映理论比较研究.求索,2004,4:99-100.
    [79]Linda Crocker, James Algina. Introduction to classical and modern test theory[J], Brace Jovanovich College Publisher,1996.
    [80]Noreen M. Webb, Richard J. Shavelson. Rating of jobs in the United States[J]. Journal of Applied Psychology,1981,66:186-192.
    [81]漆书清,戴海崎,丁树良.现代教育与心理测量学原理[M],江西教育出版社,1998.
    [82]Xiaohong Gao, Richard J. Shavelson, Gail P. Baxter. Generalizability of large-scale performance assessments in science:promise and problem[J]. Applied Measurement in Education,1994,7(4):323-342.
    [83]Suzanne Lane, Mei Liu, Robert D. Ankenmann, Clement A. Stone. Generalizability and validity of a mathematics performance assessment[J]. Journal of Educational Measurement,1996,33(1):77-92.
    [84]孙晓敏,张厚粲.表现性评价中评分者信度估计方法的比较研究-从相关法、百分比法到概化理论[J].心理科学,2005,28(3):646-649.
    [85]Linn, R. L., Burton, E. Performance-based assessment:Implications of task specificity[J]. Educational Measurement:Issues and Practice,1994,11(1):3-9,20.
    [86]Baker, E. L. The role of domain specifications in improving the technical quality of performance assessment (Technical Report) [J], Los Angeles, CA: UCLA, Center for Research on Evaluation, Standards and Student Testing, 1992.
    [87]罗发友,王记志,刘友金.概化理论在教学水平测评中的应用[J].理工高教研究,2002,21(5):98-100.
    [88]扬秀君,苏永华.概化理论在国家公务员面试评分中的应用研究[J].人类工效学,2001, (3):20-23
    [89]王小慧,金瑜.概化理论在《超常儿童心理发展与教育》课程成绩评估中的应用研究[J].中国特殊教育,2004,5:91-94.
    [90]Brennan R L. (Mis)conceptions about generalizability, EM:ip, Spring 2009.
    [91]Ping Yin. A Multivariate Generalizability Analysis of The Multistate Bar Examination[J]. Educational and Psychological Measurement,2005,65(4):668-686.
    [92]Robert L. Brennan, An NCME instructional module on generalizability theory[J], Educational Measurement:Issues and Practice,1992:27-34.
    [93]Dany Laveault, Zumbo B D, Gessaroli M E, Boss M W. Modern Theories of Measurement:Problems and Issues[J]. University of Ottawa, Faculty of Education,1994.
    [94]Shavelson R J, Webb N M. Generalizability Theory:1973-1980[J]. British Journal of Mathematical and Statistical Psychology,1981,34:133-166.
    [95]刘桔.概化理论研究及应用前景[J].心理科学,2003,26(3):433-437.
    [96]陶琼霞.概化理论在人才测评中的应用[J],现代商业,2009(3):104.
    [97]杨志明,张雷.用多元概化理论对普通话测试的研究[J].心理学报,2002,34(1):50-55,34(3):332.
    [98]Brennan, R. L. Elements of generalizability theory (rev. ed.). Iowa City, IA:American College Testing,1992.
    [99]Brennan, R. L., Gao, X. & Colton, D. Generalizability analyses of Work Keys listening and writing tests. Educational and Psychological Measurement,1995,55(2):157-176.
    [100]杨志明,张雷.从多元概化理论看高考综合能力测试的改进.心理学报,2004,36(2):195-200.
    [101]姜洪志.HSK(高等)口试评分时间的概化理论分析.硕士毕业论文,北京师范大学,2005.
    [102]Wickel EE; Welk GJ. Applying generalizability theory to estimate habitual activity levels[J]. Medicine and science in sports and exercise.2010,42(8): 145-156.
    [103]杨志明,张雷.韦氏儿童智力量表能否测量第3因子-WISC-CR的多元概化理论研究[J].心理科学,2003,26(2):305-307.
    [104]唐宁玉.三种测量理论的信度观[J].心理科学.1994,17(1):33-38.
    [105]Shavelson R. J., Webb N. M. Generalizability theory:a primer[J]. California:SAGE Publications, Inc.,1991.
    [106]Shavelson R J, Bolus R. Self-concept:The interplay of theory and methods[J]. Journal of Educational Psychology,1982,74:3-17.
    [107]黄春霞.概化理论及其在HSK测试中的应用[J].云南师范大学学报(对外汉语教学与研究版),2004,2(2):42-46.
    [108]Woods R. & Lovie-Kitchin J. The reliability of visual performance measures in low vision, SuD2-1:246-249.
    [109]罗鸿.我国经典测量理论研究现状评述.安阳师范学院学报[J],2007(5):134-137
    [110]施德宝,孙步宽.基于项目反应理论的人才测评软件优势与开发[J].管理研究,2010(12):88-89
    [1]Bock RD. Estimating item parameters and latent ability when responses are scored in two or more nominal categories[J]. Psychometrika,1972,37:29-31.
    [2]张敏强.20世纪教育测量学发展的回顾与现状评析[J].教育研究,1999,2(11):32-37.
    [3]Cronbach L J, Rajaratnam N, Gleser G C. Theory of Generalizability:A liberalization of reliability theory[J]. British Journal of Statistical Psychology,1963,16:137-163.
    [4]敖勇前.概化理论研究综述[J].皖西学院学报,2008,24(2):49-52.
    [5]Shavelson R. J., Webb N. M. Generalizability theory:a primer[J]. California:SAGE Publications, Inc.,1991.
    [6]陈社育,余嘉元.经典真分数理论与概化理论信度观评析[J].心理学动态,2001,9(3):258-263.
    [7]Shavelson R J, Bolus R. Self-concept:The interplay of theory and methods[J]. Journal of Educational Psychology,1982,74:3-17.
    [8]刘桔.概化理论研究及应用前景[J].心理科学,2003,26(3):433-437.
    [9]陶琼霞.概化理论在人才测评中的应用[J],现代商业,2009(3):104.
    [10]杨志明,张雷.用多元概化理论对普通话测试的研究[J].心理学报,2002,34(1):50-55,34(3):332.
    [11]Brennan, R. L. Elements of generalizability theory (rev. ed.). Iowa City, IA:American College Testing,1992.
    [12]Brennan, R. L., Gao, X. & Colton, D. Generalizability analyses of Work Keys listening and writing tests. Educational and Psychological Measurement,1995,55(2):157-176.
    [13]杨志明,张雷.从多元概化理论看高考综合能力测试的改进.心理学报,2004,36(2):195-200.
    [14]姜洪志.HSK(高等)口试评分时间的概化理论分析.硕士毕业论文,北京师范大学,2005.
    [15]杨志明,张雷.韦氏儿童智力量表能否测量第3因子-WISC-CR的多元概化理论研究[J].心理科学,2003,26(2):305-307.
    [16]张厚粲.《心理与教育统计学》[M].北京:北京师范大学出版社,1988.
    [17]Xiaohong Gao, Richard J. Shavelson, Gail P. Baxter. Generalizability of large-scale performance assessments in science:promise and problem[J]. Applied Measurement in Education,1994,7(4):323-342.
    [18]杨志明.标准参照测验及其等级线信度的概化理论分析.心理学探新. 2003,23(30):52-56.
    [19]Ping Yin. A Multivariate Generalizability Analysis of The Multistate Bar Examination[J]. Educational and Psychological Measurement,2005,65(4):668-686.
    [20]杨志明,张雷.用多元概化理论对普通话测试的研究[J].心理学报,2002,34(1):50-55,34(3):332.
    [21]罗发友,王记志,刘友金.概化理论在教学水平测评中的应用[J].理工高教研究,2002,21(5):98-100.
    [22]Knut-Andreas Christophersen, Sølvi Helseth and Thorleif Lund. A generalizability study of the Norwegian version of KINDLR in a sample of healthy adolescents[J]. Quality of Life Research,2008,17(1):87-93.
    [23]Pierre Valois, Gaston Godin and Richard Bertrand. The reliability of constructs derived from attitude-behavior theories:an application of generalizability theory in the health sector[J].Quality & Quantity.1992,26 (3):291-305.
    [24]何立国,周爱堡.“青少年学生生活满意度量表”的概化理论研究[J].心理科学,2006,29(5):1199-1202.
    [25]Beate P. Winterstein, John T. Willse, Thomas R. Kwapil and Paul J. Silvia. Assessment of Score Dependability of the Wisconsin Schizotypy Scales Using Generalizability Analysis [J]. Journal of Psychopathology and Behavioral Assessment,2010,32(4):575-585.
    [26]Cherdsak Iramaneerat, Rachel Yudkowsky, Carol M. Myford and Steven M. Downing. Quality control of an OSCE using generalizability theory and many-faceted Rasch measurement[J]. Advances in Health Sciences Education, 2008,13(4):479-493.
    [27]Chirayu Auewarakul, Steven M. Downing, Rungnirand Praditsuwan and Uapong Jaturatamrong. Item Analysis to Improve Reliability for an Internal Medicine Undergraduate OSCE[J]. Advances in Health Sciences Education,2005,10(2):105-113.
    [28]Cella D, Chang CH. A discussion of item response theory and its applications in health status assessment[J]. Med Care,2000,38(9 Suppl):II66-II72.
    [29]Coyne K, Lai JS, Matza L, et al. Development of the overactive bladder questionnaire short form (OAB-q SF):A brief measure of symptom bother and health-related quality of life[J]. Quality of Life Research,2004,13(9):1549
    [30]Kosinski M, Bayliss MS, Bjorner JB, et al. A six-item short-form survey for measuring headache impact:the HIT-6[J]. Quality of Life Research.2003, 12(8):963-974
    [31]Jakob B. Bjorner, Mark Kosinski and John E. Ware Jr. Using item response theory to calibrate the Headache Impact Test (HIT™) to the metric of traditional headache scales[J]. Quality of Life Research,2003,12(8):981-1002
    [32]叶懿谆.以Rasch模式分析世界卫生组织生活品质问卷简明版在社区老人的心理计量特质.硕士毕业论文,2006,中国医药大学
    [33]McHorney CA, Cohen AS. Equating health status measures with item response theory:Illustrations with functional status items[J]. Med Care,2000,38(suppl II):II43-II59.
    [34]Haley SM, McHorney CA, Ware JE. Evaluation of the MOS SF-36 Physical Functioning Scale (PF-10). I:Unidimensionality and reproducibility of the Rasch Item Scale[J]. J Clin Epidemiol,1994,47:671-684.
    [35]Dennis L. Hart, Ying-Chih Wang, Paul W. Stratford and Jerome E. Mioduski. Computerized adaptive test for patients with foot or ankle impairments produced valid and responsive measures of function [J]. Quality of Life Research,2007,17 (8):1081-1091
    [36]Morten Aa. Petersen, Mogens Groenvold, Neil K. Aaronson, Wei-Chu Chie, Thierry Conroy, Anna Costantini, Peter Fayers, Jorunn Helbostad, Bernhard Holzner and Stein Kaasa, et al. Development of computerized adaptive testing (CAT) for the EORTC QLQ-C30 physical functioning dimension[J]. Quality of Life Research,2011,20(4):479-490
    [37]杜洪飞.经典测量理论与项目反应理论的比较研究[J].社会心理医学,2006,21(6):655-657.
    [38]黄春霞.概化理论及其在HSK测试中的应用[J].云南师范大学学报(对外汉语教学与研究版),2004,2(2):42-46.
    [39]Shavelson R J, Bolus R. Self-concept:The interplay of theory and methods[J]. Journal of Educational Psychology,1982,74:3-17.
