Construction and Validation of a Cognitive Diagnostic Assessment Model of College English Listening Ability
Abstract
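Advances in psychometric theory have ushered in an era in which learners' knowledge and skills can be measured accurately and effectively in a process-oriented, individualized way. Experts in educational measurement and cognitive psychology are finally able to design test items that specifically report a learner's cognitive strengths and weaknesses in a given subject (Leighton & Gierl, 2007b). Cognitive diagnosis, the core of the new generation of psychological and educational measurement theory, is gradually being discovered and applied in foreign language education, where it has shown great advantages and potential. It can uncover the latent traits and attribute mastery patterns in learners' cognitive processing, which helps in providing targeted remediation and strategy guidance.
     EFL listening, because of the complexity of the attributes it involves, has long been a bottleneck that English teachers and learners try to break through. The traditional disconnect between assessment and teaching, together with coarse-grained assessment, can no longer meet the new demand for individualized, process-oriented evaluation. Providing university students with a fine-grained self-diagnostic model of listening ability, giving them learning-strategy guidance and intervention matched to the diagnosis, and ultimately offering differentiated guidance to students at different levels has become a new goal of modern foreign language education.
     Drawing on EFL listening comprehension theory, language testing theory, and cognitive diagnostic theory, this study uses psychometric methods to construct, analyze, and validate a diagnostic model of English listening, and discusses how to present diagnostic reports and remedial guidance. The specific research questions are: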
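     1. What attributes are involved in EFL listening comprehension, and what are the hierarchical relationships among them?
     2. How can a CDA-based diagnostic test model of EFL listening ability be constructed and validated?
     3. What factors should be considered in presenting online diagnostic reports?
     4. How can learners be guided, based on their diagnostic reports, to remedy problems in their learning?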
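     In line with these questions, the research comprises theoretical exploration, model development and validation, diagnostic reporting, and report-based guidance and intervention; this dissertation focuses on the first two stages. First, item analysis and model validation were carried out on the preliminary diagnostic tests written earlier. The listening attributes and the Q-matrix were then redefined according to cognitive diagnostic theory, and a hypothetic model of the listening cognitive diagnostic test was constructed. To validate this hypothetic model, 534 students first took the test and their responses were collected. Three versions of the hypothetic model were then built from three different parties' understanding of the test: the model obtained when the test developer re-coded the items (H1), the model derived from students' think-aloud results (H2), and the model based on the item coding of seven domain experts (H3). G-DINA analysis was used to compare the three and find the one that best fit the students' responses. The results showed that the seven-expert model (H3) had the best data-model fit. The selected model was then applied to new data for further validation, and H3 remained the best-fitting model. The findings are as follows: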
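     1. The attributes of the EFL listening cognitive diagnostic model are: sound discrimination, intonation, and stress; vocabulary and oral expressions; grammatical structures; capturing details; grasping main ideas; inference from context and cultural background; and memory and note-taking. The first six are linguistic and discourse competences in listening; the seventh is a strategic competence.
     2. In building the model, beyond attribute identification and definition, Q-matrix construction and item writing, cognitive diagnostic analysis and validation, and diagnostic reporting, remedial guidance and intervention is the necessary goal of diagnosis.
     3. An online diagnostic report must take five points into account: orienting the report users, incorporating traditional scores, providing a variety of diagnostic scores, making the analysis as multidimensional and dynamic as possible, and adding descriptive language so that users can fully understand the report.
     4. Guidance for learners must be grounded in listening comprehension and learning theory and must target learners' problems and diagnostic reports. It can combine text, audio, and video in forms such as interactive online question-and-answer, case studies of successful learners, think-aloud presentations of micro-level mental processes, and targeted interviews with international linguists.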
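     This cognitive diagnostic model integrates teaching, learning, and assessment. First, it can assess the attribute mastery patterns, strengths, and weaknesses of groups and individuals in the listening process and, delivered online, can provide immediate diagnostic reports and targeted guidance. Second, teachers can use the diagnostic results to identify the characteristics and needs of individuals and groups and adjust and optimize their teaching accordingly, so that instruction is truly tailored to the learner. More importantly, students can discover their problems through self-diagnosis and obtain effective guidance and intervention, which ultimately helps them move toward autonomous learning, the ultimate goal of modern language education. In this sense, the study may also help open up new directions for future language research.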
Cognitive diagnostic assessment (CDA) has great potential in large-scale testing. With it, researchers in educational measurement and cognitive psychology are finally in a position to design tests targeted specifically at providing valuable information about students' cognitive strengths and weaknesses (Leighton & Gierl, 2007b).
     CDA was not adopted in second language assessment until recent years, and very few applications to EFL listening have been reported. This dissertation attempts to employ cognitive diagnostic approaches (CDAs) to develop a listening diagnostic test model for fine-grained assessment and classification of a learner's knowledge state, that is, the presence or absence of certain attributes. The study also touches on the diagnostic report and the corresponding remedial guidance. The questions addressed are as follows:
     1. What attributes count in EFL listening and how are they related hierarchically?
     2. How can we construct and validate the EFL listening CDA model?
     3. What are the considerations in presenting the online diagnostic reports?
     4. How can we guide the learners based on their diagnostic reports?
     The methodology falls into two phases. Phase I analyzes the preliminary diagnostic test model (PDT), which includes attribute identification, Q-matrix construction, and IRT and CDA analysis. In Phase II, the hypothetic model (H) is proposed on the basis of Phase I and is then validated by comparing three versions of it: the test developer's model (H1), the students' think-aloud protocol (TAP) model (H2), and the judgment model of seven domain experts (H3). The 534 test responses and the three versions of the hypothetic model are analyzed psychometrically with the G-DINA model to obtain absolute and relative model-data fit statistics. The best-fitting version proves to be H3. After these exploratory procedures, the best-fitting model is further verified with new data, and its goodness of fit is confirmed.
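     As a rough illustration of how a Q-matrix links items to attributes, the sketch below uses the simpler DINA model, a constrained special case of G-DINA, rather than the G-DINA estimation used in the study. The Q-matrix, the slip and guess values, and the function names are all hypothetical and are not taken from the dissertation.

```python
from math import log

# Hypothetical Q-matrix: rows are items, columns are attributes.
# A 1 means the item requires that attribute.
Q = [
    [1, 0, 0],
    [0, 1, 0],
    [1, 1, 0],
    [0, 1, 1],
]

# Hypothetical item parameters (not estimated from any real data).
SLIP = [0.1, 0.1, 0.1, 0.1]   # P(wrong answer | all required attributes mastered)
GUESS = [0.2, 0.2, 0.2, 0.2]  # P(right answer | some required attribute missing)


def ideal_response(alpha, q_matrix):
    """Return 1 for each item whose required attributes are all mastered."""
    return [int(all(a >= q for a, q in zip(alpha, row))) for row in q_matrix]


def dina_prob(alpha, q_matrix, slip, guess):
    """DINA probability of a correct answer on each item for profile alpha."""
    eta = ideal_response(alpha, q_matrix)
    return [1.0 - s if e == 1 else g for e, s, g in zip(eta, slip, guess)]


def log_likelihood(responses, alpha, q_matrix, slip, guess):
    """Log-likelihood of a 0/1 response vector given a known profile alpha."""
    probs = dina_prob(alpha, q_matrix, slip, guess)
    return sum(log(p) if x == 1 else log(1.0 - p)
               for x, p in zip(responses, probs))


if __name__ == "__main__":
    learner = [1, 1, 0]  # masters attributes 1 and 2, not attribute 3
    print(ideal_response(learner, Q))
    print(dina_prob(learner, Q, SLIP, GUESS))
```

     A real comparison of Q-matrix versions such as H1, H2, and H3 would estimate item parameters and sum over all attribute profiles before computing fit statistics; this sketch only shows the response model for one known profile.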
     Considerations for diagnostic online reports are also proposed, followed by four kinds of guidance for learners to cope with their diagnosed problems.
     The findings are as follows:
     To answer the first question, seven attributes were identified: phonological level (sound discrimination, stress, and intonation); lexical level (less frequent vocabulary and oral expressions); syntactic level (grammatical structures); facts and details; main ideas; contextual and culture-related inference; and note-taking and working memory. They are mainly independent attributes.
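     Because the seven attributes are described as mainly independent, every combination of mastered and non-mastered attributes is a possible learner profile. The sketch below enumerates those profiles. The short attribute labels and the example prerequisite are hypothetical; the prerequisite is included only to show how a hierarchy among attributes would shrink the set.

```python
from itertools import product

# Hypothetical short labels for the seven attributes named in the abstract.
ATTRIBUTES = [
    "phonological",
    "lexical",
    "syntactic",
    "facts_and_details",
    "main_ideas",
    "inference",
    "note_taking_and_memory",
]


def mastery_patterns(n_attributes, prerequisites=()):
    """List all 0/1 mastery profiles that respect the given prerequisites.

    prerequisites is an iterable of (a, b) index pairs meaning that
    attribute a must be mastered before attribute b can be mastered.
    """
    patterns = []
    for alpha in product((0, 1), repeat=n_attributes):
        if all(alpha[a] >= alpha[b] for a, b in prerequisites):
            patterns.append(alpha)
    return patterns


if __name__ == "__main__":
    independent = mastery_patterns(len(ATTRIBUTES))
    print(len(independent))  # 2 ** 7 = 128 profiles when nothing is constrained

    # Hypothetical constraint: attribute 0 is a prerequisite of attribute 1.
    constrained = mastery_patterns(len(ATTRIBUTES), prerequisites=[(0, 1)])
    print(len(constrained))  # 96 profiles remain
```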
     How should the CDA model be developed? It generally follows the four-step procedure of "attribute identification" → "Q-matrix construction and test item writing" → "psychometric analysis and verification" → "diagnostic reporting". However, the research showed that "remedial guidance" is an appropriate and desirable destination for a diagnostic model. Together, the five components form a system that empowers learners in pursuing autonomy in EFL learning.
     For the online diagnostic report, five considerations were proposed: 1) offering orientation information to diagnostic score users; 2) integrating total scores into the diagnostic report; 3) making diagnostic report files and scores interpretable, for example through graphs, tables, or other interactive forms; 4) offering multidimensional analysis; 5) including narrative interpretations of the unconventional diagnostic scores.
     Guidance is given according to learner problems and diagnostic reports. It can take the form of question-and-answer sessions on listening strategies, cases of successful learners, learners' TAP reports, and audio and video interviews with international linguists.
     In the context of language learning and instruction, the availability of such diagnostic feedback and guidance at a finer grain size allows the instructor to identify a learner's specific deficiencies and to plan instruction tailored to that learner's needs. It also supports learners on their journey to autonomy. In a learner-centered and technology-assisted age of education, this study is especially significant for its potential.
References
Ableeva, R., & Lantolf, J. P. (2011). Mediated dialogue and the microgenesis of second language listening comprehension. Assessment in Education: Principles, Policy & Practice, 18(2), 133-149.
    Afflerbach, P. (2000). Verbal reports and protocol analysis. Handbook of reading research, 3, 163-179.
    Alderson, J. (2005). Diagnosing foreign language proficiency: The interface between learning and assessment: Continuum.
    Alderson, J. (2006). Implementing and evaluating a self-assessment mechanism for the web-based language and style course. Language and Literature, 15(3), 291-306.
    Alderson, J. (2010). "Cognitive diagnosis and Q-matrices in language assessment": A commentary. Language Assessment Quarterly, 7(1), 96-103.
    Alderson, J. C., Clapham, C., & Wall, D. (1995). Language test construction and evaluation: Cambridge University Press.
    Alderson, J. C., & Hamp-Lyons, L. (1996). TOEFL preparation courses: A study of washback. Language Testing, 13(3), 280-297.
    Alderson, J. C., & Huhta, A. (2005). The development of a suite of computer-based diagnostic tests based on the Common European Framework. Language Testing, 22(3), 301-320.
    Alderson, J. C., & Lukmani, Y. (1989). Cognition and reading: Cognitive levels as embodied in test questions. Reading in a Foreign Language, 5(2), 253-270.
    Alves, C. B. (2012). Making diagnostic inferences about student performance on the Alberta education diagnostic mathematics project: An application of the Attribute Hierarchy Method. University of Alberta.
    Anderson, J. R. (1995). Cognitive psychology and its implications: W.H. Freeman.
    Anderson, N. J., & Vandergrift, L. (1996). Increasing metacognitive awareness in the L2 classroom by using think-aloud protocols and other verbal report formats. Language Learning Strategies Around the World: Cross-Cultural Perspectives, 3-18.
    Anderson, R. C., Reynolds, R. E., Schallert, D. L., & Goetz, E. T. (1977). Frameworks for comprehending discourse. American Educational Research Journal, 14(4), 367-381.
    Ashcraft, M. (2006). Cognition (4th ed.): New Jersey: Pearson Education.
    Benson, P. (2007). Autonomy in language teaching and learning. Language Teaching, 40(1), 21-40.
    Berg, C. A. (2000). Intellectual development in adulthood. Handbook of intelligence, 117-137.
    Berne, J. E. (2004). Listening comprehension strategies: A review of the literature. Foreign Language Annals, 37(4), 521-531.
    Birenbaum, M., Nasser, F., & Tatsuoka, C. (2005). Large-scale diagnostic assessment: Mathematics performance in two educational systems. Educational Research and Evaluation, 11(5), 487-507.
    Bloom, B. S. (1956). Taxonomy of educational objectives. Handbook I: Cognitive domain. New York: Longman.
    Bloom, B. S., & Krathwohl, D. R. (1956). Taxonomy of educational objectives: Cognitive domain: Longmans, Green.
    Bryant, R. A., Harvey, A. G., Dang, S. T., Sackville, T., & Basten, C. (1998). Treatment of acute stress disorder: A comparison of cognitive-behavioral therapy and supportive counseling. Journal of Consulting and Clinical Psychology, 66(5), 862.
    Buck, G. (1991). The testing of listening comprehension: An introspective study. Language Testing, 8(1), 67-91.
    Buck, G. (1994). The appropriacy of psychometric measurement models for testing second language listening comprehension. Language Testing, 11(2), 145-170.
    Buck, G. (2001). Assessing listening: Cambridge University Press.
    Buck, G. (2003). Assessing listening: Cambridge University Press (second version).
    Buck, G., & Tatsuoka, K. (1998). Application of the rule-space procedure to language testing: Examining attributes of a free response listening test. Language Testing, 15(2), 119-157.
    Buck, G., Tatsuoka, K., & Kostin, I. (1997). The subskills of reading: Rule-space analysis of a multiple-choice test of second language reading comprehension. Language Learning, 47(3), 423-466.
    Burger, S., & Doherty, J. (1992). Testing receptive skills within a comprehension-based approach. In Courchene, R. J., J. I. Gidden, J. St. John & C. Therien (Eds.), Comprehension-based second language teaching (pp. 299-318). Ottawa: University of Ottawa Press.
    Carrell, P. L., & Eisterhold, J. C. (1983). Schema theory and ESL reading pedagogy. TESOL Quarterly, 17(4), 553-573.
    Castello, E. (2008). Text complexity and reading comprehension tests: Peter Lang Publishing, Incorporated.
    Chamot, A. U. (2005). Language learning strategy instruction: Current issues and research. Annual Review of Applied Linguistics, 25(1), 112-130.
    Chamot, A. U., & O'Malley, J. M. (1996). The cognitive academic language learning approach: A model for linguistically diverse classrooms. The Elementary School Journal, 259-273.
    Chang, A. C. S., & Read, J. (2006). The effects of listening support on the listening performance of EFL learners. TESOL Quarterly, 40(2), 375-397.
    Chang, A. C. S., & Read, J. (2007). Support for foreign language listeners: Its effectiveness and limitations. RELC Journal, 38(3), 375-394.
    Chen, J., & De La Torre, J. (2013a, Feb. 1). [G-DINA estimation code program]. Personal contact.
    Chen, J., & De La Torre, J. (2013b). A general cognitive diagnosis model for expert-defined polytomous attributes. Applied Psychological Measurement.
    Chen, J., De La Torre, J., & Zhang, Z. (2013). (in press) Relative and absolute fit evaluation in cognitive diagnosis modeling. Journal of Educational Measurement.
    Chen, Y.-H. (2012). Cognitive diagnosis of mathematics performance between rural and urban students in Taiwan. Assessment in Education: Principles, Policy & Practice, 19(2), 193-209.
    Chiang, C. S., & Dunkel, P. (1992). The effect of speech modification, prior knowledge, and listening proficiency on EFL lecture learning. TESOL Quarterly, 26(2), 345.
    Ching-Shyang Chang, A. (2007). The impact of vocabulary preparation on L2 listening comprehension, confidence and strategy use. System, 35(4), 534-550.
    Ciekanski, M. (2007). Fostering learner autonomy: Power and reciprocity in the relationship between language learner and language learning adviser. Cambridge Journal of Education, 37(1), 111-127.
    Clement, J. (2007). The impact of teaching explicit listening strategies to adult intermediate- and advanced-level ESL university students. (Ph.D.), Duquesne University.
    Close, C. N. (2012). An exploratory technique for finding the Q-matrix for the DINA model in cognitive diagnostic assessment: Combining theory with data. University of Minnesota.
    Cohen, J., & Cohen, P. (1983). Applied multiple regression/correlation analysis for the behavioral sciences: Correlation analysis for the behavioral sciences: Lawrence Erlbaum.
    Crocker, L., & Algina, J. (1986). Introduction to classical and modern test theory: ERIC.
    Cui, Y. (2007). The hierarchy consistency index: A person-fit statistic for the Attribute Hierarchy Method: University of Alberta.
    Cui, Y., Gierl, M. J., & Chang, H. H. (2012). Estimating classification consistency and accuracy for cognitive diagnostic assessment. Journal of Educational Measurement, 49(1), 19-38.
    Cui, Y., & Leighton, J. P. (2009). The hierarchy consistency index: Evaluating person fit for cognitive diagnostic assessment. Journal of Educational Measurement, 46(4), 429-449.
    Cui, Y., Leighton, J. P., Gierl, M. J., & Hunka, S. (2006). A person-fit statistic for the attribute hierarchy method: The hierarchy consistency index. Paper presented at the Annual Meeting of the National Council on Measurement in Education, San Francisco, CA.
    Das, J. P., Kirby, J., & Jarman, R. F. (1975). Simultaneous and successive synthesis: An alternative model for cognitive abilities.
    Daugherity. (2008). A correlative study of the International English Language Testing System listening test and a new repetition test. Capella University.
    De La Torre, J. (2008a). An empirically based method of Q-matrix validation for the DINA model: Development and applications. Journal of Educational Measurement, 45(4), 343-362.
    De La Torre, J. (2008b). The generalized DINA model. Paper presented at the international meeting of the Psychometric Society, Durham, NH.
    De La Torre, J. (2009). DINA model and parameter estimation: A didactic. Journal of Educational and Behavioral Statistics, 34(1), 115-130.
    De La Torre, J. (2011). The generalized DINA model framework. Psychometrika, 76(2), 179-199.
    De La Torre, J., & Douglas, J. A. (2004). Higher-order latent trait models for cognitive diagnosis. Psychometrika, 69(3), 333-353.
    DeCarlo, L. T. (2011). On the analysis of fraction subtraction data: The DINA model, classification, latent class sizes, and the Q-matrix. Applied Psychological Measurement, 35(1), 8-26.
    Derry, S. J., & Murphy, D. A. (1986). Designing systems that train learning ability: From theory to practice. Review of Educational Research, 56(1), 1-39.
    Dibello, L. V., Roussos, L. A., & Stout, W. (2007). Review of cognitively diagnostic assessment and a summary of psychometric models. Handbook of Statistics Psychometrics, 26, 979-1030.
    Dobson, K. S., & Shaw, B. F. (1986). Cognitive assessment with major depressive disorders. Cognitive Therapy and Research, 10(1), 13-29.
    Doornik, J. A. (1994-2011). Ox console (Version 6.21). Retrieved from http://www.doornik.com/download_oxcons.html
    Douglas, J., De La Torre, J., Chang, H., Henson, R., & Templin, J. (2006, April). Skills diagnosis with latent variable models. Symposium presented at the annual meeting of the National Council on Measurement in Education, San Francisco, CA.
    Ehrman, M. E., Leaver, B. L., & Oxford, R. L. (2003). A brief overview of individual differences in second language learning. System, 31(3), 313-330.
    Embretson, S. E. (1998). A cognitive design system approach to generating valid tests: Application to abstract reasoning. Psychological Methods, 3(3), 380.
    Fall, E. (2009). Applications of exploratory Q-matrix discovery procedures in diagnostic classification models. (Master of Arts), University of Kansas.
    Field, J. (1998). Skills and strategies: Towards a new methodology for listening. ELT Journal, 52(2), 110-118.
    Field, J. (2000). Finding one's way in the fog: Listening strategies and second-language learners. Modern English Teacher, 9(1), 29-34.
    Field, J. (2002). The changing face of listening. Methodology in language teaching: An anthology of current practice, 242-247.
    Field, J. (2008). Listening in the language classroom: Cambridge University Press.
    Flowerdew, J., & Miller, L. (2005). Second language listening: Theory and practice: Cambridge University Press.
    Francis, D. J., Snow, C. E., August, D., Carlson, C. D., Miller, J., & Iglesias, A. (2006). Measures of reading comprehension: A latent variable analysis of the diagnostic assessment of reading comprehension. Scientific Studies of Reading, 10(3), 301-322.
    Freedle, R., & Kostin, I. (1999). Does the text matter in a multiple-choice test of comprehension? The case for the construct validity of TOEFL's minitalks. Language Testing, 16(1), 2-32.
    Fu, J., & Li, Y. (2007). Cognitively diagnostic psychometric models: An integrative review. Paper presented at the Annual Meeting of the National Council on Measurement in Education, Chicago, IL.
    Gao, L. (2007). Cognitive psychometric modeling of the MELAB reading items. (PhD), University of Alberta.
    Gao, L., & Rogers, T. (2007). Cognitive psychometric modeling of the MELAB reading items. Paper presented at National Council on Measurement in Education.
    Gierl, M., Leighton, J. P., & Hunka, S. M. (2007). Using the attribute hierarchy method to make diagnostic inferences about examinees. In Leighton, Jacqueline P. & Mark J. Gierl (Eds.), Cognitive diagnostic assessment for education: Theory and application (pp. 237-274): Cambridge University Press.
    Gierl, M. J. (2007). Making diagnostic inferences about cognitive attributes using the rule-space model and attribute hierarchy method. Journal of Educational Measurement, 44(4), 325-340.
    Gierl, M. J., Alves, C., & Majeau, R. T. (2010). Using the attribute hierarchy method to make diagnostic inferences about examinees' knowledge and skills in mathematics: An operational implementation of cognitive diagnostic assessment. International Journal of Testing, 10(4), 318-341.
    Gierl, M. J., Cui, Y., & Zhou, J. (2009). Reliability and attribute-based scoring in cognitive diagnostic assessment. Journal of Educational Measurement, 46(3), 293-313.
    Gierl, M. J., & Leighton, J. P. (2007). Directions for future research in cognitive diagnostic assessment. In Leighton, J. P. & Mark J. Gierl (Eds.), Cognitive diagnostic assessment for education: Theory and applications (pp. 341-351): Cambridge University Press.
    Gierl, M. J., Leighton, J. P., & Hunka, S. M. (2000). An NCME instructional module on exploring the logic of Tatsuoka's rule-space model for test development and analysis. Educational Measurement: Issues and Practice, 19(3), 34-44.
    Gierl, M. J., Leighton, J. P., Wang, C., Zhou, J., Gokiert, R., & Tan, A. (2009). Validating cognitive models of task performance in algebra on the SAT. New York: The College Board.
    Gierl, M. J., Roberts, M., Alves, C., & Gotzmann, A. (2009). Paper presented at the symposium "How to build a cognitive model for educational assessments".
    Gierl, M. J., Wang, C., & Zhou, J. (2008). Using the attribute hierarchy method to make diagnostic inferences about examinees' cognitive skills in algebra on the SAT. The Journal of Technology, Learning and Assessment, 6(6).
    Gierl, M. J., Zheng, Y., & Cui, Y. (2008). Using the attribute hierarchy method to identify and interpret cognitive skills that produce group differences. Journal of Educational Measurement, 45(1), 65-89.
    Gierl, M. J., & Zhou, J. (2008). Computer adaptive-attribute testing: A new approach to cognitive diagnostic assessment. Zeitschrift für Psychologie/Journal of Psychology, 216(1), 29-39.
    Gilakjani, A. P., & Ahmadi, M. R. (2011). A study of factors affecting EFL learners' English listening comprehension and the strategies for improvement. Journal of Language Teaching and Research, 2(5).
    Goh, C. (2000). A cognitive perspective on language learners' listening comprehension problems. System, 28(1), 55-75.
    Goh, C. C. M. (2002). Teaching listening in the language classroom: SEAMEO Regional Language Centre.
    Goodman, D. P., & Hambleton, R. K. (2004). Student test score reports and interpretive guides: Review of current practices and suggestions for future research. Applied Measurement in Education, 17(2), 145-220.
    Graesser, A. C., Millis, K. K., & Zwaan, R. A. (1997). Discourse comprehension. Annual Review of Psychology, 48(1), 163-189.
    Graham, S. (2006). Listening comprehension: The learners' perspective. System, 34(2), 165-182.
    Guerrero, A. (2001). Cognitively diagnostic perspectives on English and Spanish versions of a test of mathematics aptitude. Columbia University.
    Hambleton, R. K. (1991). Fundamentals of Item Response Theory (Vol. 2): Sage Publications, Incorporated.
    Hambleton, R. K., & Slater, S. C. (1997). Reliability of credentialing examinations and the impact of scoring models and standard-setting policies. Applied Measurement in Education, 10(1), 19-28.
    Hartz, S. (2002). A Bayesian framework for the unified model for assessing cognitive abilities: Blending theory with practicality. Unpublished doctoral dissertation (PhD), University of Illinois at Urbana-Champaign.
    Hasan, A. S. (2000). Learners' perceptions of listening comprehension problems. Language Culture and Curriculum, 13(2), 137-153.
    Heaton, J. B. (2000). Writing English Language Tests. Beijing: Foreign Language Teaching and Research Press.
    Henson, R., & Douglas, J. (2005). Test construction for cognitive diagnosis. Applied Psychological Measurement, 29(4), 262-277.
    Henson, R. A., Templin, J. L., & Willse, J. T. (2009). Defining a family of cognitive diagnosis models using log-linear models with latent variables. Psychometrika, 74(2), 191-210.
    Huebner, A. (2010). An overview of recent developments in cognitive diagnostic computer adaptive assessments. Practical Assessment, Research & Evaluation, 15(3), n3.
    Huff, K., & Goodman, D. P. (2007). The demand for cognitive diagnostic assessment.
    Hughes, A. (1989). Testing for Language Teachers. Cambridge: Cambridge University Press.
    Huhta, A. (2008). Diagnostic and formative assessment. In Spolsky, Bernard & Francis M. Hult (Eds.), The handbook of educational linguistics: John Wiley & Sons.
    Hung, H. C. M. (2009). Split attention in reading comprehension for ESL/EFL: Reducing extraneous cognitive load by using integrated format in reading comprehension: VDM Publishing.
    Im, S., & Park, H. J. (2010). A comparison of US and Korean students' mathematics skills using a cognitive diagnostic testing method: Linkage to instruction. Educational Research and Evaluation, 16(3), 287-301.
    Imhof, M. (2010). What is going on in the mind of the listener? The cognitive psychology of listening. In Wolvin, A. (Ed.), Listening and human communication in the 21st century (pp. 97): Blackwell Publishing Ltd.
    Jaeger, R. M. (1998). Reporting the results of the National Assessment of Educational Progress: American Institutes for Research.
    Jang, E. E. (2005). A validity narrative: Effects of reading skills diagnosis on teaching and learning in the context of NG TOEFL: University of Illinois at Urbana-Champaign.
    Jang, E. E. (2008). A framework for cognitive diagnostic assessment. Towards adaptive CALL: Natural language processing for diagnostic language assessment, 117-131.
    Jang, E. E. (2009a). Cognitive diagnostic assessment of L2 reading comprehension ability: Validity arguments for Fusion Model application to LanguEdge assessment. Language Testing, 26(1), 031-073.
    Jang, E. E. (2009b). Demystifying a Q-matrix for making diagnostic inferences about L2 reading skills. Language Assessment Quarterly, 6(3), 210-238.
    Joiner, E. (1997). Teaching listening: How technology can help. In Bush, M. & R. Terry (Eds.), Technology-enhanced language learning (pp. 77-120). Lincolnwood, IL: National Textbook Company.
    Jung, E. H. (2003). The role of discourse signaling cues in second language listening comprehension. The Modern Language Journal, 87(4), 562-577.
    Junker, B. W., & Sijtsma, K. (2001). Cognitive assessment models with few assumptions, and connections with nonparametric Item Response Theory. Applied Psychological Measurement, 25(3), 258-272.
    Ketterlin-Geller, L. R., & Yovanoff, P. (2009). Diagnostic assessments in mathematics to support instructional decision making. Practical Assessment, Research, & Evaluation, 14(16), 1-11.
    Kintsch, W., & Van Dijk, T. A. (1983). Strategies of discourse comprehension. New York: Academic Press.
    Kobeleva, P. P. (2012). Second language listening and unfamiliar proper names: Comprehension barrier? RELC Journal, 43(1), 83-98.
    Lantolf, J. P., & Poehner, M. E. Computerized dynamic assessment of language proficiency in French, Russian and Chinese. From http://language.la.psu.edu/pages/projects
    Lantolf, J. P., & Poehner, M. E. (2008). Dynamic assessment. Encyclopedia of language and education, 7, 273-284.
    Lee, Y.-S., De La Torre, J., & Park, Y. S. (2011). Relationships between cognitive diagnosis, CTT, and IRT indices: An empirical investigation. Asia Pacific Education Review, 13(2), 333-345.
    Lee, Y.-W., & Sawaki, Y. (2009a). Application of three cognitive diagnosis models to ESL reading and listening assessments. Language Assessment Quarterly, 6(3), 239-263.
    Lee, Y.-W., & Sawaki, Y. (2009b). Cognitive diagnosis and Q-matrices in language assessment. Language Assessment Quarterly, 6(3), 169-171.
    Lee, Y.-W., & Sawaki, Y. (2009c). Cognitive diagnosis approaches to language assessment: An overview. Language Assessment Quarterly, 6(3), 172-189.
    Leighton, J., & Gierl, M. (2007a). Why cognitive diagnostic assessment? In Leighton, J. & M. Gierl (Eds.), Cognitive diagnostic assessment for education. New York: Cambridge University Press.
    Leighton, J. P., Cui, Y., & Cor, M. K. (2009). Testing expert-based and student-based cognitive models: An application of the attribute hierarchy method and hierarchy consistency index. Applied Measurement in Education, 22(3), 229-254.
    Leighton, J. P., & Gierl, M. J. (2007b). Cognitive diagnostic assessment for education: Theory and applications: Cambridge University Press.
    Leighton, J. P., & Gierl, M. J. (2007c). Defining and evaluating models of cognition used in educational measurement to make inferences about examinees' thinking processes. Educational Measurement: Issues and Practice, 26(2), 3-16.
    Leighton, J. P., Gierl, M. J., & Hunka, S. M. (2004). The attribute hierarchy method for cognitive assessment: A variation on Tatsuoka's rule-space approach. Journal of Educational Measurement, 41(3), 205-237.
    Leighton, J. P., Gokiert, R. J., & Cui, Y. (2007). Using exploratory and confirmatory methods to identify the cognitive dimensions in a large-scale science assessment. International Journal of Testing, 7(2), 141-189.
    Liao, Y. F. (2009). A construct validation study of the GEPT reading and listening sections: Re-examining the models of L2 reading and listening abilities and their relations to lexico-grammatical knowledge. Teachers College, Columbia University.
    Liu, H., & Hu, X. Q. (2008). An investigation into listening comprehension difficulties of more skilled and less skilled listener. Chinese EFL Journal, 1(1).
    Lohman, D. F. (2000). Complex information processing and intelligence.
    Lonigan, C. J., Allan, N. P., & Lerner, M. D. (2011). Assessment of preschool early literacy skills: Linking children's educational needs with empirically supported instructional activities. Psychology in the Schools, 48(5), 488-501.
    Lynch, T. (1998). Theoretical perspectives on listening. Annual Review of Applied Linguistics, 18(1), 3-19.
    Lynch, T. (2002). Listening: Questions of level. In Kaplan, R. B. (Ed.), Oxford Handbook of Applied Linguistics. Oxford: Oxford University Press.
    Lynch, T. (2011). Academic listening in the 21st century: Reviewing a decade of research. Journal of English for Academic Purposes, 10(2), 79-88.
    Maris, E. (1999). Estimating multiple classification latent class models. Psychometrika, 64(2), 187-212.
    Matson, J. L., Rush, K. S., Hamilton, M., Anderson, S. J., Bamburg, J. W., Baglio, C. S., Williams, D., & Kirkpatrick-Sanchez, S. (1999). Characteristics of depression as assessed by the Diagnostic Assessment for the Severely Handicapped-II (DASH-II). Research in Developmental Disabilities, 20(4), 305-313.
    Matthewson, G. (1985). Toward a comprehensive model of affect in the reading process. In H. Singer & R. B. Ruddell (Eds.), Theoretical models and processes of reading (3rd ed., pp. 841-856). Newark, Delaware: International Reading Association.
    Mayer, R. E. (1999). Designing instruction for constructivist learning. Instructional-design theories and models: A new paradigm of instructional theory, 2, 141-159.
    Mayer, R. E. (2003). Learning and instruction: Merrill.
    Mislevy, R. J. (1996). Test theory reconceived. Journal of Educational Measurement, 33(4), 379-416.
    Morley, J. (1999). Current perspectives on improving aural comprehension. ESL Magazine, 2(1), 16-19.
    Nakatani, Y. (2005). The effects of awareness-raising training on oral communication strategy use. The Modern Language Journal, 89(1), 76-91.
    Nichols, P. D., Chipman, S. F., & Brennan, R. L. (1995). Cognitively Diagnostic Assessment: Lawrence Erlbaum.
    Nicol, D. J., & Macfarlane-Dick, D. (2006). Formative assessment and self-regulated learning: A model and seven principles of good feedback practice. Studies in Higher Education, 31(2), 199-218.
    Nielsen, J. (2000). Designing web usability: The practice of simplicity. Indianapolis: New Riders Publishing.
    Nissan, S. (1996). An analysis of factors affecting the difficulty of dialogue items in TOEFL listening comprehension. TOEFL Research Reports, 51.
    Nunan, D. (1997). Designing and adapting materials to encourage learner autonomy. In Benson, P. & P. Voller (Eds.), Autonomy and independence in language learning (pp. 192-203). New York: Addison Wesley.
    O'Malley, J. M., & Chamot, A. U. (1990). Learning strategies in second language acquisition: Cambridge University Press.
    O'Malley, J. M., & Chamot, A. U. (2001). Learning strategies in second language acquisition. Shanghai: Shanghai Foreign Language Education Press.
    O'Malley, J. M., Chamot, A. U., & Küpper, L. (1989). Listening comprehension strategies in second language acquisition. Applied Linguistics, 10(4), 418-437.
    O'Callaghan, R., Morley, M., & Schwartz, A. (2004). Developing skill categories for the SAT math section. Paper presented at the annual meeting of the National Council on Measurement in Education, San Diego, CA.
    Oller, J. W., Jr. (1979). Language tests at school. London: Longman.
    Osterlind, S. J. (2006). Modern measurement: Allyn & Bacon/Pearson.
    Oxford, R. L. (1990). Language learning strategies: What every teacher should know: Heinle & Heinle, Boston.
    Oxford, R. L. (1993). Research update on teaching L2 listening. System, 21(2), 205-211.
    Park, G.-P. (2004). Comparison of L2 listening and reading comprehension by university students learning English in Korea. Foreign Language Annals, 37(3), 448-458.
    Petersen, R. C., & Negash, S. (2008). Mild cognitive impairment: An overview. CNS Spectrums, 13(1), 45.
    Peterson, P. W. (2001). Skills and strategies for proficient listening. In Celce-Murcia, M. (Ed.), Teaching English as a second or foreign language (pp. 87-100). Boston: Heinle & Heinle.
    Pishghadam, R., Barabadi, E., & Kamrood, A. M. (2011). The differing effect of computerized dynamic assessment of L2 reading comprehension on high and low achievers. Journal of Language Teaching and Research, 2(6), 1353-1358.
    Pressley, M., & Afflerbach, P. (1995). Verbal protocols of reading: The nature of constructively responsive reading: Routledge.
    Richards, J. C. (1983). Listening comprehension: Approach, design, procedure. TESOL Quarterly, 17(2), 219-240.
    Richards, J. C. (2005). Second thoughts on teaching listening. RELC Journal, 36(1), 85-92.
    Richards, J. C., & Schmidt, R. W. (2002). Dictionary of language teaching & applied linguistics: Longman.
    Riconscente, M. M., Mislevy, R. J., & Hamel, L. (2005). An introduction to PADI task templates. PADI Technical Report, 3.
    Roberts, M. (2012). Developing and evaluating score reports for cognitive diagnostic assessment. (PhD), University of Alberta, Edmonton, Alberta.
    Roberts, M., & Gierl, M. J. (2010). Developing score reports for cognitive diagnostic assessments. Educational Measurement: Issues and Practice, 29(3), 25-38.
    Roberts, M. R., Alves, C. B., Chu, M.-W., Thompson, M., Bahry, L. M., & Gotzmann, A. (2012). Testing expert-based vs. student-based cognitive models for a grade 3 diagnostic mathematics assessment.
    Rogers, C. (2002). Tradition and technology in language teaching. Dimension, 2002, 17-32.
    Ronning, R. R. (1987). The influence of cognitive psychology on testing. Buros-Nebraska Symposium on Measurement and Testing (3rd, Lincoln, Nebraska, 1985). Volume 3. ERIC.
    Rost, M. (2002). Teaching and researching listening. Harlow, UK: Longman.
    Rost, M. (2005). L2 listening. In Handbook of research in second language teaching and learning (pp. 503-528).
    Roussos, L. A., DiBello, L. V., Stout, W., Hartz, S. M., Henson, R. A., & Templin, J. (2007). The fusion model skills diagnosis system. Cognitive diagnostic assessment for education: Theory and applications, 275-318.
    Rubin, J. (1994). A review of second language listening comprehension research. The Modern Language Journal, 78(2), 199-221.
    Rumelhart, D. E. (1975). Notes on a schema for stories. In Bobrow, D. G., & Collins, A. (Eds.), Representation and understanding: Studies in cognitive science (pp. 211-236). New York: Academic Press.
    Rupp, A., Templin, J., & Henson, R. (2010). Diagnostic measurement: Theory, methods, and applications. Guilford Press.
    Rupp, A. A. (2007). The answer is in the question: A guide for describing and investigating the conceptual foundations and statistical properties of cognitive psychometric models. International Journal of Testing, 7(2), 95-125.
    Rupp, A. A., & Templin, J. (2008). The effects of Q-matrix misspecification on parameter estimates and classification accuracy in the DINA model. Educational and Psychological Measurement, 68(1), 78-96.
    Ryan, J. M. (2003). An analysis of item mapping and test reporting strategies. Retrieved August 14, 2007.
    Schmidt, R. W. (1990). The role of consciousness in second language learning. Applied Linguistics, 11(2), 129-158.
    Sherman, J. (1997). The effect of question preview in listening comprehension tests. Language Testing, 14(2), 185-213.
    Shohamy, E., & Inbar, O. (1991). Validation of listening comprehension tests: The effect of text and question type. Language Testing, 8(1), 23-40.
    Sinharay, S., & Almond, R. G. (2007). Assessing fit of cognitive diagnostic models: A case study. Educational and Psychological Measurement, 67(2), 239-257.
    Spolsky, B. (1992). Diagnostic testing revisited. Language assessment and feedback: Testing and other strategies, 29-39.
    Summers, R. (2008). Dynamic assessment: Towards a model of dialogic engagement. University of South Florida.
    Tatsuoka, K. K. (1983). Rule space: An approach for dealing with misconceptions based on item response theory. Journal of Educational Measurement, 20(4), 345-354.
    Tatsuoka, K. K., Linn, R. L., Tatsuoka, M. M., & Yamamoto, K. (1988). Differential item functioning resulting from the use of different solution strategies. Journal of Educational Measurement, 25(4), 301-319.
    Templin, J. L., & Henson, R. A. (2006). Measurement of psychological disorders using cognitive diagnosis models. Psychological Methods, 11(3), 287.
    Thompson, I. (1995). Assessment of second/foreign language listening comprehension. A guide for the teaching of second language listening, 31-58.
    Treagust, D. F., Glynn, S., & Duit, R. (1995). Diagnostic assessment of students' science knowledge. Learning science in the schools: Research reforming practice, 1, 327-436.
    Tsui, A. B. M., & Fullilove, J. (1998). Bottom-up or top-down processing as a discriminator of L2 listening performance. Applied Linguistics, 19, 432-451.
    Urquhart, A. H., & Weir, C. J. (1998). Reading in a second language: Process, product, and practice. Longman Publishing Group.
    Vandergrift, L. (1999). Facilitating second language listening comprehension: Acquiring successful strategies. ELT Journal, 53(3), 168-176.
    Vandergrift, L. (2003a). From prediction through reflection: Guiding students through the process of L2 listening. Canadian Modern Language Review/La Revue canadienne des langues vivantes, 59(3), 425-440.
    Vandergrift, L. (2003b). Orchestrating strategy use: Toward a model of the skilled second language listener. Language Learning, 53(3), 463-496.
    Vandergrift, L. (2004). Listening to learn or learning to listen? Annual Review of Applied Linguistics, 24(1), 3-25.
    Vandergrift, L. (2006). Second language listening: Listening ability or language proficiency? The Modern Language Journal, 90(1), 6-18.
    Vandergrift, L. (2007). Recent developments in second and foreign language listening comprehension research. Language Teaching, 40(3), 191.
    Wang, C., Chang, H.-H., & Douglas, J. (2012). Combining CAT with cognitive diagnosis: A weighted item selection approach. Behavior Research Methods, 44(1), 95-109.
    Wang, C., & Gierl, M. J. (2007). Investigating the cognitive attributes underlying student performance on the SAT critical reading subtest: An application of the attribute hierarchy method. Paper presented at the annual meeting of the National Council on Measurement in Education.
    Wang, C., & Gierl, M. J. (2011). Using the attribute hierarchy method to make diagnostic inferences about examinees' cognitive skills in critical reading. Journal of Educational Measurement, 48(2), 165-187.
    Wang, C., Gierl, M. J., & Leighton, J. P. (2006). Investigating the cognitive attributes underlying student performance on a foreign language reading test: An application of the attribute hierarchy method. Paper presented at the graduate student poster session at the 2006 annual meeting of the National Council on Measurement in Education, San Francisco, California.
    Weir, C. J. (1990). Communicative language testing. New York: Prentice Hall.
    Weir, C. J. (1993). Understanding & developing language tests. Prentice Hall Books.
    Wolvin, A. D. (2010). Introduction: Perspectives on listening in the 21st century. In Wolvin, A. D. (Ed.), Listening and human communication in the 21st century (pp. 1-29). Blackwell Publishing Ltd.
    Wu, Y. A. (1998). What do tests of listening comprehension test? A retrospection study of EFL test-takers performing a multiple-choice task. Language Testing, 15(1), 21-44.
    Yang, X., & Embretson, S. E. (2007). Construct validity and cognitive diagnostic assessment. In J. Leighton & M. Gierl (Eds.), Cognitive diagnostic assessment for education: Theory and applications (pp. 119-145). New York: Cambridge University Press.
    Zhou, J. (2010). Estimating attribute-based reliability in cognitive diagnostic assessment. University of Alberta.
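    Cai, Y., Tu, D., & Ding, S. (2010). Theory and methods of cognitive diagnostic test construction. Examinations Research, (3), 79-92.
    Cai, Y. (2010). Group-level assessment and cognitive diagnosis of English reading problem-solving ability. (PhD), Jiangxi Normal University.
    Cao, H., & Liu, J. (2009). A study of cognitive diagnostic classification based on the AHM. Science Technology and Engineering, (10), 2790-2792.
    Chen, Y. (2005). A study of communicative English listening testing and communicative English listening teaching. (Master's), Fujian Normal University.
    Dai, H., & Song, Y. (2008). An analysis of examinees' behavior on an analogical reasoning test based on cognitive diagnosis. Psychological Science, 31(4), 971-973.
    Ding, S., Mao, M., Wang, W., Luo, F., & Cui, Y. (2012). Evaluating the consistency between educational cognitive diagnostic tests and cognitive models. Acta Psychologica Sinica, 44(6), 1535-1154.
    Du, J. (1999). Diagnostic testing in foreign language teaching. Foreign Language Teaching and Research, (4), 40-43.
    Duan, J. (2009). A tentative study on the design of an online diagnostic English listening test. (Master's), Northwest University.
    Gan, Y., & Yu, J. (2009). New developments in psychometric theory: Latent class models. China Examinations (Research Edition), (3), 3-8.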
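    Department of Higher Education (2007). College English curriculum requirements. Shanghai: Shanghai Foreign Language Education Press.
    Guo, C. (2003). The think-aloud method. Beijing: Foreign Language Teaching and Research Press.
    He, M. (2007). On English listening strategies: Theory and practice. Chongqing: Southwest Normal University Press.
    He, Q. (2009). A cognitive diagnostic study of mathematics difficulties in primary school students. (Master's), Southwest University.
    Li, M. (2012). Cognitive diagnostic assessment of a junior secondary school English reading test and its influencing factors. (Master's), Beijing Normal University.
    Li, F., Yu, N., & Xin, T. (2009). Constructing a diagnostic mathematics test for primary school grades 4 and 5: An approach based on the rule space model. Psychological Development and Education, 25(3), 113-118.
    Liu, J., & Dai, H. (2003). Construction and study of a diagnostic mathematics test for primary school grade 4. Psychological Exploration, (3), 57-59.
    Luo, H. (2009). A study of attribute weights in cognitive diagnosis: The case of the polytomously scored AHM. (Master's), Jiangxi Normal University.
    Luo, H., Ding, S., Wang, W., Yu, X., & Cao, H. (2010). A polytomously scored attribute hierarchy method with unequally weighted attributes. Acta Psychologica Sinica, (4), 528-538.
    Luo, L. (2010). A study of strategies for automatically labeling item attributes in cognitive diagnosis. (Master's), Jiangxi Normal University.
    Luo, Z. (2012). Foundations of item response theory. Beijing: Beijing Normal University Press.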
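    Ma, X., & Meng, Y. (2008). Development of a personalized English learning strategy diagnosis and guidance system based on learner-controllable factors. Journal of PLA University of Foreign Languages, 31(4), 38-42.
    Ma, X., Meng, Y., et al. (2008). An outline of the empirical research and system architecture of the "Personalized English Learning Diagnosis and Guidance System". Foreign Language Teaching and Research, 40(3), 184-187.
    Ma, X., Meng, Y., He, H., & Liu, R. (2012). Construction of a personalized English audiovisual self-diagnosis and guidance model and development of the system. Foreign Language Education, (5), 59-63.
    Nie, B. (2011). Completeness and precision of attribute structures in cognitive diagnosis. (Master's), Jiangxi Normal University.
    Qi, S., Dai, H., & Ding, S. (2002). Principles of modern educational and psychological measurement. Beijing: Higher Education Press.
    Sun, J., Zhang, S., Xin, T., & Bao, Y. (2011). A cognitive diagnosis method based on the Q-matrix and generalized distance. Acta Psychologica Sinica, (9), 1095-1102.
    Tu, D., Cai, Y., Dai, H., & Ding, S. (2010). A polytomously scored cognitive diagnosis model: Development of the P-DINA model. Acta Psychologica Sinica, 42(10), 1011-1020.
    Tu, D., Dai, H., Cai, Y., & Ding, S. (2010). Cognitive diagnosis of mathematical problem solving in primary school children. Psychological Science, (6), 1461-1466.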
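    Wang, J. (2008). A diagnostic assessment study of the C.Test reading comprehension test. (Master's), Beijing Language and Culture University.
    Xu, S. (2007). A diagnostic assessment study of the C-TEST listening comprehension test. (Master's), Beijing Language and Culture University.
    Xu, S. (2010). A study of diagnostic assessment in Chinese language testing. China Examinations, (7), 12-16.
    You, X., Ding, S., & Liu, H. (2011). Estimation methods for missing data and their applications. Journal of Jiangxi Normal University (Natural Science Edition), (3), 325-330.
    Yu, X. (2009). Application of Bayesian networks in cognitive diagnosis. (Master's), Jiangxi Normal University.
    Zhang, C. (2009). A study of diagnostic achievement testing of reading comprehension in Chinese as a second language. (Master's), Beijing Language and Culture University.
    Zhou, X. (2009). A diagnostic assessment study of the HSK (Intermediate) listening comprehension test. (Master's), Beijing Language and Culture University.
    Zhu, Y., & Ding, S. (2008). Improving the theoretical foundation of the rule space model. Journal of Jiangxi Normal University (Natural Science Edition), (1), 69-72.
    Zou, S. (1997). A study of the validity of the TEM. Shanghai: Shanghai Foreign Language Education Press.
    Zou, S. (2005). Language testing. Shanghai: Shanghai Foreign Language Education Press.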