The Effect of Task Characteristics on Raters' Use of Criteria in an EFL Speaking Test
  • English title: The Effect of Task Characteristics on Raters' Use of Criteria in an EFL Speaking Test
  • Authors: XU Liu (徐柳); CAI Hongwen (蔡宏文)
  • Keywords: task characteristics; construct relevance; EFL speaking test
  • Journal code: XDWY
  • Journal: Modern Foreign Languages (现代外语)
  • Institutions: Shenzhen Technology University; Guangdong University of Foreign Studies
  • Publication date: 2019-05-07 11:39
  • Publisher: Modern Foreign Languages (现代外语)
  • Year: 2019
  • Issue: v.42; No.176
  • Funding: An interim result of the Guangdong University of Foreign Studies Featured Innovation Project "Rater Cognition in Large-Scale English Tests and Its Effect on Rating Results" (15T29)
  • Language: Chinese
  • Record ID: XDWY201904009
  • Page count: 12
  • Issue number: 04
  • CN: 44-1165/H
  • Pages: 106-117
Abstract
This study compared the criteria raters attended to when rating two speaking tasks, an impromptu talk and a conversation, in the Test for English Majors, Band 4, Oral Test. The two tasks differed in interactivity, genre, and topic. Raters were asked to rate test-takers' recorded responses and to justify their ratings. The justifications were then compared with the official rubric, which identified three types of components: construct-relevant components common to both tasks, construct-relevant components unique to one task, and construct-irrelevant components. The results showed that the raters' criteria agreed closely with the official rubric, but that the raters applied different indicators to the same construct components when rating different tasks and also used some construct-irrelevant criteria. These results were interpreted as evidence that task characteristics affect raters' use of criteria both directly and indirectly.
    1 The interview outline is available on request.
