Can student self-ratings be compared with peer ratings? A study of measurement invariance of multisource feedback
  • Authors: Keng-Lin Lee; Shih-Li Tsai; Yu-Ting Chiu; Ming-Jung Ho
  • Keywords: Professionalism; Multisource feedback; Self-assessment; Peer assessment; Measurement invariance
  • Journal: Advances in Health Sciences Education
  • Published: May 2016
  • Volume: 21
  • Issue: 2
  • Pages: 401-413
  • Full text size: 420 KB
  • Author affiliations: Keng-Lin Lee (1)
    Shih-Li Tsai (1)
    Yu-Ting Chiu (1)
    Ming-Jung Ho (1)

    1. Department of Medical Education and Bioethics, National Taiwan University College of Medicine, No. 1, Ren-Ai Road, Section 1, Taipei, Taiwan
  • Journal category: Humanities, Social Sciences and Law
  • Journal subjects: Education; Medical Education
  • Publisher: Springer Netherlands
  • ISSN: 1573-1677
Abstract
Measurement invariance is a prerequisite for comparing measurement scores across different groups. In medical education, multisource feedback (MSF) is used to assess core competencies, including professionalism. However, little attention has been paid to the measurement invariance of assessment instruments, that is, whether an instrument holds the same meaning across different rater groups. This study examined the measurement invariance of the National Taiwan University professionalism MSF (NTU P-MSF) to determine whether medical students' self-ratings can be compared with their peers' ratings. An eight-factor model was specified for confirmatory factor analysis to examine the construct validity of the NTU P-MSF. Cronbach's alpha was computed for the items of each domain to evaluate internal consistency reliability. The same eight-factor model was used for multi-group confirmatory factor analyses. Four hierarchical models were specified to test configural (i.e., identical factor-item relationships), metric (i.e., identical factor loadings), scalar (i.e., identical intercepts), and error variance invariance across the self-rating and peer-rating groups. One hundred and twenty second-year medical students from weekly discussion groups, conducted as part of a medical professionalism course, agreed to use the NTU P-MSF to assess themselves or their discussion group peers. NTU P-MSF scores fit the eight-factor model well in both the self-rating and peer-rating groups. Cronbach's alpha coefficients for students' self-rated and peer-rated NTU P-MSF scores ranged from 0.76 to 0.89 and from 0.84 to 0.91, respectively, indicating that the NTU P-MSF scores also have good internal consistency reliability in both groups. In addition, the same factor structure and similar factor loadings and intercepts across both groups indicate that NTU P-MSF scores showed configural, metric, and scalar invariance. Thus, students' self-assessments and peer assessments can be compared in terms of the constructs measured by the NTU P-MSF, changes in NTU P-MSF scores, and its factor scores. This study demonstrates how to investigate the measurement invariance of a professionalism MSF instrument and contributes to the discussion on self- and peer assessment in medical education.
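
The abstract describes the standard multi-group CFA invariance hierarchy without giving equations. The following is a minimal sketch of how the four nested models are usually parameterized; the symbols (x, tau, Lambda, xi, delta, Theta) and the self/peer group subscripts are notational assumptions, not taken from the paper. Each successive model adds equality constraints across the two rater groups, and the change in model fit is used to judge whether invariance holds.

    \[
    \mathbf{x}_{g} = \boldsymbol{\tau}_{g} + \boldsymbol{\Lambda}_{g}\,\boldsymbol{\xi}_{g} + \boldsymbol{\delta}_{g},
    \qquad g \in \{\text{self},\ \text{peer}\}
    \]
    \[
    \begin{aligned}
    \text{Configural:} &\quad \text{same pattern of free and fixed loadings in } \boldsymbol{\Lambda}_{\text{self}},\ \boldsymbol{\Lambda}_{\text{peer}}\\
    \text{Metric:} &\quad \boldsymbol{\Lambda}_{\text{self}} = \boldsymbol{\Lambda}_{\text{peer}}\\
    \text{Scalar:} &\quad \boldsymbol{\Lambda}_{\text{self}} = \boldsymbol{\Lambda}_{\text{peer}},\ \ \boldsymbol{\tau}_{\text{self}} = \boldsymbol{\tau}_{\text{peer}}\\
    \text{Error variance:} &\quad \text{additionally } \boldsymbol{\Theta}_{\delta,\text{self}} = \boldsymbol{\Theta}_{\delta,\text{peer}}
    \end{aligned}
    \]

The per-domain Cronbach's alpha reported above follows the usual formula, alpha = k/(k-1) * (1 - sum of item variances / variance of the domain total). A small Python sketch of that formula (not the authors' code; the example data are hypothetical):

    import numpy as np

    def cronbach_alpha(item_scores):
        """Cronbach's alpha for one domain; rows are raters, columns are items."""
        X = np.asarray(item_scores, dtype=float)
        k = X.shape[1]                              # number of items in the domain
        item_var_sum = X.var(axis=0, ddof=1).sum()  # sum of per-item variances
        total_var = X.sum(axis=1).var(ddof=1)       # variance of the domain total score
        return k / (k - 1) * (1.0 - item_var_sum / total_var)

    # Hypothetical example: 5 raters scoring a 4-item domain on a 1-5 scale.
    scores = [[4, 5, 4, 5],
              [3, 3, 4, 3],
              [5, 5, 5, 4],
              [2, 3, 2, 3],
              [4, 4, 5, 4]]
    print(round(cronbach_alpha(scores), 2))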
