Three experienced trauma surgeons (raters) were presented with the radiographs of 114 patients with tibia fractures randomly selected from the SFR. The raters classified the fractures independently, blinded to clinical patient information, in two classification sessions one month apart. The AO/OTA classification coded by the three expert raters (our predefined gold standard) was compared with the classifications in the SFR. Inter- and intra-observer agreement was also evaluated. Agreement was quantified with the kappa statistic and interpreted according to Landis and Koch.
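As an illustration of the Landis and Koch approach, the commonly cited bands can be written as a small lookup. This is a minimal sketch, not the authors' analysis code; the function name `landis_koch` is hypothetical, and the band limits are the conventional ones (below 0 poor, 0.00–0.20 slight, 0.21–0.40 fair, 0.41–0.60 moderate, 0.61–0.80 substantial, 0.81–1.00 almost perfect).

```python
def landis_koch(kappa: float) -> str:
    """Map a kappa value to the conventional Landis and Koch label.

    Hypothetical helper for illustration only; not taken from the study.
    """
    if not -1.0 <= kappa <= 1.0:
        raise ValueError("kappa must lie between -1 and 1")
    if kappa < 0.0:
        return "poor"
    # (upper bound of band, label), checked in ascending order
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"


if __name__ == "__main__":
    for value in (0.75, 0.56):
        print(value, landis_koch(value))
```

Under these bands, a kappa of 0.75 falls in the "substantial" range and a kappa of 0.56 in the "moderate" range.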
The accuracy of the SFR, defined as agreement between the SFR and the gold standard classification, was kappa = 0.75 for the AO/OTA type and 0.56 for the AO/OTA group, corresponding to substantial and moderate agreement, respectively. Inter-observer agreement across the three expert raters was kappa = 0.74 for the AO/OTA type and 0.53 for the AO/OTA group. Intra-observer agreement was kappa = 0.74–0.79 for the AO/OTA type and 0.62–0.64 for the AO/OTA group.
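To make the kappa values above concrete, the following sketch computes Cohen's kappa for two sets of labels as kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed proportion of agreement and p_e the agreement expected by chance. The text does not state which kappa variant was used, so Cohen's kappa is an assumption here; the function name `cohen_kappa` and the ten toy labels are hypothetical and are not study data.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(labels_a)
    # Observed agreement: share of cases where both sources give the same label
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the marginal proportions, summed over labels
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(count_a[k] * count_b[k] for k in count_a.keys() | count_b.keys()) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical toy data: register coding versus a gold standard for ten fractures
register = ["A", "A", "B", "B", "C", "C", "A", "B", "C", "A"]
gold = ["A", "A", "B", "C", "C", "C", "A", "B", "B", "A"]
kappa = cohen_kappa(register, gold)

if __name__ == "__main__":
    print(round(kappa, 2))
```

In this toy example the two sources agree on 8 of 10 cases (p_o = 0.80) and chance agreement is p_e = 0.34, which gives a kappa of about 0.70. Agreement among three raters at once would typically call for a multi-rater statistic such as Fleiss' kappa, or for averaging pairwise values.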
This study shows that the accuracy of classification of tibia fractures in the SFR was substantial for the AO/OTA type (kappa = 0.75) and moderate for the AO/OTA group (kappa = 0.56), as defined by Landis and Koch. This degree of accuracy is similar to that reported in previous studies. We interpret these results as demonstrating the high reliability of the data in the SFR and as supporting the use of the SFR for further scientific analysis.