A taxonomy-based approach was used to classify AgMIP rice simulation models.
Different model structures often resulted in similar outputs.
Similar model structures often led to large differences in outputs.
User subjectivity likely hides relationships between model structure and behaviour.
Shared protocols are still needed to limit risks during calibration.