Application of classification trees and logistic regression to determine factors responsible for lamb mortality
Abstract
The aim of this research was to statistically analyse the survival of 20,044 Polish Merino lambs between birth and the 100th day of life, using classification trees and logistic regression. Lamb survival was expressed on a binary scale: 1 for survival, 0 for mortality. Two tree models were developed, differing in the splitting criterion: the entropy function and the Gini index. For comparison, an additional statistical analysis was carried out using multiple logistic regression. The quality of the decision tree models and the multiple regressions was compared using the following criteria: average error function, average squared error, cumulative lift, the Kolmogorov-Smirnov statistic, and the area under the Receiver Operating Characteristic curve. The statistical analysis was conducted using the Enterprise Miner 6.2 software included in the SAS package. The quality criteria calculated for the four models developed lead to the conclusion that the classification trees based on the Gini index and on the entropy function most accurately describe the variability of the trait under examination, i.e. survival of lambs up to 100 days of age. For the best classification model, i.e. the tree built using the Gini index, the ranking of variable importance based on the "Importance" measure leads to the conclusion that the flock, type, and year of a lamb's birth are the most significant differentiating factors.
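The two splitting criteria named in the abstract can be illustrated with a minimal Python sketch. It is not the Enterprise Miner implementation used in the study; the function names and the toy survival labels are hypothetical and chosen only to show how the Gini index and entropy score a candidate split of a binary (1 = survived, 0 = died) outcome.

```python
import math
from collections import Counter


def gini(labels):
    """Gini index of a list of class labels: 1 - sum(p_k^2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def impurity_decrease(parent, left, right, impurity):
    """Parent impurity minus the size-weighted impurity of the two children."""
    n = len(parent)
    weighted = (len(left) / n) * impurity(left) + (len(right) / n) * impurity(right)
    return impurity(parent) - weighted


# Toy data (hypothetical): 1 = lamb survived, 0 = lamb died.
parent = [1, 1, 1, 0, 1, 0, 0, 1]
left = [1, 1, 1, 1]    # e.g. lambs on one side of a candidate split
right = [0, 1, 0, 0]   # lambs on the other side

gini_gain = impurity_decrease(parent, left, right, gini)
entropy_gain = impurity_decrease(parent, left, right, entropy)
print(f"Gini decrease:    {gini_gain:.4f}")
print(f"Entropy decrease: {entropy_gain:.4f}")
```

A tree-growing procedure would evaluate such a decrease for every candidate split and keep the one with the largest value; the two criteria often, but not always, select the same split.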
