Use of random forests and support vector machines to improve annual egg production estimation
Abstract
The delta-generalized additive model (Delta_GAM) is commonly used to analyze zero-inflated continuous data and has been widely applied in egg production methods (EPMs). It consists of two GAMs: one with a binomial distribution that estimates the probability of a non-zero value, and one with either a log-normal distribution (Delta_LN model) or a gamma distribution (Delta_LG model) that models the continuous non-zero values. However, these rather restrictive distributional assumptions are not met by egg production data. In this study, we modified the Delta_GAMs using two machine learning techniques: random forests (Delta_RF) and support vector machines (Delta_SVM). We compared the four models by tenfold cross-validation, using root mean square error (RMSE) as the performance measure, on EPM survey data for small yellow croaker (Larimichthys polyactis), mullet (Liza haematocheilus) and gizzard shad (Konosirus punctatus) from Haizhou Bay, China. Both the Delta_RF and Delta_SVM models performed better than the Delta_LN and Delta_LG models. The predicted spatial and temporal distributions varied among the models, although predictive performance varied little. Annual egg production (AEP) was predicted, and the estimates carried large uncertainty. We propose that machine learning techniques such as RFs and SVMs be used to model zero-inflated continuous data from EPM surveys, as they tend to provide more reliable estimates of AEP.
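The two-stage structure and the tenfold cross-validated RMSE described above can be illustrated with a minimal sketch. It is not the authors' implementation. The names fit_constant, fit_delta and kfold_rmse are hypothetical, and the constant learner is only a stand-in for the real stage models (a GAM, random forest or SVM fitted on survey covariates). Pooling the squared errors over all held-out points before taking the root is also an assumption; the abstract does not say how RMSE was aggregated across folds.

```python
import math
import random
from statistics import mean


def fit_constant(xs, ys):
    """Placeholder learner (assumption): ignores covariates, predicts the training mean."""
    m = mean(ys)
    return lambda x: m


def fit_delta(xs, ys, fit_presence=fit_constant, fit_positive=fit_constant):
    """Fit a two-stage delta model on (covariate, egg density) pairs."""
    positives = [(x, y) for x, y in zip(xs, ys) if y > 0]
    if not positives:
        # No non-zero observations in the training data: predict zero everywhere.
        return lambda x: 0.0
    # Stage 1: probability that the observation is non-zero.
    presence = fit_presence(xs, [1.0 if y > 0 else 0.0 for y in ys])
    # Stage 2: expected value given that the observation is non-zero.
    positive = fit_positive([x for x, _ in positives], [y for _, y in positives])
    # Combined prediction: P(non-zero) * E[value | non-zero].
    return lambda x: presence(x) * positive(x)


def kfold_rmse(xs, ys, k=10, seed=0):
    """Pooled RMSE of the delta model over k cross-validation folds."""
    idx = list(range(len(ys)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    sq_err = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = fit_delta([xs[i] for i in train], [ys[i] for i in train])
        sq_err += [(model(xs[i]) - ys[i]) ** 2 for i in fold]
    return math.sqrt(mean(sq_err))


# Toy data (made up for illustration): half of the stations have no eggs.
xs = [0, 1, 2, 3]
ys = [0.0, 0.0, 2.0, 4.0]
model = fit_delta(xs, ys)
print(model(0))  # 0.5 * 3.0 = 1.5
```

To compare models in the way the abstract describes, one would pass a different pair of stage learners to fit_delta for each candidate and compare the resulting kfold_rmse values on the same folds.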