
Yield Prediction with Explainable Machine Learning

dc.contributor.advisor: Steinhage, Volker
dc.contributor.author: Huber, Florian Philipp
dc.date.accessioned: 2024-11-21T11:06:46Z
dc.date.available: 2024-11-21T11:06:46Z
dc.date.issued: 21.11.2024
dc.identifier.uri: https://hdl.handle.net/20.500.11811/12560
dc.description.abstract: Starting from a federal project to predict grapevine yields in Germany, we faced five challenges in enabling machine learning for yield prediction.

The first challenge is training on small data sets: capturing data for yield prediction is time-consuming because most plants follow an annual cycle. Providing a feature-based representation of remote sensing data by modeling the underlying distributions allows gradient boosting methods to outperform deep learning approaches by 25% in our experiments on soybean yield prediction in the US, one of the largest data sets for yield prediction and one that allows international comparability.

The second challenge is the need for explanations that show the model's decision making is in line with experts' knowledge of the field. To this end, we extend the idea of Shapley value feature attributions to predefined groups of features. These groupings arise naturally in yield prediction scenarios and improve the presentation of the explanations, because individual features are plentiful and often abstract. We give a novel algorithm that calculates the grouped Shapley values in polynomial time for random forests as they result from the gradient boosting pipeline of challenge one.

Third, we work towards better feature selection for yield prediction tasks. The introduction of grouped Shapley values raises the question of whether Shapley values can be used for feature selection. To address this question, we define four necessary conditions that a Shapley value must satisfy to be suitable for feature selection. Additionally, we analyze the problem of model averaging, where unimportant features are allowed to alter the final feature selection. To do so, we introduce a novel exhaustive feature selection tool that is not affected by model averaging and use it to further evaluate Shapley values for feature selection. Our experiments indicate a small loss in accuracy due to model averaging, while the runtime of Shapley values as a heuristic measure for feature selection is superior for random forests.

The fourth challenge is handling gaps in remote sensing data. Because we need remote sensing data to provide consistent coverage of a small research area, clouds that occlude the satellite's view of the Earth can hide a meaningful amount of data. We approach this challenge with a novel deep interpolation pipeline that combines a U-Net structure with partial convolutions to gradually fill in remote sensing data in our research area, ultimately improving on previously established statistical methods by 44% in terms of RMSE.

Lastly, we worked towards making predictions for shifting domains: we used regularized transfer learning to transfer knowledge between different domains, improving yield prediction by 16% in terms of RMSE compared to not using transfer learning techniques.
dc.language.iso: eng
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject.ddc: 004 Computer science
dc.title: Yield Prediction with Explainable Machine Learning
dc.type: Dissertation or Habilitation
dc.publisher.name: Universitäts- und Landesbibliothek Bonn
dc.publisher.location: Bonn
dc.rights.accessRights: openAccess
dc.identifier.urn: https://nbn-resolving.org/urn:nbn:de:hbz:5-79699
ulbbn.pubtype: First publication
ulbbnediss.affiliation.name: Rheinische Friedrich-Wilhelms-Universität Bonn
ulbbnediss.affiliation.location: Bonn
ulbbnediss.thesis.level: Dissertation
ulbbnediss.dissID: 7969
ulbbnediss.date.accepted: 31.10.2024
ulbbnediss.institute: Mathematisch-Naturwissenschaftliche Fakultät : Fachgruppe Informatik / Institut für Informatik
ulbbnediss.fakultaet: Mathematisch-Naturwissenschaftliche Fakultät
dc.contributor.coReferee: Demidova, Elena
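
The abstract describes attributing a prediction to predefined groups of features rather than to individual features. The sketch below illustrates only that general idea and is not the thesis's polynomial-time algorithm for random forests: it treats each group as a single player and computes exact Shapley values by brute-force enumeration, which is exponential in the number of groups. The function name `grouped_shapley`, the group names `weather` and `remote_sensing`, the linear toy model, and the choice to average absent groups over a background sample are all assumptions made for illustration.

```python
from itertools import combinations
from math import factorial

import numpy as np


def grouped_shapley(predict, x, background, groups):
    """Exact Shapley values where each player is a predefined feature group.

    predict: callable mapping a 2-D array (n_samples, n_features) to a 1-D array.
    x: 1-D array, the instance to explain.
    background: 2-D float array used to average out features of absent groups.
    groups: dict mapping a (hypothetical) group name to a list of column indices.
    """
    names = list(groups)
    n = len(names)

    def value(coalition):
        # Features of present groups take the values of x; all other features
        # keep their background values, and the predictions are averaged.
        data = background.copy()
        for name in coalition:
            cols = groups[name]
            data[:, cols] = x[cols]
        return float(np.mean(predict(data)))

    phi = {}
    for name in names:
        others = [g for g in names if g != name]
        total = 0.0
        for size in range(n):
            # Standard Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                total += weight * (value(subset + (name,)) - value(subset))
        phi[name] = total
    return phi


# Toy usage with a linear model (assumed purely for illustration).
rng = np.random.default_rng(0)
background = rng.normal(size=(50, 4))
weights = np.array([1.0, -2.0, 0.5, 3.0])


def predict(data):
    return data @ weights


x = np.array([1.0, 2.0, 3.0, 4.0])
groups = {"weather": [0, 1], "remote_sensing": [2, 3]}
phi = grouped_shapley(predict, x, background, groups)
```

In this toy setting the group attributions sum to the difference between the prediction for `x` and the mean prediction over the background sample, which is the usual efficiency property of Shapley values carried over to groups.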



The following license files are associated with this item:

http://creativecommons.org/licenses/by-nc-nd/4.0/