Kierdorf, Jana: Interpretable Machine Learning for Image-based Harvest-readiness Prediction of Cauliflower. - Bonn, 2025. - Dissertation, Rheinische Friedrich-Wilhelms-Universität Bonn.
Online edition in bonndoc: https://nbn-resolving.org/urn:nbn:de:hbz:5-82796
@phdthesis{handle:20.500.11811/13085,
urn = {https://nbn-resolving.org/urn:nbn:de:hbz:5-82796},
author = {Kierdorf, Jana},
title = {Interpretable Machine Learning for Image-based Harvest-readiness Prediction of Cauliflower},
school = {Rheinische Friedrich-Wilhelms-Universität Bonn},
year = 2025,
month = may,
note = {Cauliflower is subject to strict quality control criteria during sales, which makes accurate harvest timing important. However, accurately determining harvest-readiness is challenging because the cauliflower curd is covered by its canopy. As a result, cauliflower is harvested by hand, which makes the harvesting process labor-intensive and subjective. To address these challenges, there is growing interest in developing non-invasive, sensor-based approaches. These approaches deliver objective data and offer fast, field-comprehensive, cost-effective, and reliable solutions. The integration of time series data for plant phenotyping can provide detailed insights into the dynamic development of cauliflower, enabling more precise predictions of the optimal harvest time than single-point observations. However, data acquisition on a daily or weekly basis is resource-intensive, making the careful selection of acquisition days highly important.
The main goal of this thesis is the image-based prediction of cauliflower harvest-readiness. While the combination of monitoring cauliflower fields using drones and applications of deep learning enables automated harvest-readiness estimation, errors can occur due to field variability and limited training data. We assess and compare different models considering different forecasting times and prediction goals. We analyze the reliability of a harvest-readiness classifier with interpretable machine learning. By identifying groups of saliency maps, we derive reliability scores for each classification result using knowledge about the domain and the image properties. The reliability can be used for unseen data to (i) inform farmers to improve their decision-making and (ii) increase the model prediction accuracy.
Another approach examines harvest-readiness based on time series data, analyzing which acquisition days and developmental stages of the plants positively affect model accuracy. We use the interpretation technique GroupSHAP to gain insights into the acquisition days relevant to predictions and to support future data acquisition planning. By using image time series instead of single time points, we achieve a significant increase in model accuracy. GroupSHAP enables the identification of time points that positively affect model accuracy. By reducing the number of acquisition dates and focusing on positively influencing time points, accuracy improves further. A selective choice of acquisition dates can thus lead to more efficient data collection in the future.
The work described in this thesis makes several significant contributions to the task of harvest-readiness estimation of cauliflower. It integrates interpretable machine learning approaches into novel solutions that enhance classification accuracy and give insights into the classifiers’ decision-making process. In practice, these insights can be used not only to improve classification models but also to support farmers in their decision-making on harvest timing and data collection. All contributions were validated against our published GrowliFlower dataset, which also represents an important part of this work, and disseminated through peer-reviewed conference papers and journal articles. The publication of the dataset supports the development and evaluation of various machine learning approaches and is expected to facilitate future research.},
url = {https://hdl.handle.net/20.500.11811/13085}
}