Weyler, Jan: Vision-Based Semantic Scene Understanding for Agricultural Robots. - Bonn, 2024. - Dissertation, Rheinische Friedrich-Wilhelms-Universität Bonn.
Online edition in bonndoc: https://nbn-resolving.org/urn:nbn:de:hbz:5-80080
@phdthesis{handle:20.500.11811/12637,
urn = {https://nbn-resolving.org/urn:nbn:de:hbz:5-80080},
author = {Weyler, Jan},
title = {Vision-Based Semantic Scene Understanding for Agricultural Robots},
school = {Rheinische Friedrich-Wilhelms-Universität Bonn},
year = 2024,
month = dec,
note = {An essential task of agriculture has always been the production of food, feed, and fiber for an ever-growing world population despite limited arable land. In the past decades, this challenge has been addressed primarily through the combined development of agrochemicals and the breeding of high-yield crop varieties, which together have substantially reduced the per capita cultivation area. However, conventional agricultural practice often applies agrochemicals uniformly across entire fields, with detrimental effects on biodiversity and ecosystem services. This is problematic since biodiversity and ecosystem services are essential for a long-term, i.e., sustainable, supply of agricultural products. Moreover, conventional breeding of new high-yield crop varieties involves many manual assessments, which prevents high throughput. Thus, developing sustainable and automated alternatives that increase agricultural production is vital to meet the growing demand. In this context, agricultural robots offer the prospect of overcoming the limitations of conventional methods. Specifically, these robotic platforms can carry various sensors to autonomously identify local areas in agricultural fields where management actions are required, e.g., regions with high weed pressure. Subsequently, they can perform locally restricted, targeted interventions using tools mounted on the platform, e.g., selective spraying units or mechanical and thermal units. Thus, they can substantially reduce the detrimental effects of any treatment while keeping productivity high. Moreover, they can cover entire fields time-effectively and perform autonomous in-field assessments, which substantially increases the throughput of conventional plant breeding procedures and accelerates the development of high-yield crop varieties. A crucial requirement for agricultural robots to perform these autonomous actions is an accurate understanding of the current field status derived from their sensory data. For example, a robot must first automatically identify regions with high weed pressure before performing appropriate actions, or even individual plants before performing detailed analyses. We refer to this comprehension of the field status through sensory data as semantic scene understanding; it ranges from coarse to fine-grained analysis of sensor data in order to support the different tasks agricultural robots perform autonomously. In this thesis, we focus on developing novel vision-based semantic scene understanding methods based on imagery obtained by agricultural robots in arable field environments. As different tasks, such as automated targeted interventions or autonomous in-field assessments, require different levels of understanding of the field, we present several approaches that can all be leveraged for sustainable and high-throughput farming procedures performed by agricultural robots. Specifically, we present a method that reliably associates each pixel of an image with the class crop, weed, or background across various agricultural fields captured by different agricultural robots. This information is essential to identify local areas where the robot should perform targeted management actions.
Furthermore, we present an extended, readily applicable method that, in addition to the previous level of understanding, identifies individual plants as entire instances by automatically grouping all pixels belonging to a specific plant in the field. This extended information is crucial to restrict automated management actions to single plants instead of local areas, further reducing their ecological footprint. Finally, we investigate the method that provides the most holistic understanding of plants in this thesis: in addition to single plants, it identifies individual crop leaves by grouping all pixels associated with a specific leaf. This enables an automated assessment of plant traits, e.g., the number of leaves per plant, for high-throughput breeding procedures aimed at developing new high-yield crop varieties. All methods elaborated in this thesis rely on machine learning techniques. Moreover, our technical contributions combine methods from photogrammetry and artificial intelligence and develop new interpretation methods that we apply to tasks in the agricultural domain. In sum, this thesis makes key contributions to vision-based semantic scene understanding in the agricultural domain using images captured in agricultural fields. This understanding is a crucial requirement for agricultural robots to perform sustainable, autonomous actions in field environments, targeting a sufficient long-term supply of agricultural products. In addition, the work presented in this thesis resulted in the publication of a large-scale image-based agricultural dataset that serves as a benchmark for developing novel semantic scene understanding methods in the agricultural domain. We use this dataset throughout the thesis to ensure a consistent evaluation.},
url = {https://hdl.handle.net/20.500.11811/12637}
}