Schindler, Frank: Instance Segmentation, Tracking and Action Detection of Animals in Wildlife Videos. - Bonn, 2024. - Dissertation, Rheinische Friedrich-Wilhelms-Universität Bonn.
Online edition in bonndoc: https://nbn-resolving.org/urn:nbn:de:hbz:5-79703
@phdthesis{handle:20.500.11811/12558,
urn = {https://nbn-resolving.org/urn:nbn:de:hbz:5-79703},
doi = {10.48565/bonndoc-424},
author = {Schindler, Frank},
title = {Instance Segmentation, Tracking and Action Detection of Animals in Wildlife Videos},
school = {Rheinische Friedrich-Wilhelms-Universität Bonn},
year = 2024,
month = nov,
abstract = {Monitoring animal species efficiently in their natural habitats is essential to describe and analyze the development of ecosystems and populations and to detect the causes of changes due to climate change or other external influences. Camera traps are increasingly being used to generate video material. Until now, however, the resulting material has been examined either manually by researchers or with systems that require their expert knowledge. Supporting ecologists with AI applications is not only necessary due to the large amount of data and the limited number of available experts, but also enables new insights and standardized analyses. Automating this analysis by adapting the prominent computer vision tasks of instance segmentation, tracking, and action detection to the context of ecology can therefore help to address important ecological problems such as population estimation, animal migration, and behavioral analysis.
In this doctoral thesis, we present a new approach to perform instance segmentation, tracking and action detection for camera trap videos of animals in one system. Central to our research is how reliable instance segmentation can improve both tracking and action detection.
The ability to accurately detect and track animals in wildlife videos is essential for researchers to analyze animal behavior and identify individual animals. Detecting animals by bounding boxes alone is not enough to distinguish between animals that are in close proximity to each other. Instead, a precise contour of each animal, an instance mask, is required, which is obtained by instance segmentation. Moreover, an instance mask shows the pose of the animal, which is helpful for detailed action recognition. We introduce SWIFT (Segmentation With FIltering of Tracklets), a novel multi-object tracking and segmentation (MOTS) pipeline that effectively addresses this problem. SWIFT improves the average precision of the instance masks by 4 percentage points on average across the different datasets, compared to state-of-the-art computer vision instance segmentation approaches. The SWIFT tracking algorithm, which uses multiple filtering steps to either delete incorrectly found tracks or merge tracks that are not yet connected, achieves multi-object tracking and segmentation accuracy scores of up to 68.0\%.},
url = {https://hdl.handle.net/20.500.11811/12558}
}