Sadeqi, Mohammad Bahman: Genome Wide Association Study and Genomic Selection Models for Nitrogen Use Efficiency in Bread Wheat. - Bonn, 2024. - Dissertation, Rheinische Friedrich-Wilhelms-Universität Bonn.
Online edition in bonndoc: https://nbn-resolving.org/urn:nbn:de:hbz:5-76203
@phdthesis{handle:20.500.11811/11543,
urn = {https://nbn-resolving.org/urn:nbn:de:hbz:5-76203},
author = {Sadeqi, Mohammad Bahman},
title = {Genome Wide Association Study and Genomic Selection Models for Nitrogen Use Efficiency in Bread Wheat},
school = {Rheinische Friedrich-Wilhelms-Universität Bonn},
year = 2024,
month = may,

note = {Nitrogen (N), an essential element in the structure of proteins, nucleic acids and chlorophyll, plays an important role in grain yield (GY). Nitrogen fertilizer, the most commonly used input in cereal production, is necessary to increase the shoot biomass and dry matter of bread wheat. Increasing the amount of nitrogen fertilizer contributes greatly to yield stability in bread wheat, but soil and environmental pollution caused by the significant greenhouse gas emissions of nitrogen fertilizer production, the cost of the fertilizer itself and the energy required in agricultural practice are major negative consequences of its use. For all these reasons, it is important to target nitrogen use efficiency (NUE) as a complex trait in breeding programs. Accordingly, investigating GY under different N applications is a practical approach to modeling NUE. The simultaneous optimization of NUE and GY under different N applications could be the main goal of breeding for N use efficiency. Characterization of agronomic traits to model NUE provides useful information and guidance for genomic selection programs. Allelic variation for GY at low and high N could be high due to the large target size of mutations in candidate genes. The major challenge in genome-wide association study (GWAS) models is therefore to find significant and reliable associations for this complex trait. To address this problem, two precise and efficient computational approaches, local FDR correction and Bayesian survival analysis, were developed as different filters to determine the best GWAS model and to obtain reliable associations from the output of the selected model. GWAS models for GY under low and high N levels showed that a local FDR correction based only on maximum likelihood estimation is not more accurate than a local Bayesian FDR correction in determining the effect size and power of a large-scale genomic data set.
Currently, phenotyping with the aim of identifying high-yielding genotypes is still expensive compared to genomic selection (GS) approaches. GS models combine a whole-genome genotyping file (SNPs) with a phenotyping file (individuals) in the reference (training) population. Statistical machine learning algorithms such as classical methods, kernel regression and ensemble learning are used to predict the phenotypes or breeding values (BVs) of the selection candidates in the test (validation) population. In modern GS models, two types of parameters determine the results: genetic parameters with random effects and hyper-parameters with fixed effects. Linear GS models such as rrBLUP and gBLUP can be specified without much concern about the underlying assumptions. Bayesian inference, in contrast, is more flexible with respect to assumptions, but its response distribution can change with each run. The main challenge with BGLR and LASSO is to ensure that the distribution of the statistical estimators follows the genetic parameters of the population. In SVM, hyper-parameter optimization can be performed with various methods, but the most commonly used and convenient one is grid search. Our study has shown that focusing on the definition and optimization of the regularization parameters is crucial for the performance and accuracy of a GS model, a point that has not been sufficiently addressed in previous GS studies. In the BOOST model, however, the regularization parameter is used only to control the bias of the model; by adjusting it, the model can be made less complex and less prone to overfitting. In the BAGG model, the regularization parameter is used to control the variance of the model; reducing the variance makes the model more stable and less susceptible to noise.
In STACK, both bias and variance are taken into account by adjusting the regularization parameter to find a balance between model complexity and stability. Our study thus confirmed the bias-variance trade-off, and the adaptive prediction error of the STACK model was in the mid-range compared to the other models. This remarkable result for the STACK model is consistent with previous findings. For all ensemble models, especially the STACK model, the number of epochs and the stack structure must be specified as hyper-parameters, together with the activation function. Ultimately, a smaller learning rate on the training data set, combined with a suitable batch size, leads to maximum SNP heritability and genomic estimated breeding values (GEBVs) at the mean of the given GS model.},
url = {https://hdl.handle.net/20.500.11811/11543}
}
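The abstract names grid search as the most common way to tune SVM hyper-parameters in a genomic selection model. The following Python sketch is purely illustrative and not taken from the thesis: it simulates a small training population of biallelic SNP genotypes with a quantitative trait and runs scikit-learn's GridSearchCV over the regularization parameter C and the margin epsilon of a linear support-vector regressor. All data, grid values and variable names are assumptions for demonstration.

```python
# Hypothetical sketch: grid search for SVR hyper-parameters in a
# genomic selection (GS) setting. Simulated data, not from the thesis.
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR

rng = np.random.default_rng(0)

# Simulated training population: 200 individuals x 500 SNPs coded 0/1/2,
# with a quantitative trait driven by 20 causal markers plus noise.
n, p = 200, 500
X = rng.integers(0, 3, size=(n, p)).astype(float)
beta = np.zeros(p)
beta[rng.choice(p, 20, replace=False)] = rng.normal(0.0, 1.0, 20)
y = X @ beta + rng.normal(0.0, 1.0, n)

# Grid search over the two key SVR hyper-parameters: C sets the
# regularization strength (bias-variance trade-off), epsilon the
# insensitivity margin of the loss.
grid = {"C": [0.1, 1.0, 10.0], "epsilon": [0.01, 0.1, 1.0]}
search = GridSearchCV(SVR(kernel="linear"), grid, cv=5,
                      scoring="neg_mean_squared_error")
search.fit(X, y)

print("best parameters:", search.best_params_)
```

In the GS workflow described above, the tuned model would then be refit on the full training population and used to predict breeding values for the validation population.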

The following terms of use are associated with this resource:

InCopyright