Spicher, Sebastian: Robust Atomistic Modeling of Large Molecules by Efficient Force-Field and Tight-Binding Methods. - Bonn, 2021. - Dissertation, Rheinische Friedrich-Wilhelms-Universität Bonn.
Online edition in bonndoc: https://nbn-resolving.org/urn:nbn:de:hbz:5-64553
@phdthesis{handle:20.500.11811/9409,
urn = {https://nbn-resolving.org/urn:nbn:de:hbz:5-64553},
author = {{Sebastian Spicher}},
title = {Robust Atomistic Modeling of Large Molecules by Efficient Force-Field and Tight-Binding Methods},
school = {Rheinische Friedrich-Wilhelms-Universität Bonn},
year = 2021,
month = nov,

note = {Modern chemistry has almost no boundaries in elemental composition and molecular size. State-of-the-art chemical systems reach tremendous dimensions containing thousands of atoms, due to significant progress in the fields of polymer and supramolecular chemistry, materials science, and biochemistry. This places increasing demands on the performance of computational methods, so that experiments and simulations can be compared without restrictions on the size and composition of the investigated system. Universal, fast, and yet accurate methods are therefore of increasing importance and popularity in the field of theoretical chemistry. Thus, this thesis is devoted to the development and application of efficient force-field (FF) and tight-binding (TB) methods for the robust atomistic simulation of large molecules. In particular, a generic FF is introduced, as well as improvements to existing semiempirical extended tight-binding (xTB) methods, for the accurate calculation of Geometries, harmonic vibrational Frequencies (HVF), and Noncovalent interaction (NCI) energies, termed GFN methods.
The main subjects of interest for experimental and theoretical comparisons are often molecular geometries, reaction free energies, and spectroscopic properties such as infrared (IR) spectra. Their description requires not only accurate energies, but also efficient gradients (first derivative), harmonic vibrational frequencies (second derivative), and corresponding solvation models. However, with increasing molecular size and complexity, the number of computational methods that are applicable for energies, geometries, and frequencies decreases rapidly. For ab initio electronic structure methods, the limit is therefore reached at little more than one hundred atoms, and a fully quantum mechanical (QM) description of larger systems is not possible. The next logical step towards higher computational efficiency is represented by semiempirical quantum mechanical (SQM) methods, even though they are often not generally applicable due to incomplete parameterizations or conceptual shortcomings. Recently, this changed with the development of the GFNn-xTB family of methods (n = {0,1,2}), which are parameterized for a major part of the periodic table up to radon. The underlying approximations extend the accessible system size to ~1000 atoms. Yet, without massively parallel supercomputers, the description of larger systems remains out of reach and classical approaches such as FFs have to be applied. Although many different types of FFs exist, universally accurate variants still represent an almost blank space in the repertoire of theoretical methods. Therefore, the development of more accurate (polarizable) FFs has been called a "holy grail" of computational organic chemistry and biochemistry.
The first part of this thesis presents a new generic force-field within the GFN framework. This method, termed GFN-FF, represents a unique, partially polarizable, universal FF for the accurate description of structures and dynamics of large molecules and is developed to combine FF speed with SQM accuracy. What distinguishes it from other FFs is a full periodic table (Z ≤ 86) parameterization and a completely automated setup routine. To yield high accuracy for NCIs, a sophisticated charge model based on electronegativity equilibration (EEQ) of Gaussian-type charge densities is employed and the treatment of Pauli repulsion and London dispersion interactions is analogous to TB methods. Additionally, a novel hydrogen bond correction is introduced. In this thesis, a detailed description of the underlying theory is given, followed by illustrative application examples. It is shown that for structures of metal-organic frameworks (MOFs) and biomacromolecules (proteins) the GFN-FF optimized structures correspond well to the experimental crystal structures. In many of these cases, GFN-FF is the only applicable computational method. On established benchmark sets for conformational and NCI energies, GFN-FF often reaches an accuracy that is comparable to SQM methods or even more sophisticated GGA density functionals.
The next part of this thesis explores the new possibilities of GFN-FF in combination with the conformer-rotamer ensemble sampling tool (CREST) in the context of conformational space exploration for large and complex structures, ranging from biomacromolecules to metal-organic frameworks. In a first application-based study, the storage of greenhouse gases and biofuels, such as carbon dioxide and methanol, in MOFs and porous organic cages (POCs) is investigated. Optimal binding sites are determined by the CREST algorithm at the GFN-FF level of theory and re-optimized by DFT. The association energies calculated by GFNn-xTB and GFN-FF show accuracy comparable to that of well-performing (meta-)GGAs. As a second study, spin-spin distance distributions for nitroxide-labeled mutants of azurin and T4 lysozyme are modeled by molecular dynamics (MD) simulations at the GFN-FF level of theory and compared to experimental EPR results. With deviations from experiment of less than 2 Å in the mean spin-spin distances, GFN-FF outperforms competing methods. In the last part, GFN methods are assessed for the calculation of HVF from which the thermostatistical contributions to the free energies are derived within the modified rigid-rotor-harmonic-oscillator (RRHO) approximation. The accuracy of GFN2-xTB and GFN-FF is benchmarked in comparison to DFT reference data. As an outlook for future applications, free association energies, also including solvation effects, are calculated for protein-drug complexes of almost 5000 atoms. In addition, a new method termed single-point Hessian (SPH) is introduced for improved HVF of general non-equilibrium structures, in which the input geometry is retained by the application of a biasing potential. Thereby, the SPH approach enables the calculation of accurate thermodynamics on every point of the potential energy surface (PES).
Significant improvements in thermostatistical contributions and IR spectra are obtained by the SPH approach at the SQM and FF level of theory, if, e.g., DFT structures are provided as input. Finally, the effect of explicit solvation is investigated in the context of IR spectra. For the first time, a novel algorithm named quantum cluster growth (QCG) is applied, yielding results remarkably close to the experimental reference spectra.
Overall, the methods developed and evaluated in this work present a great leap forward in theoretical chemistry, bridging the gap between theory and experiment for large molecules. GFN-FF and SPH calculations are added to the portfolio of computational methods and represent valuable and versatile tools for theoretical pre-screening and modeling. From organometallic to biochemical systems, the unique combination of efficiency, generality, and accuracy of the GFN-FF and GFNn-xTB methods is promising for future applications in protein-drug design, gas storage, explicit solvation, free energy computations, and IR spectra interpretation.},

url = {https://hdl.handle.net/20.500.11811/9409}
}

The following license files are associated with this item:

InCopyright