Predicting Rules for Cancer Subtype Classification using Grammar-Based Genetic Programming on various Genomic Data Types

dc.contributor.advisor: Perner, Sven
dc.contributor.author: Deng, Mario
dc.date.accessioned: 2020-04-24T23:31:28Z
dc.date.available: 2020-04-24T23:31:28Z
dc.date.issued: 12.03.2018
dc.identifier.uri: https://hdl.handle.net/20.500.11811/7501
dc.description.abstract: With the advent of high-throughput methods, more genomic data than ever has been generated during the past decade. As these technologies remain cost-intensive and are not worthwhile for every research group, databases such as TCGA and Firebrowse have emerged. While these databases enable fast and free access to massive amounts of genomic data, they also pose new challenges to the research community.
This study investigates methods to obtain, normalize, and process genomic data for computer-aided decision making in the field of cancer subtype discovery. A new software package, termed FirebrowseR, is introduced, allowing the direct download of genomic data sets into the R programming environment. To pre-process the obtained data, a set of methods is introduced that enables data-type-specific normalization. As a proof of principle, the Web-TCGA software is created, enabling fast data analysis.
To explore cancer subtypes, a statistical model, the EDL, is introduced. The newly developed method is designed to provide highly precise yet interpretable models. The EDL is tested on well-established data sets, and its performance is compared to state-of-the-art machine learning algorithms. As a proof of principle, the EDL was run on a cohort of 1,000 breast cancer patients, where it reliably re-identified the known subtypes and automatically selected the corresponding marker genes by which the subtypes are defined.
In addition, novel patterns of alterations in well-known marker genes could be identified to distinguish primary samples from metastatic castration-resistant prostate cancer (mCRPC) samples. The findings suggest that mCRPC is characterized by a unique amplification of the Androgen Receptor, while a significant fraction of primary samples is characterized by a loss of heterozygosity of TP53 and NCOR1.
dc.language.iso: eng
dc.rights: In Copyright
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/
dc.subject: Krebs
dc.subject: Statistik
dc.subject: maschinelles Lernen
dc.subject: API
dc.subject: Optimierung
dc.subject: cancer
dc.subject: statistics
dc.subject: machine learning
dc.subject: optimization
dc.subject.ddc: 004 Computer science
dc.subject.ddc: 310 General statistics
dc.subject.ddc: 570 Life sciences, biology
dc.subject.ddc: 610 Medicine, health
dc.title: Predicting Rules for Cancer Subtype Classification using Grammar-Based Genetic Programming on various Genomic Data Types
dc.type: Dissertation or habilitation
dc.publisher.name: Universitäts- und Landesbibliothek Bonn
dc.publisher.location: Bonn
dc.rights.accessRights: openAccess
dc.identifier.urn: https://nbn-resolving.org/urn:nbn:de:hbz:5n-49769
ulbbn.pubtype: First publication
ulbbnediss.affiliation.name: Rheinische Friedrich-Wilhelms-Universität Bonn
ulbbnediss.affiliation.location: Bonn
ulbbnediss.thesis.level: Dissertation
ulbbnediss.dissID: 4976
ulbbnediss.date.accepted: 29.01.2018
ulbbnediss.institute: Mathematisch-Naturwissenschaftliche Fakultät : Fachgruppe Molekulare Biomedizin / Life & Medical Sciences-Institut (LIMES)
ulbbnediss.fakultaet: Mathematisch-Naturwissenschaftliche Fakultät
dc.contributor.coReferee: Schultze, Joachim L.

