Predicting Rules for Cancer Subtype Classification using Grammar-Based Genetic Programming on various Genomic Data Types

Deng, Mario

dc.contributor.advisor	Perner, Sven
dc.contributor.author	Deng, Mario
dc.date.accessioned	2020-04-24T23:31:28Z
dc.date.available	2020-04-24T23:31:28Z
dc.date.issued	12.03.2018
dc.identifier.uri	https://hdl.handle.net/20.500.11811/7501
dc.description.abstract	With the advent of high-throughput methods, more genomic data than ever has been generated during the past decade. As these technologies remain cost-intensive and are not worthwhile for every research group, databases such as TCGA and Firebrowse have emerged. While these databases enable fast and free access to massive amounts of genomic data, they also pose new challenges to the research community. This study investigates methods to obtain, normalize and process genomic data for computer-aided decision making in the field of cancer subtype discovery. A new software package, termed FirebrowseR, is introduced, allowing the direct download of genomic data sets into the R programming environment. To pre-process the obtained data, a set of methods is introduced, enabling data-type-specific normalization. As a proof of principle, the Web-TCGA software is created, enabling fast data analysis. To explore cancer subtypes, a statistical model, the EDL, is introduced. The newly developed method is designed to provide highly precise yet interpretable models. The EDL is tested on well-established data sets, and its performance is compared to state-of-the-art machine learning algorithms. As a proof of principle, the EDL was run on a cohort of 1,000 breast cancer patients, where it reliably re-identified the known subtypes and automatically selected the corresponding marker genes by which the subtypes are defined. In addition, novel patterns of alterations in well-known marker genes could be identified to distinguish primary samples from metastatic castration-resistant prostate cancer (mCRPC) samples. The findings suggest that mCRPC is characterized by a unique amplification of the Androgen Receptor, while a significant fraction of primary samples is described by a loss of heterozygosity of TP53 and NCOR1.	en
dc.language.iso	eng
dc.rights	In Copyright
dc.rights.uri	http://rightsstatements.org/vocab/InC/1.0/
dc.subject	Krebs
dc.subject	Statistik
dc.subject	maschinelles Lernen
dc.subject	API
dc.subject	Optimierung
dc.subject	cancer
dc.subject	statistics
dc.subject	machine learning
dc.subject	optimization
dc.subject.ddc	004 Informatik
dc.subject.ddc	310 Allgemeine Statistiken
dc.subject.ddc	570 Biowissenschaften, Biologie
dc.subject.ddc	610 Medizin, Gesundheit
dc.title	Predicting Rules for Cancer Subtype Classification using Grammar-Based Genetic Programming on various Genomic Data Types
dc.type	Dissertation oder Habilitation
dc.publisher.name	Universitäts- und Landesbibliothek Bonn
dc.publisher.location	Bonn
dc.rights.accessRights	openAccess
dc.identifier.urn	https://nbn-resolving.org/urn:nbn:de:hbz:5n-49769
ulbbn.pubtype	Erstveröffentlichung
ulbbnediss.affiliation.name	Rheinische Friedrich-Wilhelms-Universität Bonn
ulbbnediss.affiliation.location	Bonn
ulbbnediss.thesis.level	Dissertation
ulbbnediss.dissID	4976
ulbbnediss.date.accepted	29.01.2018
ulbbnediss.institute	Mathematisch-Naturwissenschaftliche Fakultät : Fachgruppe Molekulare Biomedizin / Life & Medical Sciences-Institut (LIMES)
ulbbnediss.fakultaet	Mathematisch-Naturwissenschaftliche Fakultät
dc.contributor.coReferee	Schultze, Joachim L.

Files in this item

Name: 4976.pdf
Size: 4.1 MB
Format: PDF

This item appears in:

E-Dissertationen (4305)