- Title
- Statistical physics inspired methods to assign statistical significance in bioinformatics and proteomics: From sequence comparison to mass spectrometry based peptide sequencing.
- Creator
- Alves, Gelio, Florida Atlantic University, Yu, Yi-Kuo
- Abstract/Description
-
After the sequencing of many complete genomes, we are in a post-genomic era in which the most important task has changed from gathering genetic information to organizing the mass of data as well as understanding how components interact with each other. The former is usually undertaken using bioinformatics methods, while the latter task is generally termed proteomics. Success in both parts demands correct statistical significance assignments for the results found. In my dissertation, I study two concrete examples: global sequence alignment statistics and peptide sequencing/identification using mass spectrometry. High-performance liquid chromatography coupled to a mass spectrometer (HPLC/MS/MS), enabling peptide identifications and thus protein identifications, has become the tool of choice in large-scale proteomics experiments. Peptide identification is usually done by database search methods. The lack of robust statistical significance assignment among current methods motivated the development of a novel de novo algorithm, RAId, whose score statistics then provide statistical significance for high-scoring peptides found in our custom, enzyme-digested peptide library. The ease of incorporating post-translational modifications is another important feature of RAId. To organize the massive protein/DNA data accumulated, biologists often cluster proteins according to their similarity via tools such as sequence alignment. Homologous proteins share similar domains. Assessing the similarity of two domains usually requires alignment from head to toe, i.e., a global alignment. Good alignment score statistics with an appropriate null model enable us to distinguish biologically meaningful similarity from chance similarity. There has been much progress in local alignment statistics, which characterize score statistics when alignments tend to appear as a short segment of the whole sequence.
For global alignment, which is useful in domain alignment, there is still much room for exploration and improvement. Here we present a variant of the directed polymer problem in random media (DPRM) to study the score distribution of global alignment. We have demonstrated that, upon proper transformation, the score statistics can be characterized by Tracy-Widom distributions, which correspond to the distributions of the largest eigenvalue of various ensembles of random matrices.
- Date Issued
- 2006
- PURL
- http://purl.flvc.org/fcla/dt/12194
- Subject Headings
- Molecular biology--Data processing, Bioinformatics, Proteomics, Genomics
- Format
- Document (PDF)
- Title
- Bioinformatics mining of the dark matter proteome for cancer targets discovery.
- Creator
- Delgado, Ana Paula, Narayanan, Ramaswamy, Florida Atlantic University, Charles E. Schmidt College of Science, Department of Biological Sciences
- Abstract/Description
-
Mining the human genome for therapeutic target(s) discovery promises novel outcomes. Over half of the proteins in the human genome, however, remain uncharacterized. These proteins offer potential for new target(s) discovery for diverse diseases. Additional targets for cancer diagnosis and therapy are urgently needed to help move away from the cytotoxic era to a targeted therapy approach. Bioinformatics and proteomics approaches can be used to characterize novel sequences in the genome database to infer putative function. The hypothesis that the amino acid motifs and protein domains of the uncharacterized proteins can be used as a starting point to predict the putative function of these proteins provided the framework for the research discussed in this dissertation.
- Date Issued
- 2015
- PURL
- http://purl.flvc.org/fau/fd/FA00004361
- Subject Headings
- Bioinformatics, Cancer -- Genetic aspects, Drug development -- Data processing, Genomics, Medical informatics, Proteomes -- Data processing, Tumors -- Immunological aspects
- Format
- Document (PDF)