Graduate Thesis Or Dissertation
 

Classification context in a machine learning approach to predicting protein secondary structure

https://ir.library.oregonstate.edu/concern/graduate_thesis_or_dissertations/2z10ws85n

Descriptions

Abstract
  • An important problem in molecular biology is to predict the secondary structure of proteins from their primary structure. The primary structure of a protein is its sequence of amino acid residues. The secondary structure is an abstract description of the shape of the folded protein, with regions identified as alpha helix, beta strand, or random coil. Existing methods of secondary structure prediction examine a short segment of the primary structure and predict the secondary structure class (alpha, beta, coil) of the individual residue centered in that segment. The last few years of research have failed to improve these methods beyond the level of 65% correct predictions. This thesis investigates whether these methods can be improved by permitting them to examine externally supplied predictions for the secondary structure of other residues in the segment. The externally supplied predictions are called the "classification context," because they provide contextual information about the secondary structure classifications of neighboring residues. The classification context could be provided by an existing algorithm that makes initial secondary structure predictions, which a second algorithm would then take as input and attempt to improve. A series of experiments on both real and simulated classification context was performed to measure the improvement that could be obtained from classification context. The results showed that the classification context provided by current algorithms does not yield improved performance when used as input by those same algorithms. However, if the classification context is generated by randomly damaging the correct classifications, substantial performance improvements are possible. Even small amounts of randomly damaged correct context improve performance.
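
The setup described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation. It shows two pieces only: generating simulated classification context by randomly damaging the correct classes, and assembling the input for one residue from its sequence window plus the context of its neighbors. The label letters (H, E, C), the padding symbol, the function names, and the damage rule (swap to a uniformly chosen different class) are all assumptions made for illustration.

```python
import random

# Assumed label letters: H = alpha helix, E = beta strand, C = coil.
CLASSES = ("H", "E", "C")
# Hypothetical padding symbol for window positions beyond the chain ends.
PAD = "-"


def damage_labels(true_labels, damage_rate, rng):
    """Simulate classification context by corrupting correct labels.

    Each label is replaced, with probability damage_rate, by one of the
    other two classes chosen uniformly (an assumed damage rule).
    """
    damaged = []
    for label in true_labels:
        if rng.random() < damage_rate:
            alternatives = [c for c in CLASSES if c != label]
            damaged.append(rng.choice(alternatives))
        else:
            damaged.append(label)
    return "".join(damaged)


def _at(seq, i):
    """Return seq[i], or the padding symbol when i is outside the chain."""
    return seq[i] if 0 <= i < len(seq) else PAD


def window_features(residues, context, center, half_width):
    """Build the input for predicting the class of the residue at `center`.

    Returns the residue window and the classification context of the
    neighbors. The center's own context label is left out, because the
    center's class is what the second-stage predictor has to output.
    """
    positions = range(center - half_width, center + half_width + 1)
    residue_window = "".join(_at(residues, i) for i in positions)
    context_window = "".join(_at(context, i) for i in positions if i != center)
    return residue_window, context_window


if __name__ == "__main__":
    residues = "MKVLAAGIE"      # made-up primary structure
    true_labels = "CCHHHHEEC"   # made-up correct secondary structure
    context = damage_labels(true_labels, damage_rate=0.3, rng=random.Random(0))
    print(window_features(residues, context, center=4, half_width=2))
```

Varying `damage_rate` between 0 and 1 gives context of controlled quality, which is one way to read the abstract's comparison between real context (output of an existing predictor) and simulated context (damaged correct classes).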
Digitization Specifications
  • File scanned at 300 ppi (Monochrome, 8-bit Grayscale) using ScandAll PRO 1.8.1 on a Fi-6670 in PDF format. CVista PdfCompressor 4.0 was used for PDF compression and textual OCR.