Walking tree heuristics for biological string alignment, gene location, and phylogenies Public Deposited

http://ir.library.oregonstate.edu/concern/technical_reports/wd375x462

Descriptions

Attribute NameValues
Creator
Abstract or Summary
  • Basic biological information is stored in strings of nucleic acids (DNA, RNA) or amino acids (proteins). Teasing out the meaning of these strings is a central problem of modern biology. Matching and aligning strings brings out their shared characteristics. Although string matching is well-understood in the edit-distance model, biological strings with transpositions and inversions violate this model's assumptions. We propose a family of heuristics called walking trees to align biologically reasonable strings. Both edit-distance and walking tree methods can locate specifi c genes within a large string when the genes' sequences are given. When we attempt to match whole strings, the walking tree matches most genes, while the edit-distance method fails. We also give examples in which the walking tree matches substrings even if they have been moved or inverted. The edit-distance method was not designed to handle these problems. We include an example in which the walking tree "discovered" a gene. Calculating scores for whole genome matches gives a method for approximating evolutionary distance. We show two evolutionary trees for the picornaviruses which were computed by the walking tree heuristic. Both of these trees show great similarity to previously constructed trees. The point of this demonstration is that WHOLE genomes can be matched and distances calculated. The first tree was created on a Sequent parallel computer and demonstrates that the walking tree heuristic can be effeciently parallelized. The second tree was created using a network of work stations and demonstrates that there is suffient parallelism in the phylogenetic tree calculation that the sequential walking tree can be used effectively on a network.
Resource Type
Date Available
Date Issued
Series
Keyword
Subject
Rights Statement
Publisher
Peer Reviewed
Language
Replaces
Additional Information
  • description.provenance : Made available in DSpace on 2012-02-14T23:19:49Z (GMT). No. of bitstreams: 1 Walking tree heuristics for biological string alignment gene location and phylogenies.pdf: 274919 bytes, checksum: 137326c5d6c39ed1d40c5966ff8c926d (MD5) Previous issue date: 1998
  • description.provenance : Submitted by Laura Wilson (laura.wilson@oregonstate.edu) on 2012-02-14T23:18:35Z No. of bitstreams: 1 Walking tree heuristics for biological string alignment gene location and phylogenies.pdf: 274919 bytes, checksum: 137326c5d6c39ed1d40c5966ff8c926d (MD5)
  • description.provenance : Approved for entry into archive by Laura Wilson(laura.wilson@oregonstate.edu) on 2012-02-14T23:19:49Z (GMT) No. of bitstreams: 1 Walking tree heuristics for biological string alignment gene location and phylogenies.pdf: 274919 bytes, checksum: 137326c5d6c39ed1d40c5966ff8c926d (MD5)

Relationships

In Administrative Set:
Last modified: 07/18/2017

Downloadable Content

Download PDF
Citations:

EndNote | Zotero | Mendeley

Items