Improved parallel and sequential walking tree methods for biological string alignments Public Deposited

http://ir.library.oregonstate.edu/concern/technical_reports/xd07gv26f

Descriptions

Attribute NameValues
Creator
Abstract or Summary
  • Approximate string matching is commonly used to align genetic sequences (DNA or RNA) to determine their shared characteristics. Most genetic string matching methods are based on the edit-distance model, which does not provide alignments for inversions and translocations. Recently, a heuristic called the Walking Tree Method [2, 3] has been developed to solve this problem. Unlike other heuristics, it can handle more than one level of inversion, i.e., inversions within inversions. Furthermore, it tends to capture the matched strings’ genes while other heuristics fail. There are two versions of the original walking tree heuristics: the score version gives only the alignment score, the alignment version gives both the score and the alignment mapping between the strings. The score version runs in quadratic time and uses linear space while the alignment version uses an extra log factor for time and space. In this paper, we will briefly describe the walking tree method and the original sequential and parallel algorithms. We will explain why different parallel algorithms are needed for a network of workstations rather than the original algorithm which worked well on a symmetric multi-processor. Our improved parallel method also led to a quadratic time sequential algorithm that uses less space. We give an example of our parallel method. We describe several experiments that show speedup linear in the number of processors, but eventual drop off in speedup as the communication network saturates. For big enough strings, we found linear speedup for all processors we had available. These results suggest that our improved parallel method will scale up as both the size of the problem and the number of processors increase. We include two figures that use real biological data and show that the walking tree methods can find translocations and inversions in DNA sequences and also discover unknown genes.
Resource Type
Date Available
Date Issued
Series
Keyword
Subject
Rights Statement
Publisher
Peer Reviewed
Language
Replaces
Additional Information
  • description.provenance : Submitted by Laura Wilson (laura.wilson@oregonstate.edu) on 2012-05-23T18:22:00Z No. of bitstreams: 1 Improved parallel and sequential walking tree methods for biological string alighments.pdf: 402234 bytes, checksum: 87ba0fad8380759c212e6e81ffbbc3ca (MD5)
  • description.provenance : Approved for entry into archive by Laura Wilson(laura.wilson@oregonstate.edu) on 2012-05-23T18:22:59Z (GMT) No. of bitstreams: 1 Improved parallel and sequential walking tree methods for biological string alighments.pdf: 402234 bytes, checksum: 87ba0fad8380759c212e6e81ffbbc3ca (MD5)
  • description.provenance : Made available in DSpace on 2012-05-23T18:23:00Z (GMT). No. of bitstreams: 1 Improved parallel and sequential walking tree methods for biological string alighments.pdf: 402234 bytes, checksum: 87ba0fad8380759c212e6e81ffbbc3ca (MD5) Previous issue date: 1999

Relationships

In Administrative Set:
Last modified: 07/18/2017

Downloadable Content

Download PDF
Citations:

EndNote | Zotero | Mendeley

Items