Alignment of nucleotide or amino acid sequences on microcomputers, using a modification of Sellers' (1974) algorithm which avoids the need for calculation of the complete distance matrix |
| |
Authors: | H Tyson B Haley |
| |
Affiliation: | Biology Department, McGill University, 1205 Av. Dr. Penfield, Montreal, Quebec H3A 1B1, Canada |
| |
Abstract: | A program to calculate optimum alignment between two sequences, which may be DNA, amino acid or other information, has been written in PASCAL. The Sellers' algorithm for calculating distance between sequences has been modified to reduce its demands on microcomputer memory space by more than half. Gap penalties and mismatch scores are user-adjustable. In 48 K of memory the program aligns sequences up to 170 elements in length; optimum alignment and total distance between a pair of sequences are displayed. The program aligns longer sequences by subdivision of both sequences into corresponding, overlapping sections. Section length and amount of section overlap are user-defined. More importantly, extension of this modification of Sellers' algorithm to align longer sequences, given hardware and compilers/languages capable of using a larger memory space (e.g. 640 K), shows that it is now possible to align, without subdivision, sequences with up to 700 elements each. The increase in computation time for this program with increasing sequence lengths aligned without subdivision is curvilinear, but total times are essentially dependent on hardware/language/compiler combinations. The statistical significance of an alignment is examined with conventional Monte Carlo approaches. |
| |
Keywords: | Sequence alignment Microcomputer Seller's algorithm Monte Carlo methods |
本文献已被 ScienceDirect 等数据库收录! |
|