Protein multiple alignment by sequence annealing
amap [OPTION] [MFAFILE] [MFAFILE]
AMAP is a tool for multiple alignment of peptide sequences. It uses posterior decoding and sequence annealing instead of the traditional progressive alignment method. It is the only alignment program that allows one to control the sensitivity / specificity tradeoff. It is based on the ProbCons source code, but it optimizes alignment metric accuracy and eliminates the consistency transformation.
In its default configuration, AMAP is tuned to maximize the expected Alignment Metric Accuracy (AMA) score, a new alignment accuracy measure based on a metric for the multiple-alignment space, which integrates sensitivity and specificity into a single balanced measure. AMA is defined as the fraction of correctly aligned residues (either to another residue or to a gap) out of the total number of residues in all the sequences.
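The AMA definition above can be illustrated for the simplest case of two sequences. The following is a minimal sketch, not AMAP's own implementation: the function names partners and pairwise_ama are hypothetical, the gap character is assumed to be "-", and how the score is combined over more than two sequences is not covered here.

```python
def partners(row_a, row_b, gap="-"):
    """For each residue of row_a, return the index of the residue of row_b
    it is aligned to, or None if it is aligned to a gap."""
    if len(row_a) != len(row_b):
        raise ValueError("alignment rows must have the same length")
    result = []
    seen_b = 0  # number of residues of row_b consumed so far
    for char_a, char_b in zip(row_a, row_b):
        if char_a != gap:
            result.append(seen_b if char_b != gap else None)
        if char_b != gap:
            seen_b += 1
    return result


def pairwise_ama(predicted, reference):
    """Fraction of residues (in both sequences) whose partner, a residue or
    a gap, is the same in the predicted and the reference alignment."""
    correct = 0
    total = 0
    # Score residues of the first sequence, then residues of the second.
    for first, second in ((0, 1), (1, 0)):
        pred = partners(predicted[first], predicted[second])
        ref = partners(reference[first], reference[second])
        if len(pred) != len(ref):
            raise ValueError("alignments are over different sequences")
        correct += sum(p == r for p, r in zip(pred, ref))
        total += len(ref)
    return correct / total if total else 1.0


# Hypothetical toy alignments of the sequences ACGT and ACT.
reference = ("ACGT", "AC-T")
predicted = ("ACGT", "A-CT")
score = pairwise_ama(predicted, reference)
```

In this toy case four of the seven residues keep the same partner, so the score is 4/7. Residues correctly aligned to a gap count as correct, which is what distinguishes AMA from measures that only score residue pairs.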
amap aligns sequences provided in MFA (multi-FASTA) format. A file in this format contains multiple sequences. Each sequence begins with a single-line description, followed by lines of sequence data. The description line is distinguished from the sequence data by a greater-than (“>”) symbol in the first column.
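The format described above can be read with a few lines of code. This is a minimal sketch for checking input files before alignment; the function name parse_mfa is hypothetical, and skipping blank lines is an assumption rather than something amap is documented to do.

```python
def parse_mfa(text):
    """Parse MFA text into a list of (description, sequence) tuples."""
    records = []
    description = None
    chunks = []
    for line in text.splitlines():
        line = line.rstrip()
        if not line:
            continue  # assumption: blank lines carry no data
        if line.startswith(">"):
            # A ">" in the first column starts a new record.
            if description is not None:
                records.append((description, "".join(chunks)))
            description = line[1:].strip()
            chunks = []
        elif description is None:
            raise ValueError("sequence data found before the first '>' line")
        else:
            chunks.append(line.strip())
    if description is not None:
        records.append((description, "".join(chunks)))
    return records


# Hypothetical two-sequence input; sequence data may span several lines.
example = ">seq1 first peptide\nMKV\nLAA\n>seq2\nMKLA\n"
records = parse_mfa(example)
```

Sequence lines belonging to one record are concatenated, so line wrapping in the input does not affect the parsed sequence.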
-clustalw
use CLUSTALW output format instead of MFA
-c --consistency REPS
use 0 <= REPS <= 5 (default: 0) passes of consistency transformation
-ir --iterative-refinement REPS
use 0 <= REPS <= 1000 (default: 0) passes of iterative refinement
-pre --pre-training REPS
use 0 <= REPS <= 20 (default: 0) rounds of pretraining
-pairs
generate all-pairs pairwise alignments
-viterbi
use Viterbi algorithm to generate all pairs (automatically enables -pairs)
-v --verbose
report progress while aligning (default: off)
-annot FILENAME
write annotation for multiple alignment to FILENAME
-t --train FILENAME
compute EM transition probabilities, store in FILENAME (default: no training)
-e --emissions
also reestimate emission probabilities (default: off)
-p --paramfile FILENAME
read parameters from FILENAME (default: )
-a --alignment-order
print sequences in alignment order rather than input order (default: off)
-g --gap-factor GF
use GF as the gap-factor parameter; set to 0 for best sensitivity, or to higher values for better specificity (default: 0.5)
-w --edge-weight-threshold W
stop the sequence annealing process when the best edge has a lower weight than W; set to 0 for best sensitivity, or to higher values for better specificity (default: 0)
-prog --progressive
use progressive alignment instead of sequence annealing alignment (default: off)
-noreorder --no-edge-reordering
disable reordering of edges during sequence annealing alignment (default: off)
-maxstep --use-max-stepsize
use maximum improvement step size instead of tGf edge ranking (default: off)
-print --print-posteriors
only print the posterior probability matrices (default: off)
-gui START STEP
print output for the AMAP Display Java based GUI (default: ) starting at weight START (default: infinity) with step size STEP (default: )
To run AMAP with the default options, change to the align directory and type:
% amap <multi-fasta-file-name>
If no file name is provided, the list of options is printed.
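When comparing several settings of the sensitivity / specificity tradeoff, it can help to generate the command lines programmatically. The following is a sketch only: the helper amap_command is hypothetical, it uses just the -g, -w and -clustalw options listed above, it assumes options are given before the input file, and it builds argument lists without running anything.

```python
def amap_command(mfa_file, gap_factor=None, edge_weight_threshold=None,
                 clustalw=False, program="amap"):
    """Build an argument list for one amap run (nothing is executed)."""
    command = [program]
    if clustalw:
        command.append("-clustalw")
    if gap_factor is not None:
        command += ["-g", str(gap_factor)]
    if edge_weight_threshold is not None:
        command += ["-w", str(edge_weight_threshold)]
    command.append(mfa_file)
    return command


# Sweep the gap factor from most sensitive (0) towards more specific values.
# The chosen values are illustrative, not recommendations.
sweep = [amap_command("examples/BB12020.tfa", gap_factor=g)
         for g in (0, 0.5, 1, 2)]
```

Each list can be passed to subprocess.run on a system where amap is installed. The program parameter allows a different executable name, such as amap-align.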
To use the AMAP Display, run AMAP with the -gui option and save the output to a file, then use that file as the input to AmapDisplay. For example, type:
% align/amap -gui examples/BB12020.tfa > examples/BB12020.tfa.out
% java -jar display/AmapDisplay.jar examples/BB12020.tfa.out
(On Debian systems, the examples directory is in /usr/share/doc/amap-align/examples.)
In older versions (< 2.0-1) of the package for Debian(TM) systems, the amap command was renamed amap-align because there was already another tool called amap (which performs some computer network diagnostics). A symbolic link amap-align is still provided for upgrade purposes but will be removed in Debian releases after Etch (Debian 4.0).
The current version of AMAP uses the PROBCONS 1.09 code base for some of the input/output procedures, and for the calculation of posterior probabilities (see PROBCONS.README in /usr/share/doc/amap-align/). Future releases might implement the algorithm using a new independent code base.
On Debian(TM) systems, probcons(1) is available in the probcons package.
For more details on AMAP and AMA, see Schwartz, Ariel S., Myers, Eugene W., and Pachter, Lior. Alignment Metric Accuracy (Submitted for publication). For more details on sequence-annealing, see Schwartz, Ariel S. and Pachter, Lior. Multiple Alignment by Sequence Annealing (Submitted for publication).
PROBCONS was published in Do, C.B., Mahabhashyam, M.S.P., Brudno, M., and Batzoglou, S. 2005. PROBCONS: Probabilistic Consistency-based Multiple Sequence Alignment. Genome Research 15: 330-340.
Ariel Schwartz <[email protected]>
Upstream author of AMAP
Chuong Do
Wrote Probcons, on which AMAP is based.
Charles Plessy <[email protected]>
Wrote this manpage in DocBook XML for the Debian distribution.
AMAP, PROBCONS, and this manual page have been made freely available as PUBLIC DOMAIN software and hence are not subject to copyright in the United States. This system and/or any portion of the source code may be used, modified, or redistributed without restrictions. AMAP, PROBCONS and this manual page are distributed WITHOUT WARRANTY, express or implied. The authors accept NO LEGAL LIABILITY OR RESPONSIBILITY for loss due to reliance on the program.