amap: Protein multiple alignment by sequence annealing

SYNOPSIS

amap [OPTION] [MFAFILE] [MFAFILE]

DESCRIPTION

AMAP is a tool for multiple alignment of peptide sequences. It uses posterior decoding and sequence annealing instead of the traditional progressive alignment method. It is the only alignment program that allows one to control the sensitivity/specificity tradeoff (see the -g and -w options). It is based on the ProbCons source code, but it optimizes alignment metric accuracy and, by default, does not apply the consistency transformation.

In its default configuration, AMAP is tuned to maximize the expected Alignment Metric Accuracy (AMA) score. AMA is a new alignment accuracy measure, based on a metric for the multiple-alignment space, which integrates sensitivity and specificity into a single balanced measure. It is defined as the fraction of correctly aligned residues (aligned either to another residue or to a gap) out of the total number of residues in all the sequences.
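
As a rough illustration of the definition above, the following Python sketch scores a predicted alignment against a known reference, for the pairwise case only. The function names (residue_partners, pairwise_ama) and the use of "-" as the gap character are assumptions made for this example and are not part of AMAP. AMAP itself works with the expected AMA, so this sketch shows the measure, not the program's algorithm.

```python
GAP = "-"  # assumed gap character for this sketch


def residue_partners(row, other):
    """For each residue in `row`, return the index of the residue it is
    aligned to in `other`, or None if it is aligned to a gap."""
    partners = []
    other_index = 0  # index of the next residue in `other`
    for a, b in zip(row, other):
        if a != GAP:
            partners.append(other_index if b != GAP else None)
        if b != GAP:
            other_index += 1
    return partners


def pairwise_ama(predicted, reference):
    """Fraction of residues (over both sequences) whose partner, a residue
    or a gap, is the same in `predicted` as in `reference`."""
    pred_a, pred_b = predicted
    ref_a, ref_b = reference
    if len(pred_a) != len(pred_b) or len(ref_a) != len(ref_b):
        raise ValueError("rows of an alignment must have equal length")
    # Both alignments must contain the same underlying sequences.
    if (pred_a.replace(GAP, "") != ref_a.replace(GAP, "")
            or pred_b.replace(GAP, "") != ref_b.replace(GAP, "")):
        raise ValueError("alignments are not over the same sequences")
    pred = residue_partners(pred_a, pred_b) + residue_partners(pred_b, pred_a)
    ref = residue_partners(ref_a, ref_b) + residue_partners(ref_b, ref_a)
    if not ref:
        raise ValueError("alignments contain no residues")
    correct = sum(p == r for p, r in zip(pred, ref))
    return correct / len(ref)
```

For example, with the reference ("ACG-", "A-GT") and the ungapped prediction ("ACG", "AGT"), only the leading residue of each sequence keeps its partner, so pairwise_ama returns 2/6.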

amap aligns sequences provided in MFA format. This format consists of multiple sequences. Each sequence in MFA format begins with a single-line description, followed by lines of sequence data. The description line is distinguished from the sequence data by a greater-than (“>”) symbol in the first column.
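
To make the format description concrete, here is a minimal Python sketch that reads MFA text into (description, sequence) pairs. The function name parse_mfa and the sample sequences are hypothetical and chosen only for illustration; amap has its own reader, which may be stricter or more lenient than this one.

```python
def parse_mfa(text):
    """Parse MFA (multi-FASTA) text into a list of (description, sequence)."""
    records = []
    description = None
    chunks = []
    for raw in text.splitlines():
        line = raw.rstrip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A ">" in the first column starts a new record.
            if description is not None:
                records.append((description, "".join(chunks)))
            description = line[1:].strip()
            chunks = []
        elif description is None:
            raise ValueError("sequence data found before any description line")
        else:
            chunks.append(line.strip())
    if description is not None:
        records.append((description, "".join(chunks)))
    return records


# Hypothetical two-sequence input, with the first sequence split over two lines.
SAMPLE = """>seq1 first example peptide
MKTAYIAKQR
QISFVKSHFS
>seq2 second example peptide
MKTAHIAKQR
"""
```

Calling parse_mfa(SAMPLE) yields two records, with the two data lines of seq1 joined into a single sequence string.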

OPTIONS

-clustalw

use CLUSTALW output format instead of MFA

-c --consistency REPS

use 0 <= REPS <= 5 (default: 0) passes of consistency transformation

-ir --iterative-refinement REPS

use 0 <= REPS <= 1000 (default: 0) passes of iterative-refinement

-pre --pre-training REPS

use 0 <= REPS <= 20 (default: 0) rounds of pretraining

-pairs

generate all-pairs pairwise alignments

-viterbi

use Viterbi algorithm to generate all pairs (automatically enables -pairs)

-v --verbose

report progress while aligning (default: off)

-annot FILENAME

write annotation for multiple alignment to FILENAME

-t --train FILENAME

compute EM transition probabilities, store in FILENAME (default: no training)

-e --emissions

also reestimate emission probabilities (default: off)

-p --paramfile FILENAME

read parameters from FILENAME (default: )

-a --alignment-order

print sequences in alignment order rather than input order (default: off)

-g --gap-factor GF

use GF as the gap-factor parameter, set to 0 for best sensitivity, higher values for better specificity (default: 0.5)

-w --edge-weight-threshold W

stop the sequence annealing process when best edge has lower weight than W, set to 0 for best sensitivity, higher values for better specificity (default: 0)

-prog --progressive

use progressive alignment instead of sequence annealing alignment (default: off)

-noreorder --no-edge-reordering

disable reordering of edges during sequence annealing alignment (default: off)

-maxstep --use-max-stepsize

use maximum improvement step size instead of tGf edge ranking (default: off)

-print --print-posteriors

only print the posterior probability matrices (default: off)

-gui START STEP

print output for the AMAP Display Java based GUI (default: ) starting at weight START (default: infinity) with step size STEP (default: )

EXAMPLES

To run AMAP with the default options, change to the align directory and type:

% amap <multi-fasta-file-name>

If no file name is provided, the list of options is printed.
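
The -g and -w options described under OPTIONS control the sensitivity/specificity tradeoff. The Python sketch below only builds command lines for a small sweep over gap-factor values; it does not run amap. The helper name amap_command, the file name input.mfa, and the chosen values are assumptions for illustration. Only the -g and -w flags come from this manual page.

```python
import shlex


def amap_command(mfa_path, gap_factor=None, edge_weight_threshold=None,
                 program="amap"):
    """Build an argument list for amap; options left as None are omitted
    so that amap's own defaults apply."""
    command = [program]
    if gap_factor is not None:
        command += ["-g", str(gap_factor)]
    if edge_weight_threshold is not None:
        command += ["-w", str(edge_weight_threshold)]
    command.append(mfa_path)
    return command


def gap_factor_sweep(mfa_path, gap_factors=(0, 0.5, 2)):
    """One command per gap factor: 0 favours sensitivity, higher values
    favour specificity."""
    return [amap_command(mfa_path, gap_factor=g) for g in gap_factors]


if __name__ == "__main__":
    for command in gap_factor_sweep("input.mfa"):
        # Print shell-quoted commands; pass the list to subprocess.run to execute.
        print(shlex.join(command))
```

Keeping the commands as argument lists avoids shell quoting problems if they are later passed to subprocess.run.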

To use the AMAP Display, run AMAP with the -gui option and save the output to a file, then use that file as the input to AmapDisplay. For example, type:

% align/amap -gui examples/BB12020.tfa > examples/BB12020.tfa.out

% java -jar display/AmapDisplay.jar examples/BB12020.tfa.out

(On Debian systems, the examples directory is /usr/share/doc/amap-align/examples.)

NOTE

In older versions (< 2.0-1) of the package for Debian(TM) systems, the amap command was renamed amap-align because there was already another tool called amap (which performs some computer network diagnostics). A symbolic link named amap-align is still provided for upgrade purposes, but it will be removed in Debian releases after Etch (Debian 4.0).

SEE ALSO

The current version of AMAP uses the PROBCONS 1.09 code base for some of the input/output procedures, and for the calculation of posterior probabilities (see PROBCONS.README in /usr/share/doc/amap-align/). Future releases might implement the algorithm using a new independent code base.

On Debian(TM) systems, probcons(1) is available in the probcons package.

REFERENCES

For more details on AMAP and AMA, see Schwartz, Ariel S., Myers, Eugene W., and Pachter, Lior. Alignment Metric Accuracy (Submitted for publication). For more details on sequence-annealing, see Schwartz, Ariel S. and Pachter, Lior. Multiple Alignment by Sequence Annealing (Submitted for publication).

PROBCONS was published in Do, C.B., Mahabhashyam, M.S.P., Brudno, M., and Batzoglou, S. 2005. PROBCONS: Probabilistic Consistency-based Multiple Sequence Alignment. Genome Research 15: 330-340.

AUTHORS

Ariel Schwartz <[email protected]>

Upstream author of AMAP

Chuong Do

Wrote Probcons, on which AMAP is based.

Charles Plessy <[email protected]>

Wrote this manpage in DocBook XML for the Debian distribution.

COPYRIGHT

AMAP, PROBCONS, and this manual page have been made freely available as PUBLIC DOMAIN software and hence are not subject to copyright in the United States. This system and/or any portion of the source code may be used, modified, or redistributed without restrictions. AMAP, PROBCONS and this manual page are distributed WITHOUT WARRANTY, express or implied. The authors accept NO LEGAL LIABILITY OR RESPONSIBILITY for loss due to reliance on the program.

amap (1)