Multiple alignment of nucleic acid and protein sequences
clustalw [-infile] file.ext [OPTIONS]
clustalw [-help | -fullhelp]
Clustal W is a general purpose multiple alignment program for DNA or proteins.
The program performs simultaneous alignment of many nucleotide or amino acid sequences. It is typically run interactively, through a menu with online help. To use it in command-line (batch) mode instead, you must give several options, the minimum being -infile.
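As a rough illustration of batch mode, the following Python sketch assembles an argument list from an input file, a few switches and some -name=value options. The helper name build_command and the file names seqs.fasta and seqs.aln are hypothetical; only the flags themselves come from this page. The sketch builds the command line but does not run the program.

```python
def build_command(infile, flags=(), **options):
    """Return an argument list for a batch-mode clustalw run.

    `flags` are value-less switches (for example "align"); `options`
    become -name=value pairs. Nothing is checked against clustalw itself.
    """
    cmd = ["clustalw", f"-infile={infile}"]
    cmd += [f"-{name}" for name in flags]
    cmd += [f"-{name}={value}" for name, value in options.items()]
    return cmd


# Hypothetical file names; -align and -outfile are documented on this page.
cmd = build_command("seqs.fasta", flags=["align"], outfile="seqs.aln")
print(" ".join(cmd))
```

If clustalw is installed, a list built this way could be handed to subprocess.run(cmd, check=True); passing a list rather than a single string avoids shell quoting problems with file names.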
-infile=file.ext
Input sequences.
-profile1=file.ext and -profile2=file.ext
Profiles (old alignment)
-options
List the command line parameters.
-help or -check
Outline the command line parameters.
-fullhelp
Output full help content.
-align
Do full multiple alignment.
-tree
Calculate NJ tree.
-pim
Output percent identity matrix (while calculating the tree).
-bootstrap=n
Bootstrap an NJ tree (n = number of bootstraps; default = 1000).
-convert
Output the input sequences in a different file format.
General settings:
-interactive
Read command line, then enter normal interactive menus.
-quicktree
Use FAST algorithm for the alignment guide tree.
-type=
PROTEIN or DNA sequences.
-negative
Protein alignment with negative values in matrix.
-outfile=
Sequence alignment file name.
-output=
GCG, GDE, PHYLIP, PIR or NEXUS.
-outputorder=
INPUT or ALIGNED.
-case
LOWER or UPPER (for GDE output only).
-seqnos=
OFF or ON (for Clustal output only).
-seqnos_range=
OFF or ON (NEW: for all output formats).
-range=m,n
Sequence range to write starting m to m+n.
-maxseqlen=n
Maximum allowed input sequence length.
-quiet
Reduce console output to minimum.
-stats=file
Log some alignment statistics to file.
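Several of the general settings take a value from a short fixed list. The sketch below shows one possible way to check such values before composing the arguments. The table ALLOWED mirrors only the values listed on this page (the program itself may accept others), and the function name general_settings and the file name seqs.phy are hypothetical.

```python
# Allowed values as listed on this page; the program itself may accept more.
ALLOWED = {
    "type": {"PROTEIN", "DNA"},
    "output": {"GCG", "GDE", "PHYLIP", "PIR", "NEXUS"},
    "outputorder": {"INPUT", "ALIGNED"},
    "seqnos": {"OFF", "ON"},
}


def general_settings(**settings):
    """Turn keyword settings into -name=value arguments.

    Settings named in ALLOWED are upper-cased and checked; all other
    settings (file names, numbers) are passed through unchanged.
    """
    args = []
    for name, value in settings.items():
        if name in ALLOWED:
            value = str(value).upper()
            if value not in ALLOWED[name]:
                choices = ", ".join(sorted(ALLOWED[name]))
                raise ValueError(f"-{name} must be one of: {choices}")
        args.append(f"-{name}={value}")
    return args


args = general_settings(type="protein", output="phylip", outfile="seqs.phy")
print(args)
```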
Fast Pairwise Alignments:
-ktuple=n
Word size.
-topdiags=n
Number of best diagonals.
-window=n
Window around best diagonals.
-pairgap=n
Gap penalty.
-score
PERCENT or ABSOLUTE.
Slow Pairwise Alignments:
-pwmatrix=
Protein weight matrix=BLOSUM, PAM, GONNET, ID or filename.
-pwdnamatrix=
DNA weight matrix=IUB, CLUSTALW or filename.
-pwgapopen=f
Gap opening penalty.
-pwgapext=f
Gap extension penalty.
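The fast and slow pairwise stages each have their own parameters. The sketch below keeps the two groups apart when composing arguments. It assumes that the fast parameters are the ones that matter when -quicktree is given and that the slow ones apply otherwise; this page does not state that explicitly. The function name pairwise_args and all numeric values are illustrative only.

```python
FAST_NAMES = {"ktuple", "topdiags", "window", "pairgap", "score"}
SLOW_NAMES = {"pwmatrix", "pwdnamatrix", "pwgapopen", "pwgapext"}


def pairwise_args(fast, **params):
    """Return pairwise-stage arguments for the fast or the slow method.

    Assumption: -quicktree selects the fast method, so it is added
    automatically when `fast` is true.
    """
    allowed = FAST_NAMES if fast else SLOW_NAMES
    unknown = set(params) - allowed
    if unknown:
        raise ValueError(f"not valid for this method: {sorted(unknown)}")
    args = ["-quicktree"] if fast else []
    args += [f"-{name}={value}" for name, value in params.items()]
    return args


# Illustrative values, not recommendations.
fast_args = pairwise_args(True, ktuple=2, window=4)
slow_args = pairwise_args(False, pwmatrix="GONNET", pwgapopen=10.0, pwgapext=0.1)
print(fast_args)
print(slow_args)
```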
Multiple Alignments:
-newtree=
File for new guide tree.
-usetree=
File for old guide tree.
-matrix=
Protein weight matrix=BLOSUM, PAM, GONNET, ID or filename.
-dnamatrix=
DNA weight matrix=IUB, CLUSTALW or filename.
-gapopen=f
Gap opening penalty.
-gapext=f
Gap extension penalty.
-endgaps
No end gap separation penalty.
-gapdist=n
Gap separation penalty range.
-nogap
Residue-specific gaps off.
-nohgap
Hydrophilic gaps off.
-hgapresidues=
List of hydrophilic residues.
-maxdiv=n
Percent identity for delay.
-type=
PROTEIN or DNA.
-transweight=f
Transitions weighting.
-iteration=
NONE or TREE or ALIGNMENT.
-numiter=n
Maximum number of iterations to perform.
Profile Alignments:
-profile
Merge two alignments by profile alignment.
-newtree1=
File for new guide tree for profile1.
-newtree2=
File for new guide tree for profile2.
-usetree1=
File for old guide tree for profile1.
-usetree2=
File for old guide tree for profile2.
Sequence to Profile Alignments:
-sequences
Sequentially add profile2 sequences to profile1 alignment.
-newtree=
File for new guide tree.
-usetree=
File for old guide tree.
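Both profile modes take their inputs from -profile1 and -profile2 and differ only in the switch that selects the action. A minimal sketch of that choice, with a hypothetical helper profile_command and hypothetical file names:

```python
def profile_command(profile1, profile2, add_sequences=False, **options):
    """Compose a profile-mode command line.

    With add_sequences=False the two alignments are merged (-profile);
    with add_sequences=True the sequences of profile2 are added one by
    one to profile1 (-sequences).
    """
    mode = "-sequences" if add_sequences else "-profile"
    cmd = ["clustalw", f"-profile1={profile1}", f"-profile2={profile2}", mode]
    cmd += [f"-{name}={value}" for name, value in options.items()]
    return cmd


# Merge two existing alignments, writing a new guide tree for each.
merge = profile_command("family_a.aln", "family_b.aln",
                        newtree1="a.dnd", newtree2="b.dnd")

# Add unaligned sequences to an existing alignment.
extend = profile_command("family_a.aln", "new_seqs.fasta", add_sequences=True)

print(" ".join(merge))
print(" ".join(extend))
```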
Structure Alignments:
-nosecstr1
Do not use secondary structure-gap penalty mask for profile 1.
-nosecstr2
Do not use secondary structure-gap penalty mask for profile 2.
-secstrout=STRUCTURE or MASK or BOTH or NONE
Output in alignment file.
-helixgap=n
Gap penalty for helix core residues.
-strandgap=n
Gap penalty for strand core residues.
-loopgap=n
Gap penalty for loop regions.
-terminalgap=n
Gap penalty for structure termini.
-helixendin=n
Number of residues inside helix to be treated as terminal.
-helixendout=n
Number of residues outside helix to be treated as terminal.
-strandendin=n
Number of residues inside strand to be treated as terminal.
-strandendout=n
Number of residues outside strand to be treated as terminal.
Trees:
-outputtree=nj OR phylip OR dist OR nexus
-seed=n
Seed number for bootstraps.
-kimura
Use Kimura's correction.
-tossgaps
Ignore positions with gaps.
-bootlabels=node
Position of bootstrap values in tree display.
-clustering=
NJ or UPGMA.
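The tree options combine with either -tree or -bootstrap. The following sketch composes such a command line from an existing alignment. The helper tree_command, the file name seqs.aln and the seed value are hypothetical; the flags are the ones documented above.

```python
import shlex


def tree_command(alignment, bootstraps=None, seed=None,
                 kimura=False, tossgaps=False, outputtree=None):
    """Compose a tree-building command line.

    Uses -tree when `bootstraps` is None, otherwise -bootstrap=n.
    """
    cmd = ["clustalw", f"-infile={alignment}"]
    cmd.append("-tree" if bootstraps is None else f"-bootstrap={bootstraps}")
    if seed is not None:
        cmd.append(f"-seed={seed}")
    if kimura:
        cmd.append("-kimura")
    if tossgaps:
        cmd.append("-tossgaps")
    if outputtree is not None:
        cmd.append(f"-outputtree={outputtree}")
    return cmd


cmd = tree_command("seqs.aln", bootstraps=1000, seed=111,
                   kimura=True, outputtree="phylip")
print(shlex.join(cmd))  # shell-quoted form, suitable for logging
```

Fixing -seed is one way to make a bootstrap run repeatable, since the resampling then starts from the same random number seed each time.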
The Clustal bug tracking system can be found at http://bioinf.ucd.ie/bugzilla/buglist.cgi?quicksearch=clustal.
Larkin MA, Blackshields G, Brown NP, Chenna R, McGettigan PA, McWilliam H, Valentin F, Wallace IM, Wilm A, Lopez R, Thompson JD, Gibson TJ, Higgins DG. (2007). Clustal W and Clustal X version 2.0. [1] Bioinformatics, 23, 2947-2948.
Chenna R, Sugawara H, Koike T, Lopez R, Gibson TJ, Higgins DG, Thompson JD. (2003). Multiple sequence alignment with the Clustal series of programs. [2] Nucleic Acids Res., 31, 3497-3500.
Jeanmougin F, Thompson JD, Gouy M, Higgins DG, Gibson TJ. (1998). Multiple sequence alignment with Clustal X. [3] Trends Biochem Sci., 23, 403-405.
Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG. (1997). The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. [4] Nucleic Acids Res., 25, 4876-4882.
Higgins DG, Thompson JD, Gibson TJ. (1996). Using CLUSTAL for multiple sequence alignments. [5] Methods Enzymol., 266, 383-402.
Thompson JD, Higgins DG, Gibson TJ. (1994). CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. [6] Nucleic Acids Res., 22, 4673-4680.
Higgins DG. (1994). CLUSTAL V: multiple alignment of DNA and protein sequences. [7] Methods Mol Biol., 25, 307-318.
Higgins DG, Bleasby AJ, Fuchs R. (1992). CLUSTAL V: improved software for multiple sequence alignment. [8] Comput. Appl. Biosci., 8, 189-191.
Higgins DG, Sharp PM. (1989). Fast and sensitive multiple sequence alignments on a microcomputer. [9] Comput. Appl. Biosci., 5, 151-153.
Higgins DG, Sharp PM. (1988). CLUSTAL: a package for performing multiple sequence alignment on a microcomputer. [10] Gene, 73, 237-244.
Des Higgins
Copyright holder for Clustal.
Julie Thompson
Copyright holder for Clustal.
Toby Gibson
Copyright holder for Clustal.
Charles Plessy <[email protected]>
Prepared this manpage in DocBook XML for the Debian distribution.
Copyright © 1988–2010 Des Higgins, Julie Thompson & Toby Gibson (Clustal)
Copyright © 2008–2010 Charles Plessy (This manpage)
This program is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public License along with this program. If not, see http://www.gnu.org/licenses/, or on Debian systems, /usr/share/common-licenses/LGPL-3.
This manual page and its XML source can be used, modified, and redistributed as if they were in the public domain.
1. Clustal W and Clustal X version 2.0.
   http://www.ncbi.nlm.nih.gov/pubmed/17846036
2. Multiple sequence alignment with the Clustal series of programs.
   http://www.ncbi.nlm.nih.gov/pubmed/12824352
3. Multiple sequence alignment with Clustal X
   http://www.ncbi.nlm.nih.gov/pubmed/9810230
4. The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools.
   http://www.ncbi.nlm.nih.gov/pubmed/9396791
5. Using CLUSTAL for multiple sequence alignments.
   http://www.ncbi.nlm.nih.gov/pubmed/8743695
6. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice.
   http://www.ncbi.nlm.nih.gov/pubmed/7984417
7. CLUSTAL V: multiple alignment of DNA and protein sequences.
   http://www.ncbi.nlm.nih.gov/pubmed/8004173
8. CLUSTAL V: improved software for multiple sequence alignment.
   http://www.ncbi.nlm.nih.gov/pubmed/1591615
9. Fast and sensitive multiple sequence alignments on a microcomputer.
   http://www.ncbi.nlm.nih.gov/pubmed/2720464
10. CLUSTAL: a package for performing multiple sequence alignment on a microcomputer.
   http://www.ncbi.nlm.nih.gov/pubmed/3243435