Clean up irregularities in NCBI ASN.1 objects
cleanasn [-] [-A filename] [-C str] [-D str] [-F str] [-K str] [-L filename] [-M filename] [-N str] [-P str] [-Q str] [-R] [-S str] [-T] [-U str] [-V str] [-X str] [-Z str] [-a str] [-b] [-c] [-d str] [-f str] [-i filename] [-j filename] [-k filename] [-m str] [-n path] [-o filename] [-p path] [-q path] [-r path] [-v path] [-x ext]
cleanasn is a utility program to clean up irregularities in NCBI ASN.1 objects.
A summary of options is included below.
-
Print usage message
-A filename
Accession list file
-C str
Sequence operations, per the flags in str:
Compress
Decompress
Virtual gaps inside segmented sequence
Convert segmented set to delta sequence
-D str
Clean up descriptors, per the flags in str:
Remove Title
Remove Comment
Remove Nuc-Prot Set title
Remove Pop/Phy/Mut/Eco Set title
Remove mRNA title
Remove Protein title
-F str
Clean up features, per the flags in str:
Remove User-objects
Remove db_xrefs
Remove /evidence and /inference
Remove redundant gene xrefs
Fuse duplicate features
Package coding-region or parts features
Delete or update EC numbers
-K str
Perform a general cleanup, per the flags in str:
BasicSeqEntryCleanup
C++ BasicCleanup (via an external utility)
SeriousSeqEntryCleanup
GpipeSeqEntryCleanup
Normalize descriptor order
Remove NcbiCleanup User Objects
Synchronize genetic Codes
Resynchronize CDS partials
Resynchronize mRNA partials
Resynchronize Peptide partials
Adjust consensus splice
Promote to "worst" Seq-ID
-L filename
Log file
-M filename
Macro file
-N str
Clean up links, per the flags in str:
Link CDS mRNA by Overlap
Link CDS mRNA by Product
Reassign feature IDs
Fix missing reciprocal feature IDs
Clear feature IDs
-P str
Publication options, per the flags in str:
Remove All publications
Remove Serial number
Remove Figure, numbering, and name
Remove Remark
Update PMID-only publication
Replace unpublished with PMID
-Q str
Report:
Record count
ASN.1 BSEC report
ASN.1 SSEC report
NORM vs. SSEC report
PopPhyMutEco AutoDef report
Overlap report
Latitude-longitude country diff
Log SSEC differences
GenBank SSEC diff
asn2gb/asn2flat diff
Seg-to-delta GenBank diff
Validator SSEC diff
Modernize Gene/RNA/PCR
Unpublished Pub lookup
Published Pub lookup
Unindexed Journal report
Custom scan
-R
Remote fetching from ID (NCBI sequence databases)
-S str
Selective difference filter, per the flags in str (capital letters skip):
SSEC
BSEC
Author
Publication
Location
RNA
Qualifier sort order
GenBank block
Package CdRegion or parts features
Move publication
Leave duplicate Bioseq publication
Automatic definition line
Pop/Phy/Mut/Eco Set definition line
-T
Taxonomy Lookup
-U str
Modernize, per the flags in str:
Genes
RNA
PCR Primers
-V str
Remove features by validator severity:
Reject
Error
Warning
Info
-X str
Miscellaneous options, per str:
Automatic definition line
Pop/Phy/Mut/Eco Set definition line
Instantiate NC title
Instantiate NM titles
Special XM titles
Instantiate Protein titles
Create mRNAs for coding sequences
Fix reciprocal protein_id/transcript_id
-Z str
Remove indicated User-object
-a str
ASN.1 type
Any (default)
Seq-entry
Bioseq
Bioseq-set
Seq-submit
Batch Processing [String]
-b
Input ASN.1 is Binary
-c
Input ASN.1 is Compressed
-d str
Source database
Any (default)
GenBank
EMBL
DDBJ
EMBL or DDBJ
RefSeq
NCBI
Only segmented sequences
Exclude segmented sequences
Exclude EMBL/DDBJ
Exclude gbcon, gbest, gbgss, gbhtg, gbpat, gbsts
-f str
Substring filter
-i filename
Single input file (defaults to stdin)
-j filename
First filename
-k filename
Last filename
-m str
Flatfile mode:
Release
Entrez
Sequin
Dump
-n path
asn2flat executable (default is /netopt/ncbi_tools/bin/asn2flat)
-o filename
Single output file (defaults to stdout)
-p path
Process all matching files in path
-q path
ffdiff executable (default is /netopt/genbank/subtool/bin/ffdiff)
-r path
Path for results
-v path
asnval executable (default is /netopt/ncbi_tools/bin/asnval)
-x ext
File selection suffix for use with -p (defaults to .ent)
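The file-handling options above can be combined in two ways: a single input and output file (-i, -o), or every matching file in a directory (-p, -r, -x). The following is a minimal Python sketch that only assembles the argument list and does not run the program. The helper name build_cleanasn_command, its parameter names, and the sample file and directory names are hypothetical, and it assumes the executable is reachable as cleanasn on PATH. It covers only the file-handling options; options that take a flag string, such as -K or -F, are left out.

```python
import shlex
from typing import List, Optional


def build_cleanasn_command(
    input_file: Optional[str] = None,
    output_file: Optional[str] = None,
    input_dir: Optional[str] = None,
    results_dir: Optional[str] = None,
    suffix: Optional[str] = None,
    log_file: Optional[str] = None,
    binary: bool = False,
    compressed: bool = False,
    executable: str = "cleanasn",  # assumption: the program is on PATH
) -> List[str]:
    """Assemble an argument list from options listed in the summary above."""
    cmd = [executable]
    # Value-taking options; None means "leave the program's default in place".
    for flag, value in (
        ("-i", input_file),
        ("-o", output_file),
        ("-p", input_dir),
        ("-r", results_dir),
        ("-x", suffix),
        ("-L", log_file),
    ):
        if value is not None:
            cmd += [flag, value]
    # Boolean switches describing the input encoding.
    if binary:
        cmd.append("-b")
    if compressed:
        cmd.append("-c")
    return cmd


# Single-file mode: binary ASN.1 in, cleaned record out (hypothetical file names).
single = build_cleanasn_command(input_file="in.asn", output_file="out.asn", binary=True)

# Batch mode: every file ending in .ent under "entries", results under "cleaned".
batch = build_cleanasn_command(input_dir="entries", results_dir="cleaned", suffix=".ent")

print(shlex.join(single))
print(shlex.join(batch))
```

The resulting list can be handed to subprocess.run once the cleanup options you need have been appended. Building a list rather than a shell string avoids quoting problems with paths that contain spaces.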
The National Center for Biotechnology Information.