DESCRIPTION

This document illustrates some common formats used to represent sequences.

EMBL
 ID   MMVASPHOS  standard; RNA; EST; 140 BP.
 AC   X97897;
 DE   M.musculus mRNA for protein homologous to
 DE   vasodilator-stimulated phosphoprotein
 SQ   Sequence 140 BP; 25 A; 58 C; 39 G; 17 T; 1 other;
      ttctcccaga agctgactct atggngaccc cgagagagac tgagcagaac ctggagccag        60
      ccccgcaccc ctgcacttcc aatcaggggc gccccgggag cactccccgt ggcgcgccgc       120
      ccgccctccg cgcagccatg                                                   140
 //
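
The SQ line declares the sequence length and per-base totals, so a reader can recount the sequence lines and compare. The Python sketch below is one possible way to do that, not part of squizz: the record TEST1 and the function names parse_embl_sequence and observed_counts are invented for illustration, and only the ID/SQ/sequence/terminator layout shown above is assumed.

```python
import re
from collections import Counter

# Hypothetical 12 bp record, laid out like the EMBL entry above.
RECORD = """\
ID   TEST1  standard; DNA; UNC; 12 BP.
SQ   Sequence 12 BP; 3 A; 3 C; 3 G; 2 T; 1 other;
     acgtacgtac gn                                                           12
//
"""


def parse_embl_sequence(text):
    """Return (declared, sequence) for a single EMBL-style record.

    declared maps 'length', 'A', 'C', 'G', 'T' and 'other' to the
    integers found on the SQ line; sequence is the residue string.
    """
    declared, residues, in_sequence = {}, [], False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("SQ"):
            in_sequence = True
            length = re.search(r"Sequence\s+(\d+)\s+BP", stripped)
            if length:
                declared["length"] = int(length.group(1))
            for count, label in re.findall(r"(\d+)\s+(A|C|G|T|other);", stripped):
                declared[label] = int(count)
        elif stripped == "//":
            in_sequence = False
        elif in_sequence:
            # Drop the blanks between groups and the trailing position number.
            residues.append(re.sub(r"[\s\d]", "", stripped))
    return declared, "".join(residues).lower()


def observed_counts(sequence):
    """Count a, c, g, t and everything else in a lower-case sequence."""
    counts = Counter(sequence)
    result = {base.upper(): counts[base] for base in "acgt"}
    result["other"] = len(sequence) - sum(result.values())
    result["length"] = len(sequence)
    return result


declared, sequence = parse_embl_sequence(RECORD)
print("consistent" if declared == observed_counts(sequence) else "mismatch")
```

A mismatch usually means that sequence lines were truncated or that the SQ line was edited by hand.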

FASTA

 >MMVASPHOS
 ttctcccagaagctgactctatggngaccccgagagagactgagcagaacctggagccag
 ccccgcacccctgcacttccaatcaggggcgccccgggagcactccccgtggcgcgccgc
 ccgccctccgcgcagccatg
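
A FASTA entry is a header line starting with ">" followed by one or more sequence lines, so reading it amounts to grouping lines under the most recent header. The Python sketch below shows one minimal way to do this; read_fasta is a name chosen for this example, and taking the first word of the header as the sequence name is an assumption rather than a rule of the format.

```python
def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].strip()
            # Assumption: the name is the first word of the header.
            name = header.split()[0] if header else ""
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


SAMPLE = """\
>MMVASPHOS
ttctcccagaagctgactctatggngaccccgagagagactgagcagaacctggagccag
ccccgcacccctgcacttccaatcaggggcgccccgggagcactccccgtggcgcgccgc
ccgccctccgcgcagccatg
"""

for name, sequence in read_fasta(SAMPLE.splitlines()):
    print(name, len(sequence))
```

Because each header starts a new entry, the same loop handles files that hold several sequences.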

GCG

 !!NA_SEQUENCE 1.0
 (No documentation)
 dna1.txt  Length: 88  Nov 22, 2001 14:38  Type: N  Check: 3818  ..

        1  TAGTCGTAGT CGGAGCGATG CTGACGATGA CGATGACGAT CGTAGCTGAT

       51  CGATCGAGCT GATGCTGATC GAGCTAGCTG ATCGATCG

GDE

 #sample1
 TTCAAGAGAAACAGCGGCCAAGGAAAAGACTCGGCATGATTGTCCATAGCTTACAAAGCG
 #sample2
 TTCAAGAGAAACAGCGGCTGGGGGAAAGACTCGTCCTGATTGCCTGTAGATGGTAAAGCG

GENBANK

 LOCUS       HUMHBV1       130 bp    DNA             PRI       17-JUN-1993
 DEFINITION  Human DNA/endogenous Hepatitis B virus (HBV) DNA, left host viral junction.
 ACCESSION   M15770
 BASE COUNT       32 a     43 c     29 g     26 t
 ORIGIN
         1 agcgggcagt gcagctgctt ggacagcagg ggtgtttctt caacccaggc ctcctgtcac
        61 aacaggccca ttcaattctg aacctgcaag ccaactccaa cctcttttcc cagggggaac
       121 caaaaaccct
 //

IG

 ; comment
 U03518
 AACCTGCGGAAGGATCATTACCGAGTGCGGGTCCTTTGGGCCCAACCTCCCATCCGTGTC
 TATTGTACCCTGTTGCTTCGGCGGGCCCGCCGCTTGTCGGCCGCCGGGGGGGCGCCTCTG
 TGAGTTGATTGAATGCAATCAGTTAAAACTTTCAACAATGGATCTCTTGGTTCCGGC1

NBRF (pir)

 >P1;CCHU
 cytochrome c [validated] - human
 MGDVEKGKKIFIMKCSQCHTVEKGGKHKTGPNLHGLFGRKTGQAPGYSYTAANKNKGIIW
 GEDTLMEYLENPKKYIPGTKMIFVGIKKKEERADLIAYLKKATNE*

CODATA

 ENTRY           CCHU       #type complete
 TITLE           cytochrome c [validated] - human
 ACCESSIONS      A31764; A05676; I55192; A00001
 SUMMARY         #length 105  #molecular-weight 11749  #checksum 3247
 SEQUENCE
                 5        10        15        20        25        30
       1 M G D V E K G K K I F I M K C S Q C H T V E K G G K H K T G
      31 P N L H G L F G R K T G Q A P G Y S Y T A A N K N K G I I W
      61 G E D T L M E Y L E N P K K Y I P G T K M I F V G I K K K E
      91 E R A D L I A Y L K K A T N E
 ///

RAW

 ttctcccagaagctgactctatggngaccccgagagagactgagcagaacctggagccag
 ccccgcacccctgcacttccaatcaggggcgccccgggagcactccccgtggcgcgccgc
 ccgccctccgcgcagccatg

Warning: This format cannot handle more than one sequence per file.
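
The one-sequence limit follows from the layout: a raw file has no name line and no terminator, so nothing marks where a second sequence would begin. The short Python sketch below illustrates this; read_raw is a hypothetical helper written for this example, not a squizz function.

```python
def read_raw(text):
    """Return the single sequence held in raw-format text."""
    # Every non-blank line is sequence data, so all lines are joined.
    # Two sequences written one after the other would come back fused.
    return "".join(text.split())


first = "acgtacgt\nacgt\n"
second = "ttga\n"
print(read_raw(first))
print(read_raw(first + second))  # the boundary between the two is lost
```

Formats that carry a name or a terminator, such as FASTA or EMBL, are the usual choice when a file must hold several sequences.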

SWISSPROT

 ID   100K_RAT       STANDARD;      PRT;   149 AA.
 AC   Q62671;
 DE   100 kDa protein (EC 6.3.2.-).
 SQ   SEQUENCE   149 AA;  17004 MW;  D06484B8BC29112E CRC64;
      MMSARGDFLN YALSLMRSHN DEHSDVLPVL DVCSLKHVAY VFQALIYWIK PQLERKRTRE
      LLELGIDNED SEHENDDDTS QSATLNDKDD ESLPAETGQN SITIRPPDDQ HLPTANTCIS
      RLYVPLYSSK QILKQKLLLA IKTKNFGFV
 //

RELATED TO seqfmt…

squizz(1), alifmt(5)

AUTHOR

Nicolas Joly, Institut Pasteur.