Shred sequence file(s) into consecutive pieces of random length.
gt shredder [option ...] [sequence_file ...]
-coverage [value]
set the number of times the sequence_file is shredded (default: 1)
-minlength [value]
set the minimum length of the shredded fragments (default: 300)
-maxlength [value]
set the maximum length of the shredded fragments (default: 700)
-overlap [value]
set the overlap between consecutive pieces (default: 0)
-sample [value]
take samples of the generated sequence pieces with the given probability (default: 1.000000)
-width [value]
set output width for FASTA sequence printing (0 disables formatting) (default: 0)
-o [filename]
redirect output to specified file (default: undefined)
-gzip [yes|no]
write gzip compressed output file (default: no)
-bzip2 [yes|no]
write bzip2 compressed output file (default: no)
-force [yes|no]
force writing to output file (default: no)
-help
display help and exit
-version
display version information and exit
Each sequence given in sequence_file is shredded into consecutive pieces of random length (between -minlength and -maxlength) until it is consumed. As a result, the last fragment of a given sequence can be shorter than the argument to option -minlength. To get rid of such fragments, use gt seqfilter (see example below).
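The procedure described above can be illustrated with a short Python sketch. It is not the gt shredder implementation: the function name shred and its parameters are hypothetical, the piece length is assumed to be drawn uniformly from the given range, and the handling of overlap (the next piece starts overlap characters before the end of the previous one) is an assumption, since the text above does not specify it.

```python
import random


def shred(sequence, minlength=300, maxlength=700, overlap=0, rng=None):
    """Yield consecutive pieces of random length until the sequence is consumed.

    Hypothetical sketch, not the gt shredder source. Assumes a uniform
    length distribution and that each piece starts `overlap` characters
    before the end of the previous piece.
    """
    if not 0 < minlength <= maxlength:
        raise ValueError("need 0 < minlength <= maxlength")
    if not 0 <= overlap < minlength:
        raise ValueError("need 0 <= overlap < minlength")
    rng = rng or random.Random()
    pos = 0
    while pos < len(sequence):
        length = rng.randint(minlength, maxlength)
        # Slicing truncates at the end of the sequence, so the last
        # piece can be shorter than minlength.
        yield sequence[pos:pos + length]
        if pos + length >= len(sequence):
            break
        pos += length - overlap
```

With overlap 0, concatenating the pieces gives back the input sequence, and only the final piece can fall below minlength, which is why a subsequent filtering step may be needed.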
Shred a given BAC:
$ gt shredder U89959_genomic.fas > fragments.fas
Shred an EST collection into pieces between 50 and 100 bp and get rid of all (terminal) fragments shorter than 50 bp:
$ gt shredder -minlength 50 -maxlength 100 U89959_ests.fas \
  | gt seqfilter -minlength 50 - > fragments.fas
# 130 out of 1260 sequences have been removed (10.317%)
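As a rough illustration of the filtering step in this example, the following Python sketch drops fragments below a minimum length and formats a summary line in the style of the message shown above. The function names drop_short and removal_summary are hypothetical and this is not how gt seqfilter is implemented; the sketch only mirrors the arithmetic behind the reported percentage.

```python
def drop_short(fragments, minlength):
    """Return (kept, removed_count), keeping fragments of at least minlength."""
    kept = [f for f in fragments if len(f) >= minlength]
    return kept, len(fragments) - len(kept)


def removal_summary(removed, total):
    """Format a summary line modelled on the message in the example above."""
    percent = 100.0 * removed / total
    return f"# {removed} out of {total} sequences have been removed ({percent:.3f}%)"
```

For the numbers in the example, 130 removed out of 1260 corresponds to 100 * 130 / 1260, which rounds to 10.317%.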
Shred an EST collection and show only a random 10% of the resulting fragments:
$ gt shredder -sample 0.1 U89959_ests.fas
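One plausible reading of sampling "with the given probability" is that each fragment is kept independently with that probability. The Python sketch below shows this interpretation; it is an assumption rather than a description of the gt shredder source, and the name sample_fragments is hypothetical.

```python
import random


def sample_fragments(fragments, probability=1.0, rng=None):
    """Keep each fragment independently with the given probability.

    Assumed interpretation of -sample, not taken from the gt source.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    rng = rng or random.Random()
    # rng.random() lies in [0, 1), so probability 1.0 keeps everything
    # and probability 0.0 keeps nothing.
    return [f for f in fragments if rng.random() < probability]
```

Under this reading, a probability of 0.1 returns about 10% of the fragments on average, not exactly 10% on every run.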
Report bugs to <[email protected]>.