SYNOPSIS

filterdup <-t file> [-o outputfile] [-g genomesize] [options]

DESCRIPTION

filterdup -- Filter duplicate reads the same way MACS does. This script can also convert ELAND result, ELAND multi, ELAND export, SAM, BAM, and BOWTIE map formats to BED format.

OPTIONS

--version

Show program's version number and exit.

-h, --help

Show this help message and exit.

-t TFILE

Sequencing alignment file. REQUIRED.

-o OUTPUTFILE

Output BED file name. If not specified, output is written to standard output. DEFAULT: stdout

-f FORMAT, --format=FORMAT

Format of the tag file: "AUTO", "BED", "ELAND", "ELANDMULTI", "ELANDEXPORT", "SAM", "BAM" or "BOWTIE". The default AUTO option lets filterdup detect the file format. Please check the definition in the 00README file if you choose ELAND/ELANDMULTI/ELANDEXPORT/SAM/BAM/BOWTIE. DEFAULT: "AUTO"

-g GSIZE, --gsize=GSIZE

Effective genome size. It can be a number such as 1.0e+9 or 1000000000, or one of the shortcuts: 'hs' for human (2.7e9), 'mm' for mouse (1.87e9), 'ce' for C. elegans (9e7) and 'dm' for fruitfly (1.2e8). DEFAULT: hs
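The following is a minimal Python sketch of how a -g value could be resolved to a number, assuming only the shortcut values listed above. The helper name parse_gsize and the table GSIZE_SHORTCUTS are hypothetical and are not taken from the filterdup source.

```python
# Shortcut values copied from the -g description above.
GSIZE_SHORTCUTS = {"hs": 2.7e9, "mm": 1.87e9, "ce": 9e7, "dm": 1.2e8}


def parse_gsize(value="hs"):
    """Resolve a -g argument to a float (hypothetical helper, not filterdup's code)."""
    if value in GSIZE_SHORTCUTS:
        return GSIZE_SHORTCUTS[value]
    size = float(value)  # accepts both "1.0e+9" and "1000000000"
    if size <= 0:
        raise ValueError("effective genome size must be positive")
    return size


print(parse_gsize("mm"))
print(parse_gsize("1.0e+9") == parse_gsize("1000000000"))
```

Anything that is neither a known shortcut nor a valid number raises ValueError in this sketch; how filterdup itself reports such input is not specified here.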

-s TSIZE, --tsize=TSIZE

Tag size. This will override the auto-detected tag size. DEFAULT: Not set

-p PVALUE, --pvalue=PVALUE

P-value cutoff for the binomial distribution test. DEFAULT: 1e-5

--keep-dup=KEEPDUPLICATES

Controls how filterdup treats duplicate tags at the exact same location, that is, the same coordinates and the same strand. The default 'auto' option makes filterdup calculate the maximum number of tags allowed at the exact same location, based on a binomial distribution with the given -p as the p-value cutoff. The 'all' option keeps every tag (useful if you only want to convert formats). If an integer is given, at most this number of tags will be kept at the same location. DEFAULT: auto
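As an illustration of the 'auto' behaviour, here is a minimal Python sketch. It assumes that each tag lands on a given position with probability 1/genome size, and takes the limit to be the smallest count k whose cumulative binomial probability reaches 1 - pvalue. The function names max_dup_tags and filter_duplicates, the floor of one tag per location, and the tuple layout for tags are all assumptions made for this example; the actual MACS implementation may differ.

```python
import math
from collections import Counter


def max_dup_tags(genome_size, total_tags, pvalue=1e-5):
    """Smallest k >= 1 with P(X <= k) >= 1 - pvalue for
    X ~ Binomial(total_tags, 1 / genome_size). Hypothetical helper."""
    if genome_size <= 1:
        raise ValueError("genome_size must be greater than 1")
    q = 1.0 / genome_size
    pmf = math.exp(total_tags * math.log1p(-q))  # P(X = 0), computed in log space
    cdf = pmf
    k = 0
    while cdf < 1.0 - pvalue and k < total_tags:
        # Recurrence: P(X = k + 1) = P(X = k) * (n - k) / (k + 1) * q / (1 - q)
        pmf *= (total_tags - k) / (k + 1) * q / (1.0 - q)
        k += 1
        cdf += pmf
    return max(1, k)  # assumption: always keep at least one tag per location


def filter_duplicates(tags, keep_dup):
    """Keep at most keep_dup tags per (chrom, position, strand).
    keep_dup=None plays the role of 'all' and keeps every tag."""
    if keep_dup is None:
        return list(tags)
    seen = Counter()
    kept = []
    for tag in tags:
        seen[tag] += 1
        if seen[tag] <= keep_dup:
            kept.append(tag)
    return kept


tags = [("chr1", 100, "+"), ("chr1", 100, "+"), ("chr1", 100, "-"), ("chr1", 100, "+")]
limit = max_dup_tags(2.7e9, len(tags))
print(limit, filter_duplicates(tags, limit))
```

Tags on opposite strands at the same coordinate count as different locations in this sketch, matching the definition of a duplicate given above.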

--verbose=VERBOSE

Set the verbosity level. 0: only show critical messages, 1: show additional warning messages, 2: show process information, 3: show debug messages. To see where the duplicate reads are, use 3. DEFAULT: 2