Find significantly mutated pathways in a cohort given a list of somatic mutations.
This document describes gmt music path-scan version 0.04 (2013-05-14 at 16:03:05)
gmt music path-scan --gene-covg-dir=? --bam-list=? --pathway-file=? --maf-file=? --output-file=? [--bmr=?] [--genes-to-ignore=?] [--min-mut-genes-per-path=?] [--skip-non-coding] [--skip-silent]
... music path-scan \
    --bam-list input_dir/bam_file_list \
    --gene-covg-dir output_dir/gene_covgs/ \
    --maf-file input_dir/myMAF.tsv \
    --output-file output_dir/sm_pathways \
    --pathway-file input_dir/pathway_dbs/KEGG.txt \
    --bmr 8.7E-07
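When the command is launched from a script, it can help to assemble the argument list programmatically. The following is a minimal sketch, not part of the tool: the helper name build_path_scan_command is hypothetical, and it uses only the options listed in the synopsis, written in the --option=value form shown there.

```python
def build_path_scan_command(gene_covg_dir, bam_list, pathway_file, maf_file,
                            output_file, bmr=None, genes_to_ignore=None,
                            min_mut_genes_per_path=None):
    """Return an argv list for 'gmt music path-scan' (hypothetical helper)."""
    argv = [
        "gmt", "music", "path-scan",
        f"--gene-covg-dir={gene_covg_dir}",
        f"--bam-list={bam_list}",
        f"--pathway-file={pathway_file}",
        f"--maf-file={maf_file}",
        f"--output-file={output_file}",
    ]
    # Optional arguments are appended only when the caller supplies them,
    # so the tool's own defaults apply otherwise.
    if bmr is not None:
        argv.append(f"--bmr={bmr}")
    if genes_to_ignore:
        # The option takes a comma-delimited list of gene names.
        argv.append("--genes-to-ignore=" + ",".join(genes_to_ignore))
    if min_mut_genes_per_path is not None:
        argv.append(f"--min-mut-genes-per-path={min_mut_genes_per_path}")
    return argv


# Same paths as the example invocation above.
argv = build_path_scan_command(
    gene_covg_dir="output_dir/gene_covgs/",
    bam_list="input_dir/bam_file_list",
    pathway_file="input_dir/pathway_dbs/KEGG.txt",
    maf_file="input_dir/myMAF.tsv",
    output_file="output_dir/sm_pathways",
    bmr="8.7E-07",
)
print(" ".join(argv))
```

The resulting list can be passed to subprocess.run(argv, check=True) on a machine where the gmt command is installed.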
Directory containing per-gene coverage files (Created using music bmr calc-covg)
Tab-delimited list of BAM files [sample_name, normal_bam, tumor_bam] (See Description)
Tab-delimited file of pathway information (See Description)
List of mutations using TCGA MAF specifications v2.3
Output file that will list the significant pathways and their p-values
Background mutation rate in the targeted regions. Default value '1e-06' if not specified.
Comma-delimited list of genes whose mutations should be ignored
Pathways with fewer mutated genes than this will be ignored. Default value '1' if not specified.
Skip non-coding mutations from the provided MAF file. Default value 'true' if not specified.
Skip silent mutations from the provided MAF file. Default value 'true' if not specified.
Only the following four columns in the MAF are used. All other columns may be left blank.
Col 1: Hugo_Symbol (Need not be HUGO, but must match gene names used in the pathway file)
Col 2: Entrez_Gene_Id (Matching Entrez IDs trump gene name matches between pathway file and MAF)
Col 9: Variant_Classification
Col 16: Tumor_Sample_Barcode (Must match the name in sample-list, or contain it as a substring)
The Entrez_Gene_Id can also be left blank (or set to 0), but providing it is highly recommended, in case genes are named differently in the pathway file and the MAF file.
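The column usage described above can be sketched in Python. This is an illustrative reading of the text, not the tool's own code: the function names read_maf_records and match_sample are hypothetical, header and comment handling is an assumption, and the sample barcode in the example is made up.

```python
import io

# 1-based MAF columns 1, 2, 9 and 16 from the text, as 0-based indexes.
HUGO_SYMBOL, ENTREZ_GENE_ID, VARIANT_CLASSIFICATION, TUMOR_SAMPLE_BARCODE = 0, 1, 8, 15


def read_maf_records(handle):
    """Yield (gene, entrez_id, variant_class, barcode) for each MAF data row."""
    for line in handle:
        if not line.strip() or line.startswith("#") or line.startswith("Hugo_Symbol"):
            continue  # assumption: skip blank lines, comments and the header row
        cols = line.rstrip("\n").split("\t")
        cols += [""] * (16 - len(cols))  # tolerate rows shorter than 16 columns
        entrez = cols[ENTREZ_GENE_ID].strip()
        # A blank Entrez_Gene_Id is treated the same as 0 (unknown).
        entrez_id = int(entrez) if entrez.isdigit() else 0
        yield (cols[HUGO_SYMBOL], entrez_id,
               cols[VARIANT_CLASSIFICATION], cols[TUMOR_SAMPLE_BARCODE])


def match_sample(barcode, sample_names):
    """Return the sample name equal to, or contained in, the barcode (else None)."""
    if barcode in sample_names:
        return barcode
    for name in sample_names:
        if name in barcode:
            return name
    return None


def make_row(gene, entrez, variant_class, barcode):
    """Build a 16-column MAF row with every unused column left blank."""
    cols = [""] * 16
    cols[0], cols[1], cols[8], cols[15] = gene, entrez, variant_class, barcode
    return "\t".join(cols)


# Hypothetical two-row MAF; the second row leaves Entrez_Gene_Id blank.
maf_text = "\n".join([
    make_row("FASN", "2194", "Missense_Mutation", "TCGA-XX-0001-01A"),
    make_row("ACACA", "", "Silent", "TCGA-XX-0001-01A"),
]) + "\n"

records = list(read_maf_records(io.StringIO(maf_text)))
for gene, entrez_id, variant_class, barcode in records:
    print(gene, entrez_id, variant_class, match_sample(barcode, ["TCGA-XX-0001"]))
```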
For example, a line in the pathway-file would look like:

    hsa00061 Fatty acid biosynthesis Lipid Metabolism 31:ACACA|32:ACACB|27349:MCAT|2194:FASN|54995:OXSM|55301:OLAH

Ensure that the gene names and Entrez IDs used match those used in the MAF file. Entrez IDs are not mandatory (use a 0 if the Entrez ID is unknown). But if a gene name in the MAF does not match any gene name in this file, the Entrez IDs are used to find a match (unless it's a 0).
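The example line and the matching rule can be sketched as follows. This is a hedged illustration rather than the tool's implementation: it assumes the first four tab-separated fields are the pathway ID, name, class, and a "|"-separated list of EntrezID:GeneName pairs, as the example line suggests, and the function names parse_pathway_line and gene_in_pathway are hypothetical.

```python
def parse_pathway_line(line):
    """Split one pathway-file line into (id, name, class, [(entrez_id, gene), ...])."""
    fields = line.rstrip("\n").split("\t")
    path_id, path_name, path_class, gene_field = fields[:4]
    genes = []
    for entry in gene_field.split("|"):
        entrez, _, name = entry.partition(":")
        genes.append((int(entrez), name))
    return path_id, path_name, path_class, genes


def gene_in_pathway(maf_gene, maf_entrez, genes):
    """Match by gene name first; fall back to a non-zero Entrez ID."""
    if maf_gene in {name for _, name in genes}:
        return True
    if maf_entrez != 0:
        # An Entrez ID of 0 means "unknown" on either side and never matches.
        return maf_entrez in {entrez for entrez, _ in genes if entrez != 0}
    return False


# The example line from the text, with tabs assumed between the four fields.
line = ("hsa00061\tFatty acid biosynthesis\tLipid Metabolism\t"
        "31:ACACA|32:ACACB|27349:MCAT|2194:FASN|54995:OXSM|55301:OLAH")
path_id, path_name, path_class, genes = parse_pathway_line(line)

print(path_id, len(genes))
print(gene_in_pathway("FASN", 0, genes))    # name matches, no Entrez ID needed
print(gene_in_pathway("FAS", 2194, genes))  # name differs, Entrez ID matches
print(gene_in_pathway("TP53", 0, genes))    # no name match and no usable ID
```

Falling back to the Entrez ID only when the name fails follows the last sentence above; a tool that gives Entrez IDs priority would check them first instead.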
Michael Wendl, Ph.D.
This module uses reformatted copies of data from the Kyoto Encyclopedia of Genes and Genomes (KEGG) database:
* KEGG - http://www.genome.jp/kegg/