bedtools

The bedtools utilities are a Swiss-army knife of tools for a wide range of genomics analysis tasks. The most widely used tools enable genome arithmetic: that is, set theory on the genome. For example, bedtools allows one to intersect, merge, count, complement, and shuffle genomic intervals from multiple files in widely used genomic file formats such as BAM, BED, GFF/GTF, and VCF. While each individual tool is designed to do a relatively simple task (e.g., intersect two interval files), quite sophisticated analyses can be conducted by combining multiple bedtools operations on the UNIX command line.
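As a rough illustration of combining operations, the sketch below chains intersect, sort, and merge in a single shell pipeline. The file names (a.bed, b.bed, shared.bed), the temporary directory, and the toy coordinates are invented for this example; only the subcommands and the -a, -b, and -i options come from bedtools itself.

```shell
#!/usr/bin/env bash
# Toy inputs (hypothetical data): tab-separated chrom, start, end.
workdir=$(mktemp -d)
printf 'chr1\t100\t200\nchr1\t150\t300\nchr1\t500\t600\n' > "$workdir/a.bed"
printf 'chr1\t180\t250\nchr1\t550\t700\n' > "$workdir/b.bed"

if command -v bedtools >/dev/null 2>&1; then
    # 1. intersect: report the portions of A intervals that overlap B
    # 2. sort: order the result by chromosome, then by start position
    # 3. merge: collapse overlapping pieces into single intervals
    bedtools intersect -a "$workdir/a.bed" -b "$workdir/b.bed" \
        | bedtools sort -i stdin \
        | bedtools merge -i stdin > "$workdir/shared.bed"
    cat "$workdir/shared.bed"
else
    echo "bedtools not found on PATH; skipping the pipeline" >&2
fi
```

With these toy inputs the pipeline should report two merged regions on chr1 (180-250 and 550-600), i.e. the parts of a.bed that are also covered by b.bed. Each step does one simple thing; the analysis comes from the chaining.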

bedtools: flexible tools for genome arithmetic and DNA sequence analysis.

usage:    bedtools <subcommand> [options]

 

The bedtools sub-commands include:

 

[ Genome arithmetic ]

    intersect     Find overlapping intervals in various ways.

    window        Find overlapping intervals within a window around an interval.

    closest       Find the closest, potentially non-overlapping interval.

    coverage      Compute the coverage over defined intervals.

    map           Apply a function to a column for each overlapping interval.

    genomecov     Compute the coverage over an entire genome.

    merge         Combine overlapping/nearby intervals into a single interval.

    cluster       Cluster (but don't merge) overlapping/nearby intervals.

    complement    Extract intervals _not_ represented by an interval file.

    subtract      Remove intervals based on overlaps b/w two files.

    slop          Adjust the size of intervals.

    flank         Create new intervals from the flanks of existing intervals.

    sort          Order the intervals in a file.

    random        Generate random intervals in a genome.

    shuffle       Randomly redistribute intervals in a genome.

    annotate      Annotate coverage of features from multiple files.

 

[ Multi-way file comparisons ]

    multiinter    Identifies common intervals among multiple interval files.

    unionbedg     Combines coverage intervals from multiple BEDGRAPH files.

 

[ Paired-end manipulation ]

    pairtobed     Find pairs that overlap intervals in various ways.

    pairtopair    Find pairs that overlap other pairs in various ways.

 

[ Format conversion ]

    bamtobed      Convert BAM alignments to BED (& other) formats.

    bedtobam      Convert intervals to BAM records.

    bamtofastq    Convert BAM records to FASTQ records.

    bedpetobam    Convert BEDPE intervals to BAM records.

    bed12tobed6   Breaks BED12 intervals into discrete BED6 intervals.

 

[ Fasta manipulation ]

    getfasta      Use intervals to extract sequences from a FASTA file.

    maskfasta     Use intervals to mask sequences from a FASTA file.

    nuc           Profile the nucleotide content of intervals in a FASTA file.

 

[ BAM focused tools ]

    multicov      Counts coverage from multiple BAMs at specific intervals.

    tag           Tag BAM alignments based on overlaps with interval files.

 

[ Miscellaneous tools ]

    overlap       Computes the amount of overlap from two intervals.

    igv           Create an IGV snapshot batch script.

    links         Create an HTML page of links to UCSC locations.

    makewindows   Make interval "windows" across a genome.

    groupby       Group by common cols. & summarize oth. cols. (~ SQL "groupBy")

    expand        Replicate lines based on lists of values in columns.

 

[ General help ]

    --help        Print this help menu.

    --version     What version of bedtools are you using?

    --contact     Feature requests, bugs, mailing lists, etc.

https://github.com/arq5x/bed

http://bedtools.readthedocs.org/en/latest/

posted @ 2014-08-15 15:07  skylinelzy