Welcome to gUtils’s documentation!
Introduction
Range Operations
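This section describes the additional GRanges operations that gUtils provides.
## load data.table before GenomicRanges so that shift() resolves to the GRanges method
library(data.table)
library(GenomicRanges)
library(gUtils)
## make some example data sets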
ref19 <- readRDS(system.file("extdata","refGene.hg19.gr.rds", package="gUtils"))
gr <- GRanges(1, IRanges(c(2,5,10), c(4,9,16)), seqinfo=Seqinfo("1", 20))
gr2 <- c(gr, GRanges(1, IRanges(c(1,9), c(6,14)), seqinfo=Seqinfo("1", 20)))
dt <- data.table(seqnames=1, start=c(2,5,10), end=c(3,8,15))
shift(gr, 2)
flank(gr, width=2)
gr.start(gr, width=3)
gr.end(gr, width=3)
gr.mid(gr) + 2
grbind
## add metadata to one field
mcols(gr)$score = 3
## try to concatenate
c(gr,gr2) ## ERROR
## with grbind
grbind(gr, gr2) ## SUCCESS. Adds NA for missing fields
## GenomicRanges::c does this already for GRangesList
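The NA-filling behaviour noted above can be imitated by hand with plain GenomicRanges. The following is only a minimal sketch of the idea, not the grbind implementation; the object names `a`, `b` and `combined` are chosen for illustration.

```r
library(GenomicRanges)

# Two range sets on the same reference; only `a` carries a metadata field
a <- GRanges("1", IRanges(c(2, 5, 10), c(4, 9, 16)), seqinfo = Seqinfo("1", 20))
b <- GRanges("1", IRanges(c(1, 9), c(6, 14)), seqinfo = Seqinfo("1", 20))
mcols(a)$score <- 3

# Give `b` the missing field, filled with NA, so both objects share the same columns
mcols(b)$score <- rep(NA_real_, length(b))

# With matching metadata columns, plain concatenation works
combined <- c(a, b)
```

Doing this by hand for every missing field quickly becomes tedious, which is the convenience a function like grbind is meant to provide.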
streduce(gr2)
gr.sample(gr2, 2, len=2, replace=TRUE)
## output GRanges
GRanges object with 3 ranges and 1 metadata column:
      seqnames    ranges strand |  query.id
         <Rle> <IRanges>  <Rle> | <integer>
  [1]        1  [ 8,  9]      * |         2
  [2]        1  [ 5,  6]      * |         2
  [3]        1  [11, 12]      * |         3
gr.rand(w=c(2,5,3), seqinfo(gr))
gr.simplify
gr.tile(GRanges(1, IRanges(1,9)), w=3) + 1
gr.refactor
gr.tile.map
gr.round
Data manipulation
R provides a number of data structures for storing genomic data, each with its advantages and drawbacks.
The most useful structures for this purpose are:
- GRanges: stores ranges along with metadata, sequences and the coordinates of the reference genome.
- GRangesList: stores groups of ranges, with additional metadata belonging to each group.
- data.table: a fast and efficient general-purpose container, similar to data.frame but with significant performance improvements.
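As a rough illustration of the three containers, the sketch below builds one of each from the same toy intervals using only GenomicRanges and data.table. The metadata values, group names and the `label` field are made up for the example.

```r
library(data.table)
library(GenomicRanges)

# GRanges: ranges plus per-range metadata and reference information (seqinfo)
gr <- GRanges("1", IRanges(c(2, 5, 10), c(4, 9, 16)), seqinfo = Seqinfo("1", 20))
mcols(gr)$score <- c(0.1, 0.5, 0.9)

# GRangesList: named groups of ranges, with metadata attached to each group
grl <- GRangesList(geneA = gr[1:2], geneB = gr[3])
mcols(grl)$label <- c("first", "second")

# data.table: the same intervals as plain columns, with no genomic semantics attached
dt <- data.table(seqnames = "1", start = c(2, 5, 10), end = c(4, 9, 16))
```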
gUtils functions often convert data between these structures when one is better suited to a task than another. A key example is gr.findoverlaps, which converts the input GRanges into data.table objects to take advantage of the fast foverlaps function from data.table. For the most part, these conversions should be invisible to the user.
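To make the idea concrete, here is a minimal sketch of an overlap query done by converting GRanges to data.table and calling foverlaps. It is not the gr.findoverlaps source; the helper `to_dt` and the object names are hypothetical, and strand is ignored.

```r
library(data.table)
library(GenomicRanges)

query   <- GRanges("1", IRanges(c(2, 5, 10), c(4, 9, 16)))
subject <- GRanges("1", IRanges(c(1, 9), c(6, 14)))

# Hypothetical helper: keep only the columns foverlaps needs
to_dt <- function(g) {
  data.table(seqnames = as.character(seqnames(g)),
             start = start(g),
             end = end(g))
}

qdt <- to_dt(query)
sdt <- to_dt(subject)

# foverlaps requires a key on the second table whose last two columns are the interval.
# setkey sorts sdt; the subject here is already sorted, so row indices are unchanged.
setkey(sdt, seqnames, start, end)

# which = TRUE returns index pairs (xid, yid); nomatch = 0L drops queries with no hit
hits <- foverlaps(qdt, sdt,
                  by.x = c("seqnames", "start", "end"),
                  type = "any", which = TRUE, nomatch = 0L)
```

On this toy input the index pairs in `hits` can be checked against `findOverlaps(query, subject)` from GenomicRanges.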
Some conversions between data structures are nevertheless useful to the end user. These include unlisting GRangesList objects into GRanges, making data.table objects from GRanges, and binding together multiple GRanges or GRangesList objects, among others. This section describes and demonstrates the functionality gUtils provides for manipulating these data structures.
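For orientation, the conversions named above can also be written with plain GenomicRanges and data.table calls. The sketch below uses those base equivalents rather than the gUtils functions, whose signatures are not shown in this section; the object names are illustrative.

```r
library(data.table)
library(GenomicRanges)

gr  <- GRanges("1", IRanges(c(2, 5, 10), c(4, 9, 16)), seqinfo = Seqinfo("1", 20))
grl <- GRangesList(a = gr[1:2], b = gr[3])

# GRangesList -> GRanges: unlist() flattens the groups; group names become range names
flat <- unlist(grl)

# GRanges -> data.table, going through a data.frame
gr_dt <- as.data.table(as.data.frame(gr))

# data.table -> GRanges; the seqinfo is passed back in explicitly
back <- makeGRangesFromDataFrame(as.data.frame(gr_dt), seqinfo = seqinfo(gr))

# Binding objects that have identical metadata columns works with plain c()
both <- c(gr, unname(flat))
```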
ref19 <- readRDS(system.file("extdata","refGene.hg19.gr.rds", package="gUtils"))
gr <- GRanges(1, IRanges(c(2,5,10), c(4,9,16)), seqinfo=Seqinfo("1", 20))
dt <- data.table(seqnames=1, start=c(2,5,10), end=c(3,8,15))
grbind
grlbind
dtgr
grdt
si2gr
gr2gatk
gr.flatten
gr.flatmap
grl.split
grl.stripnames
grl.unlist
grl.span
grl.pivot
rrbind
Additional utilities
gr.string and grl.string
parse.grl and parse.gr
gr.gatk
gr.chr and gr.nochr
gr.fix
gr.tostring
affine.map
gr.findoverlaps
gr.match
gr.in
gr.duplicated
gr.val
gr.dist
alpha
grl.in
chunk
import.ucsc