Bins the ranges of an scMethrix object.

bin_scMethrix(
  scm = NULL,
  regions = NULL,
  bin_size = 1e+05,
  bin_by = c("bp", "cpg"),
  trans = NULL,
  overlap_type = c("within", "start", "end", "any", "equal"),
  h5_dir = NULL,
  verbose = TRUE,
  batch_size = 20,
  n_threads = 1,
  replace = FALSE
)

Arguments

scm

scMethrix; the single cell methylation experiment

regions

GRanges; The regions from which to make the bins.

bin_size

integer; The size of each bin, in the units set by bin_by. The first bin begins at the start position of the first genomic region on the chromosome. If NULL, there will be one bin per region. Default 100000.

bin_by

character; whether to create bins by number of base pairs ("bp") or by number of CpG sites ("cpg"). Default "bp"

trans

named vector of closures; The transform for each assay, named by assay. Default NULL, meaning that the operation for the "counts" assay is sum(x, na.rm=TRUE), and for all other assays is mean(x, na.rm=TRUE)

overlap_type

character; defines how the CpG sites must overlap the target region. Default "within". For a detailed description, see the findOverlaps function of the IRanges package.

h5_dir

string; The directory in which to save the HDF5 output. Will be created if it does not exist. Default = NULL

verbose

boolean; Flag for outputting function status messages. Default = TRUE

batch_size

integer; The maximum number of elements to process at once. Default = 20

n_threads

integer; Maximum number of parallel instances. Default = 1

replace

boolean; Flag for whether to delete the contents of h5_dir before saving. Default = FALSE
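
As a rough illustration of how bin_size and bin_by interact, the base R sketch below assigns a handful of CpG sites to bins in both modes. The positions, the region start, and the variable names are made up for the illustration, and the arithmetic is an assumption about the general idea (fixed-width bins counted from the region start for "bp", a fixed number of sites per bin for "cpg"), not the package's internal code.

```r
# Hypothetical CpG positions inside a single region (not package data)
cpg_pos <- c(5, 40, 120, 130, 260, 310, 480)
region_start <- 1

# bin_by = "bp": fixed-width bins, the first one starting at the region start
bin_size_bp <- 100
bp_bin <- (cpg_pos - region_start) %/% bin_size_bp + 1

# bin_by = "cpg": a fixed number of CpG sites per bin, in positional order
bin_size_cpg <- 3
cpg_bin <- (seq_along(cpg_pos) - 1) %/% bin_size_cpg + 1

# Compare how many sites land in each bin under the two schemes
print(table(bp_bin))
print(table(cpg_bin))
```

With "bp" the bins cover equal genomic width but hold varying numbers of sites; with "cpg" every bin except possibly the last holds the same number of sites but covers a varying width.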

Value

An scMethrix object

Details

Uses the supplied function to transform each assay in the scMethrix object. Typically, an assay will use either mean (for measurements) or sum (for counts). The transform is applied column-wise to optimize how HDF5 files access sample data. If HDF5 objects are used, transform functions should be from DelayedMatrixStats.

In the output object, the number of CpGs in each bin is saved in mcols(scm)$n_cpgs.

Reduced dimensionality data will be discarded.
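
The column-wise transform described above can be pictured with a small in-memory sketch in base R. The matrices, the bin assignment, and the helper bin_assay are hypothetical and exist only for this illustration; the sketch mirrors the documented defaults (sum for "counts", mean for other assays, both with na.rm = TRUE) but is not the package's implementation, and for HDF5-backed data the transforms would come from DelayedMatrixStats instead.

```r
# Hypothetical bin assignment for 6 CpG sites (3 sites per bin)
bin_id <- c(1, 1, 1, 2, 2, 2)

# Hypothetical 6 x 2 site-by-sample assays, filled column by column
score <- matrix(c(0, 100, NA, 50, 50, 80,
                  100, NA, NA, 0, 30, 60),
                ncol = 2, dimnames = list(NULL, c("cell_a", "cell_b")))
counts <- matrix(c(2, 1, 0, 3, 1, 1,
                   1, 0, 0, 2, 2, 4),
                 ncol = 2, dimnames = list(NULL, c("cell_a", "cell_b")))

# Named collection of closures, one per assay, matching the documented defaults
trans <- c(counts = function(x) sum(x, na.rm = TRUE),
           score  = function(x) mean(x, na.rm = TRUE))

# Hypothetical helper: collapse the sites of each bin, one sample (column) at a time
bin_assay <- function(mat, fun) {
  apply(mat, 2, function(col) tapply(col, bin_id, fun))
}

binned_score  <- bin_assay(score,  trans[["score"]])
binned_counts <- bin_assay(counts, trans[["counts"]])

# Number of CpG sites that went into each bin
n_cpgs <- as.vector(table(bin_id))

print(binned_score)
print(binned_counts)
print(n_cpgs)
```

Each output row is one bin and each column is one sample, so missing scores within a bin are skipped rather than propagated.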

Examples

data('scMethrix_data')
regions <- GRanges(seqnames = c("chr1"), ranges = IRanges(1,200000000))
regions <- unlist(tile(regions,10))
bin_scMethrix(scMethrix_data, regions = regions)
#> Binning experiment...
#> Subsetting CpG sites...
#> Subsetting by regions
#> Subset in 0.15s
#> Checking 10 input regions...
#> Generated 131 bins in 0.35s
#> Filling bins for the score assay...
#> Bins filled in 1.11s
#> Filling bins for the counts assay...
#> Bins filled in 0.04s
#> Rebuilding experiment...
#> Experiment binned for 10 regions containing 131 total bins in 1.74s
#> An object of class scMethrix
#> n_CpGs: 131
#> n_samples: 4
#> assays: score, counts
#> reduced dims:
#> is_h5: FALSE
#> Reference: hg19
#> Physical size: 37.5 Kb