Masks CpGs by coverage

mask_by_coverage(
  scm = NULL,
  assay = "score",
  low_threshold = NULL,
  avg_threshold = NULL,
  n_threads = 1,
  verbose = TRUE
)

Arguments

scm

scMethrix; the single cell methylation experiment

assay

string; name of an existing assay. Default = "score"

low_threshold

numeric; the minimum coverage allowed. Sites with coverage below this value are masked. If NULL, this threshold is ignored.

avg_threshold

numeric; the maximum average coverage allowed. Sites with a higher average are masked. A typical value is 2, as there should only be 2 reads per cell. If NULL, this threshold is ignored.

n_threads

integer; Number of parallel instances. Can only be used if scMethrix is in HDF5 format. Default = 1

verbose

boolean; Flag for outputting function status messages. Default = TRUE

Value

An object of class scMethrix

Details

Takes an scMethrix object and masks sites with low overall coverage or high average coverage by setting their assay values to NA. Masked sites remain in the object, and all assays are affected.

low_threshold masks sites with low overall coverage. avg_threshold masks sites with aberrantly high counts. For single-cell data, these are typically CpG sites with an average count > 2, as there are only two strands in a cell to sequence.
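The two thresholds can be illustrated on plain matrices. The following is a minimal base-R sketch, not the package's implementation: it assumes that "overall coverage" is the sum of counts across cells and that the average is taken over cells with non-missing counts. The toy `counts` and `score` matrices and all variable names are hypothetical.

```r
# Toy data (hypothetical): rows are CpG sites, columns are cells.
counts <- matrix(
  c(1, NA, 0, NA,    # total coverage 1            -> low coverage
    2,  1, 1,  2,    # total 6, average 1.5        -> kept
    5,  4, NA, 6,    # average 5                   -> high average coverage
    1,  1, NA, NA),  # total 2, average 1          -> kept
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("cpg", 1:4), paste0("cell", 1:4))
)
score <- matrix(0.5, nrow = 4, ncol = 4, dimnames = dimnames(counts))

low_threshold <- 2
avg_threshold <- 2

# Assumption: overall coverage is the row sum, ignoring missing values.
total_cov <- rowSums(counts, na.rm = TRUE)
# Assumption: average coverage is the row mean over non-missing cells.
avg_cov <- rowMeans(counts, na.rm = TRUE)

low_idx  <- total_cov < low_threshold   # sites with too little coverage
high_idx <- avg_cov > avg_threshold     # sites with aberrantly high counts
mask <- low_idx | high_idx

# Masked sites stay in the matrices; their values become NA in every assay.
score[mask, ]  <- NA
counts[mask, ] <- NA
```

In this sketch, cpg1 is masked by the low threshold and cpg3 by the average threshold, while the row count of both matrices is unchanged.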

Examples

data('scMethrix_data')
mask_by_coverage(scMethrix_data, low_threshold = 2, avg_threshold = 2)
#> Masking by coverage...
#> Finding low coverage CpG sites...
#> Found 54 CpGs with coverage < 2
#> Finding high average count CpG sites...
#> Found 0 CpGs with average coverage > 2
#> Masked 54 [18.88%] CpG sites in 0.08s
#> An object of class scMethrix
#>    n_CpGs: 286
#>    n_samples: 4
#>    assays: score, counts
#>    reduced dims:
#>    is_h5: FALSE
#>    Reference: hg19
#>    Physical size: 47 Kb