Subsets a given list of CpGs by another list of CpGs
subset_ref_cpgs(ref_cpgs, gen_cpgs, verbose = TRUE)
| Argument | Description |
|---|---|
| ref_cpgs | data.table; A reference set of CpG sites (e.g. Hg19 or mm10) in bedgraph format |
| gen_cpgs | data.table; A subset of CpG sites. Usually obtained from |
| verbose | boolean; flag to output messages or not |
Returns a list of CpG sites in bedgraph format.
Typically used to reduce the set of potential CpG sites to only those present in the input files, which improves performance and reduces resource use. It can also be used for quality control, to check whether an excessive number of CpG sites in the input are absent from the reference genome.
ref_cpgs = data.frame(chr="chr1",start=(1:5*2-1), end=(1:5*2))
subset_ref_cpgs(ref_cpgs,ref_cpgs[1:3,])
#>    chr start end
#> 1 chr1     1   2
#> 2 chr1     3   4
#> 3 chr1     5   6
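The two uses described above (reducing the reference and checking for sites missing from it) can be illustrated with a minimal base R sketch. This is not the package's implementation. It assumes that sites are matched on chromosome and start position, and the names `ref_key`, `gen_key`, `reduced_ref` and `n_missing` are hypothetical.

```r
# Illustrative sketch only; assumes sites are matched on chr and start.
ref_cpgs <- data.frame(chr = "chr1", start = 1:5 * 2 - 1, end = 1:5 * 2)

# Observed sites: three that are in the reference and one that is not.
gen_cpgs <- rbind(
  ref_cpgs[1:3, ],
  data.frame(chr = "chr1", start = 100, end = 101)
)

# Build a "chr:start" key for every site in both tables.
ref_key <- paste(ref_cpgs$chr, ref_cpgs$start, sep = ":")
gen_key <- paste(gen_cpgs$chr, gen_cpgs$start, sep = ":")

# Keep only the reference sites that were actually observed.
reduced_ref <- ref_cpgs[ref_key %in% gen_key, ]

# Quality control: count observed sites that are absent from the reference.
n_missing <- sum(!gen_key %in% ref_key)
```

A large `n_missing` relative to `nrow(gen_cpgs)` may indicate that the input files were generated against a different genome build than the reference.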