Gets bedgraph column indexes from common pipeline output formats

get_source_idx(
  protocol = c("Bismark_cov", "MethylDackel", "MethylcTools", "BisSNP",
    "BSseeker2_CGmap")
)

Arguments

protocol

String; the pipeline that produced the bedgraph output. One of: "Bismark_cov", "MethylDackel", "MethylcTools", "BisSNP", "BSseeker2_CGmap"

Value

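List of column names and indexes, with the elements col_idx (column indexes, each named by its column type), col_names (names for the selected columns), fix_missing (expressions for deriving columns the format does not provide, e.g. "cov := M+U"), and select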

Details

Typically used to reduce the number of potential CpG sites to only those present in the input files, which improves performance and lowers resource use. Can also be used for quality control, to check whether an excessive number of CpG sites are not present in the reference genome.

Examples

get_source_idx("MethylDackel")
#> $col_idx
#> $col_idx$character
#> [1] 1
#>
#> $col_idx$numeric
#> [1] 2
#>
#> $col_idx$numeric
#> [1] 4
#>
#> $col_idx$numeric
#> [1] 5
#>
#> $col_idx$numeric
#> [1] 6
#>
#>
#> $col_names
#> [1] "chr"   "start" "beta"  "M"     "U"
#>
#> $fix_missing
#> [1] "cov := M+U"
#>
#> $select
#> [1] TRUE
#>
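
The following is a minimal base R sketch of how a list like the one printed above might be applied to a bedgraph table. It is an illustration only, not the package's own reader: the `idx` list is a hand-written copy of the printed output, the two input rows are made up, and the six-column layout (chr, start, end, beta, M, U) is an assumption about MethylDackel-style files.

```r
# Hand-written copy of the list printed above for "MethylDackel"
# (assumption: in real use this would come from get_source_idx("MethylDackel")).
idx <- list(
  col_idx = list(character = 1, numeric = 2, numeric = 4,
                 numeric = 5, numeric = 6),
  col_names = c("chr", "start", "beta", "M", "U"),
  fix_missing = "cov := M+U",
  select = TRUE
)

# Two made-up rows in an assumed six-column layout: chr, start, end, beta, M, U.
bdg_text <- "chr1\t100\t101\t75\t3\t1\nchr1\t200\t201\t0\t0\t4"
raw <- read.table(text = bdg_text, sep = "\t", header = FALSE,
                  stringsAsFactors = FALSE)

# Keep only the indexed columns and give them the standard names.
keep <- unlist(idx$col_idx, use.names = FALSE)
bdg <- raw[, keep]
names(bdg) <- idx$col_names

# Apply the "cov := M+U" rule by hand: coverage is methylated plus
# unmethylated read counts.
bdg$cov <- bdg$M + bdg$U

print(bdg)
```

Column 3 (the end coordinate in this assumed layout) is dropped because it is not among the indexes, and `cov` is derived afterwards because the input does not carry a coverage column of its own.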