R/read_beds.R
read_hdf5_data.Rd
Writes values from input BED files into an on-disk HDF5Array.
read_hdf5_data( files, ref_cpgs, col_list, batch_size = 20, n_threads = 1, h5_temp = NULL, zero_based = FALSE, strand_collapse = FALSE, verbose = TRUE )
| Argument | Description |
|---|---|
| files | list of strings; file paths of the BED files |
| ref_cpgs | data.table; list of CpG sites in the tab-delimited format of chr-start-end. Coordinates must be zero-based. |
| col_list | string; the column index object for the input BED files |
| batch_size | integer; the number of files to hold in memory at once. Default 20. |
| n_threads | integer; number of threads to use. Default 1. Be careful: memory usage increases linearly with the number of threads. This option does not work on Windows. |
| h5_temp | string; temporary directory in which to store the HDF5 files. Default NULL. |
| zero_based | boolean; flag for whether the input data is zero-based or not. Default FALSE. |
| strand_collapse | boolean; whether to collapse the Crick strand into the Watson strand. Default FALSE. |
| verbose | boolean; flag to output messages or not. Default TRUE. |
A list of HDF5Array objects: element 1 holds the methylation values and element 2 holds the coverage. If no cov_idx is specified, element 2 is NULL.
Using the generated index of genomic coordinates, creates an NA-filled dense matrix of methylation values with one column per BED file/sample. Each column contains the methylation values for a single sample; reference CpG sites that are absent from a sample's BED file remain NA.
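The NA-based fill described above can be illustrated with a minimal base R sketch. This is not the package's implementation: the toy `ref_cpgs` and `samples` data frames, the `beta` column name, and the `chr:start` key are all assumptions made for illustration, and the result is an ordinary in-memory matrix rather than an HDF5Array.

```r
# Hypothetical reference CpG index (zero-based starts), standing in for ref_cpgs.
ref_cpgs <- data.frame(
  chr   = c("chr1", "chr1", "chr1", "chr2"),
  start = c(100L, 250L, 400L, 50L)
)

# Two toy samples, each covering only a subset of the reference sites.
# The column name `beta` is an assumption for this sketch.
samples <- list(
  s1 = data.frame(chr = c("chr1", "chr2"), start = c(100L, 50L),
                  beta = c(0.8, 0.1)),
  s2 = data.frame(chr = c("chr1", "chr1"), start = c(250L, 400L),
                  beta = c(0.5, 1.0))
)

# One key per reference site, used to locate each sample row in the index.
ref_key <- paste(ref_cpgs$chr, ref_cpgs$start, sep = ":")

# Start from an all-NA dense matrix: one row per CpG, one column per sample.
meth <- matrix(
  NA_real_,
  nrow = nrow(ref_cpgs),
  ncol = length(samples),
  dimnames = list(ref_key, names(samples))
)

# Fill each column from its sample; sites missing from a sample stay NA.
for (s in names(samples)) {
  bed  <- samples[[s]]
  idx  <- match(paste(bed$chr, bed$start, sep = ":"), ref_key)
  keep <- !is.na(idx)  # drop sample rows that are not in the reference
  meth[idx[keep], s] <- bed$beta[keep]
}

print(meth)
```

Because every sample is aligned to the same reference index, rows are directly comparable across columns, at the cost of storing an NA for every site a sample does not cover.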
if (FALSE) {
  # Do nothing
}