Parse BED files for unique genomic coordinates
read_index( files, col_list, n_threads = 1, zero_based = FALSE, batch_size = 200, verbose = TRUE )
| Argument | Description |
|---|---|
| files | list of strings; file paths of BED files |
| col_list | string; The column index object for the input BED files |
| n_threads | integer; number of threads to use. Default 1. Be careful: memory usage increases linearly with the number of threads. This option does not work on Windows. |
| zero_based | boolean; whether the input coordinates are zero-based. Default FALSE |
| batch_size | integer; maximum number of files to hold in memory at once. Default 200 |
| verbose | boolean; whether to print progress messages. Default TRUE |
data.table containing all unique genomic coordinates
Creates a list of unique genomic regions from the input BED files. The function fills a list of
length batch_size+1 with the genomic coordinates read from the BED files. When the first batch_size
slots are full, it runs unique over the list and keeps the running result in position batch_size+1.
The result is also indexed on 'chr' and 'start' for later searching.
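The batching scheme described above can be sketched roughly as follows. This is an illustrative sketch only, not the package's implementation: the helper name `unique_coords_batched` is hypothetical, the inputs are in-memory data.tables rather than BED files, and using `setkeyv()` for the "indexing" step is an assumption.

```r
library(data.table)

# Hypothetical helper (not part of the package): collect unique (chr, start)
# pairs from a list of tables, holding at most `batch_size` tables at once.
unique_coords_batched <- function(tables, batch_size = 2) {
  # batch_size slots for incoming tables, plus one slot for the running result
  slots <- vector("list", batch_size + 1)
  n_filled <- 0
  for (tbl in tables) {
    n_filled <- n_filled + 1
    slots[[n_filled]] <- tbl[, c("chr", "start"), with = FALSE]
    if (n_filled == batch_size) {
      # rbindlist() skips NULL entries, so an empty running slot is harmless
      slots[[batch_size + 1]] <- unique(rbindlist(slots))
      slots[seq_len(batch_size)] <- list(NULL)  # free the batch slots
      n_filled <- 0
    }
  }
  # Fold in any tables left over from a partially filled final batch
  result <- unique(rbindlist(slots))
  # Assumption: keying on chr and start stands in for the indexing step
  setkeyv(result, c("chr", "start"))
  result
}

# Toy input standing in for three parsed BED files
a <- data.table(chr = c("chr1", "chr1"), start = c(10L, 20L))
b <- data.table(chr = c("chr1", "chr2"), start = c(20L, 5L))
d <- data.table(chr = "chr1", start = 10L)
res <- unique_coords_batched(list(a, b, d), batch_size = 2)
```

Keeping the running result in the extra slot means each call to `unique` only ever sees one batch plus the coordinates found so far, which bounds peak memory by the batch size rather than by the total number of files.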
if (FALSE) {
# Do nothing
}