Useful packages:

Interactive visualization:

plotly (including its ggplotly() function) Glimma

nice tables in Rmarkdown

DT

enrichment analysis

clusterProfiler

annotation of genomic regions

annotatr

Gene expression, sequencing

DESeq2 edgeR

mutation data

maftools

Visualization of genomic regions

Gviz

Handling genomic data (e.g. bam files)

rtracklayer

Methylation data

Array

minfi RnBeads

Sequencing

methrix bsseq

Visualization, heatmaps

pheatmap ComplexHeatmap

Nice color palettes

ggsci scico

Functions in R

R is a function-oriented language: almost every object manipulation is done with functions. A data analysis may well use only functions from existing packages, but it is useful to write our own as well. The main purpose is to avoid code repetition. As a rule of thumb, if you need to do the same thing more than twice, write a function instead.
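
As a small illustration of this rule of thumb, the sketch below wraps a rescaling step that would otherwise be copy-pasted for every vector. The function name rescale01 and the example vectors are made up for this illustration.

```r
# Hypothetical helper: rescale a numeric vector to the range 0-1
rescale01 <- function(v){
  rng <- range(v, na.rm = TRUE)        # minimum and maximum of v
  (v - rng[1]) / (rng[2] - rng[1])
}

# Example vectors, invented for this sketch
a <- c(1, 5, 9)
b <- c(10, 20, 40)

# One call per vector instead of repeating the formula
rescale01(a)   # 0, 0.5, 1
rescale01(b)
```

If the formula ever has to change, it now changes in one place only.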

General syntax

f <- function(){
  cat("Hello world! \n")
}
f()

## Hello world!

name: needed to call (run) the function later
arguments (optional)
expressions: what the function does
return value (optional)

f <- function(num1, num2){
  res <- num1+num2
  res
}
f(1,2)

## [1] 3

Return value

By default, a function returns the value of the last evaluated expression.
The return value can also be stated explicitly with return(). In this case, evaluation stops there and the function exits.

f <- function(num1, num2){
  res <- num1+num2
  return(res)
  cat("Hello world! \n")  # never reached: return() has already exited
}
f(1,2)

## [1] 3
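
An explicit return() is most useful for leaving a function early. The following is a minimal sketch of that pattern; the function name safe_log and its choice to return NA for unusable input are assumptions made for this example.

```r
# Hypothetical example: leave the function early when the input is not usable
safe_log <- function(x){
  if (!is.numeric(x) || any(x <= 0)) {
    return(NA)   # the function exits here, nothing below is evaluated
  }
  log(x)         # otherwise the last evaluated expression is returned
}

safe_log(-1)   # NA
safe_log(1)    # 0
```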

Arguments

Named arguments
Arguments can be matched by position, by name, or by a combination of both.

f <- function(num1, num2){
  print(num1)
  print(num2)
}
f(1,2)

## [1] 1
## [1] 2

f(num2=2, num1=1)

## [1] 1
## [1] 2

f(2, num1=1)

## [1] 1
## [1] 2

try(f(2))

## [1] 2
## Error in print(num2) : argument "num2" is missing, with no default
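
The matching order matters as soon as the arguments are not interchangeable. A short sketch with a made-up function, power, shows that named arguments are matched first and the remaining ones are then filled in by position:

```r
# Hypothetical function where the order of arguments changes the result
power <- function(base, exponent){
  base^exponent
}

power(2, 3)                    # by position: 2^3 = 8
power(exponent = 3, base = 2)  # by name, in any order: 8
power(3, base = 2)             # base is matched by name, 3 fills exponent: 8
power(3, 2)                    # by position: 3^2 = 9
```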

Arguments

Arguments can have default values, which are used when the caller does not supply them:

f <- function(num1, num2=3){
  print(num1)
  #browser()
  print(num2)
}
f(1,2)

## [1] 1
## [1] 2

f(num2=2, num1=1)

## [1] 1
## [1] 2

f(2)

## [1] 2
## [1] 3
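
Default values are handy for options that rarely change. In the sketch below, the function name center and its na.rm default are assumptions chosen for illustration; only mean() and its na.rm argument come from base R.

```r
# Hypothetical helper: subtract the mean, ignoring missing values by default
center <- function(v, na.rm = TRUE){
  v - mean(v, na.rm = na.rm)
}

center(c(1, 2, 3))                  # -1, 0, 1
center(c(1, NA, 3))                 # -1, NA, 1 (mean computed without the NA)
center(c(1, NA, 3), na.rm = FALSE)  # all NA, because the mean itself is NA
```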

Environment

Functions run in their own environment, which cannot be seen from the outside: a variable defined inside a function is not available once the function returns. Conversely, there are rules for which variables are visible from inside the function. R uses lexical scoping: a name that is not found inside the function is looked up in the environment where the function was defined (and then in that environment's parents), not where it was called.

g <- function(x){
  x*y
}
try(g(2))

## Error in g(2) : object 'y' not found

y <- 10
g(2)

## [1] 20

# But: a missing argument is not filled in from a global variable of the same name
x <- 1:10
try(mean())

## Error in mean.default() : argument "x" is missing, with no default

g <- function(x){
  ab <- 12
  x*y
}
try(print(ab))

## Error in print(ab) : object 'ab' not found

Why?

ls(environment(g))

## [1] "f" "g" "x" "y"

#ls(environment(mean))

search()

##  [1] ".GlobalEnv"        "package:stats"     "package:graphics" 
##  [4] "package:grDevices" "package:utils"     "package:datasets" 
##  [7] "package:methods"   "Autoloads"         "tools:callr"      
## [10] "package:base"
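
The difference between "where the function was defined" and "where it was called" can be made visible with a small sketch. The helper names h and k and the variable inside_only are invented for this example.

```r
y <- 10
g <- function(x){
  x * y            # y is not an argument, so R looks it up where g was defined
}

h <- function(){
  y <- 1000        # local to h; g does not see it
  g(2)
}
h()                # 20, not 2000

k <- function(){
  inside_only <- 1 # exists only while k is running
  inside_only
}
k()                # 1
exists("inside_only")  # FALSE, assuming no such variable was defined elsewhere
```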

Troubleshooting

Because a function runs in its own environment, it can be difficult to see what goes wrong inside it. Useful functions:

traceback() browser() debug() options(error=recover) options(error=NULL)
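
Most of these tools are interactive, so the sketch below only shows the parts that also run in a script: capturing an error with try() and switching debug mode on and off for a function. The function f and the faulty call are made up for this illustration.

```r
f <- function(num1, num2){
  num1 + num2
}

# The error is captured instead of stopping the script
res <- try(f(1, "a"), silent = TRUE)
inherits(res, "try-error")   # TRUE
cat(res)                     # prints the captured error message

# In an interactive session, the next call to f() would now be stepped through
debug(f)
isdebugged(f)                # TRUE
undebug(f)                   # switch debug mode off again
isdebugged(f)                # FALSE
```

In an interactive session, calling traceback() right after an uncaught error prints the chain of calls that led to it.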
