To support molecular and cell biologists in their quest for autonomy when dealing with bio-informatics, bio-statistics, genomics or systems biology, Pr Sandrine Lagarrigue - supported by Agrocampus Ouest - started offering a highly multidisciplinary and modular workshop dedicated to biologists ten years ago. I joined the teaching staff in 2014 and have since contributed to yearly training days on sequence bio-informatics and statistical analyses for genomic and epigenomic data.
The workshop dedicated to epigenomic data analysis addresses the following topics:
- Introduction to epigenome screening technologies (ChIP-seq, ATAC-seq, MNase-seq …)
- Biases and noise in sequencing data
- Introduction to UNIX environment and Bash scripting
- Work on a cluster with job schedulers (SLURM and SGE)
- Use parallel environments
- Use genomic databases (Ensembl, UCSC)
- Quality check of ChIP-seq data
- Pre-processing and alignment of ChIP-seq data
- Peak calling
- Irreproducible discovery rate
- Use of R Bioconductor for functional genomics
- Normalization strategies for epigenomic assays (RPM, TMM, Lowess)
- Functional enrichment for epigenomic assays
- Motif enrichment analysis
The pedagogical objective is to help experimental biologists understand what each computational step actually measures. Rather than presenting pipelines as black boxes, the course links library preparation, assay-specific biases, alignment choices, normalization, peak calling and downstream interpretation to concrete biological questions.
In recent teaching activities, I have also extended this logic to ATAC-seq-focused training, including chromatin accessibility, nucleosome positioning, transcription factor footprinting and the interpretation of peak-centered functional enrichment analyses.
Below, you will find slides from our last ChIP-seq workshop (in french).
Day 1
Day 2
Bioconductor guide for ChIP-seq analysis