Protocols

Demultiplexing Illumina Samples

If your FASTQ file(s) have not yet been demultiplexed, you can use the FASTX-Toolkit Barcode Splitter. This tool is also available through Galaxy.

Inputs:

  1. Barcode text file with the first column being the sample name and the second column the barcode:

       #This line is a comment (starts with a 'number' sign)
       BC1 GATCT
       BC2 ATCGT
       BC3 GTGAT
       BC4 TGTCT

  2. Compressed or uncompressed FASTQ file of all reads

Outputs:

  1. One FASTQ file per sample
  2. HTML formatted report
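
To illustrate what the barcode splitting step does, here is a minimal Python sketch. It is not the FASTX-Toolkit implementation: it assumes the barcode sits at the very start (5' end) of each read and only accepts exact matches, whereas the real tool offers more options (such as tolerating mismatches). The function names `read_barcodes`, `read_fastq`, and `demultiplex` are hypothetical, as are the tiny in-memory example inputs.

```python
import io

def read_barcodes(handle):
    """Parse a barcode file: sample name, whitespace, barcode. '#' lines are comments."""
    barcodes = {}
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, barcode = line.split()[:2]
        barcodes[barcode] = name
    return barcodes

def read_fastq(handle):
    """Yield (header, sequence, quality) from a FASTQ stream with 4 lines per record."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break
        seq = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual

def demultiplex(fastq_handle, barcodes):
    """Group reads by an exact barcode match at the start of the read (assumption)."""
    bins = {name: [] for name in barcodes.values()}
    bins["unmatched"] = []
    for header, seq, qual in read_fastq(fastq_handle):
        for barcode, name in barcodes.items():
            if seq.startswith(barcode):
                bins[name].append((header, seq, qual))
                break
        else:
            # No barcode matched this read.
            bins["unmatched"].append((header, seq, qual))
    return bins

# Made-up example data, small enough to check by eye.
barcode_text = "#comment\nBC1 GATCT\nBC2 ATCGT\n"
fastq_text = (
    "@r1\nGATCTAAAA\n+\nIIIIIIIII\n"
    "@r2\nATCGTCCCC\n+\nIIIIIIIII\n"
    "@r3\nTTTTTGGGG\n+\nIIIIIIIII\n"
)
bins = demultiplex(io.StringIO(fastq_text), read_barcodes(io.StringIO(barcode_text)))
for name, reads in bins.items():
    print(name, len(reads))
```

In this sketch the reads are kept in memory per sample; for real data you would write each bin to its own FASTQ file instead, which corresponds to the "one FASTQ file per sample" output listed above.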

Raw FASTQ Quality Control

To assess the quality of your raw, unaligned reads, use FastQC.

Inputs:

  1. FASTQ file

Outputs:

  1. HTML formatted report
  2. Text formatted report
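
One of the things a quality report typically summarises is base quality by position in the read. The following Python sketch shows the idea only; it is not how FastQC is implemented. It assumes Phred+33 encoded quality strings, and the function names `phred_scores` and `mean_quality_per_position` are hypothetical.

```python
def phred_scores(quality_string, offset=33):
    """Convert a FASTQ quality string to numeric Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_string]

def mean_quality_per_position(quality_strings, offset=33):
    """Average the Phred score at each read position; reads may differ in length."""
    totals, counts = [], []
    for qual in quality_strings:
        for i, score in enumerate(phred_scores(qual, offset)):
            if i == len(totals):
                # First read long enough to reach this position.
                totals.append(0)
                counts.append(0)
            totals[i] += score
            counts[i] += 1
    return [t / c for t, c in zip(totals, counts)]

# Made-up quality strings: 'I' = Q40, '?' = Q30, '5' = Q20.
quals = ["IIII5", "II?5", "I?5"]
means = mean_quality_per_position(quals)
for position, mean in enumerate(means, start=1):
    print(position, round(mean, 1))
```

A falling mean towards the later positions is the pattern that usually motivates trimming before mapping.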

Raw FASTQ Processing

Sometimes you might notice that your FASTQ file still contains adapter sequence on the 5’ or (more likely) the 3’ end of your reads. You might also notice that the quality of your reads gets relatively low towards the 3’ end (as DNA polymerase adds dNTPs from the 5’ to the 3’ end of the single-stranded fragments).
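
To make the 3’ quality drop concrete, the Python sketch below removes low-quality bases from the 3’ end of a single read. It is a simplified illustration, not the algorithm of any particular trimming tool (dedicated trimmers generally use more robust schemes). The function name `trim_3prime`, the threshold of Q20, and the Phred+33 encoding are all assumptions.

```python
def trim_3prime(seq, qual, min_quality=20, offset=33):
    """Drop bases from the 3' end until one meets the quality threshold."""
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset < min_quality:
        end -= 1
    return seq[:end], qual[:end]

# Made-up read: 'I' = Q40, '5' = Q20, '#' = Q2, '!' = Q0.
seq, qual = "ACGTACGT", "IIIII5#!"
trimmed_seq, trimmed_qual = trim_3prime(seq, qual)
print(trimmed_seq, trimmed_qual)
```

Note that the sequence and quality strings are always cut at the same position, so the trimmed record remains a valid FASTQ entry.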

To remove adapter contamination and/or low-quality base calls, several different tools can be used:

Inputs:

Outputs:


Short Read Mapping Tools

Once you are satisfied that your reads are of high enough quality and free of detectable contamination, you can map them to a reference sequence, if one is available. Your choice of mapper will depend on the type of sequencing data you have.

General Use Mappers

  • BWA
  • Bowtie2
  • Bowtie
  • SMALT

RNA-seq Mappers

  • HISAT2
  • TopHat2
  • STAR

Pseudoaligners

  • Kallisto

Calling SNPs and Small Indels



Updated - 2016-04-08