Home

Data processing

Data generation

Steps taken to process the data.

Creating the highest-confidence set of TSS and TES predictions. The TSSs in this set can be interpreted as the most commonly used ones:

Make final UTRs

Quantifying sense and anti-sense transcription:

Calculate transcript abundance

Finding all neighboring genes in the genome, the distances between them, and their expression correlations:

Calculate neighboring gene distance and co-expression
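
To make the neighboring-gene calculation concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the code used on the linked page: the tuple layout `(gene_id, chrom, start, end)`, the `expression` dictionary, and the function names are all hypothetical, and Pearson correlation is assumed as the co-expression measure.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a constant profile
    return cov / sqrt(var_x * var_y)


def neighbor_pairs(genes, expression):
    """Return (left_id, right_id, distance, correlation) for adjacent genes.

    genes: list of (gene_id, chrom, start, end) tuples (hypothetical layout).
    expression: dict mapping gene_id to a list of abundance values.
    """
    by_chrom = {}
    for gene in genes:
        by_chrom.setdefault(gene[1], []).append(gene)

    pairs = []
    for chrom in sorted(by_chrom):
        ordered = sorted(by_chrom[chrom], key=lambda g: g[2])
        for left, right in zip(ordered, ordered[1:]):
            # Overlapping genes are reported with a distance of 0.
            distance = max(0, right[2] - left[3])
            r = pearson(expression[left[0]], expression[right[0]])
            pairs.append((left[0], right[0], distance, r))
    return pairs
```

Genes are only paired within the same chromosome, so the last gene on one chromosome is never treated as a neighbor of the first gene on the next.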

Defining genome-wide TSSs by clustering CAGE transcription start sites (CTSSs) using CAGEr:

CTSS Clustering
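
The idea behind clustering CTSSs can be sketched in a few lines of Python. CAGEr is an R/Bioconductor package with its own clustering routines, so the code below is not CAGEr's implementation; it is a simplified distance-based illustration. The function name, the input layout, and the `max_dist` value of 20 bp are assumptions chosen for the example.

```python
def cluster_ctss(ctss, max_dist=20):
    """Group CTSSs on one chromosome and strand into tag clusters.

    ctss: iterable of (position, tag_count) pairs (hypothetical layout).
    max_dist: assumed maximum gap, in bp, between consecutive CTSSs
              that still belong to the same cluster.
    """
    clusters = []
    for pos, count in sorted(ctss):
        if clusters and pos - clusters[-1]["end"] <= max_dist:
            cluster = clusters[-1]
            cluster["end"] = pos
            cluster["tags"] += count
            # Track the CTSS with the most tags as the dominant TSS.
            if count > cluster["dominant_count"]:
                cluster["dominant"] = pos
                cluster["dominant_count"] = count
        else:
            clusters.append({
                "start": pos,
                "end": pos,
                "tags": count,
                "dominant": pos,
                "dominant_count": count,
            })
    return clusters
```

Each resulting cluster records its span, its total tag count, and the position of its most heavily used CTSS.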

Quality control

Does the data meet our assumptions before we begin our analysis?

How well do our abundance estimates match previously generated microarray data?

Comparing RNA-seq and microarray readouts
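
One common way to compare two expression readouts measured on different scales is a rank correlation. The sketch below assumes Spearman correlation; the statistic actually used on the linked page may differ, and the function names are hypothetical.

```python
from math import sqrt


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mean_x = sum(rx) / n
    mean_y = sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one readout is constant
    return cov / sqrt(var_x * var_y)
```

Because only ranks are compared, the result is unaffected by monotonic transformations such as log-scaling either readout.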

Can we comment on the technical aspects of the sequencing protocol with regard to potential GC bias?

Checking for GC content bias
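
A simple way to look for GC bias is to bin genomic windows by GC content and compare mean read coverage across bins. The following is a minimal sketch of that idea, not the analysis on the linked page; the input layout `(sequence, mean_coverage)`, the function names, and the choice of ten bins are assumptions.

```python
def gc_counts(seq):
    """Return (number of G/C bases, number of called A/C/G/T bases)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return gc, gc + at


def mean_coverage_by_gc(windows, n_bins=10):
    """Average coverage of windows grouped by GC-content bin.

    windows: iterable of (sequence, mean_coverage) pairs (hypothetical layout).
    Returns a dict mapping bin index (0 = lowest GC) to mean coverage.
    """
    totals = {}
    for seq, coverage in windows:
        gc, called = gc_counts(seq)
        if called == 0:
            continue  # skip windows made up entirely of ambiguous bases
        # Integer arithmetic avoids floating-point edge effects at bin borders.
        bin_index = min(gc * n_bins // called, n_bins - 1)
        running_sum, count = totals.get(bin_index, (0.0, 0))
        totals[bin_index] = (running_sum + coverage, count + 1)
    return {b: s / n for b, (s, n) in sorted(totals.items())}
```

Roughly flat coverage across bins would suggest little GC bias, while a systematic trend would suggest that the protocol favors or disfavors certain base compositions.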

How do TSS prediction methods compare?

Comparing TSS prediction methods

Data analysis

RNA-seq overview

Overview plots and statistics about the RNA-seq data.

Extensive transcription

Neighboring genes

What does the genome-wide view of neighboring genes look like before and after predicting full-length UTRs? How do the distances between genes correlate with their co-expression?

Neighboring genes

Promoter architecture

What can the CAGE data tell us about promoters across the P. falciparum genome? Do we see both sharp and broad promoters? How many of each?

Promoter architecture
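
Sharp and broad promoters are often distinguished by the interquantile width of a tag cluster, that is, the span containing the central bulk of its CAGE tags. The sketch below illustrates that idea under stated assumptions: the 0.1 and 0.9 quantiles, the 10 bp cutoff, and the function names are illustrative choices, not values taken from the linked analysis.

```python
def quantile_position(ctss, q):
    """Position at which the cumulative tag count first reaches fraction q."""
    if not ctss:
        raise ValueError("cluster has no CTSSs")
    total = sum(count for _, count in ctss)
    running = 0
    for pos, count in sorted(ctss):
        running += count
        if running >= q * total:
            return pos
    return max(pos for pos, _ in ctss)  # fallback for rounding at q close to 1


def interquantile_width(ctss, q_low=0.1, q_high=0.9):
    """Width in bp of the region between the q_low and q_high positions."""
    return quantile_position(ctss, q_high) - quantile_position(ctss, q_low) + 1


def classify_promoter(ctss, max_sharp_width=10):
    """Label a cluster 'sharp' or 'broad' using an assumed width cutoff."""
    if interquantile_width(ctss) <= max_sharp_width:
        return "sharp"
    return "broad"
```

For example, a cluster with nearly all tags at a single position has an interquantile width of 1 and is labeled sharp, while a cluster whose tags are spread over tens of base pairs is labeled broad.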

Do we see alternative transcription start sites being used often?

Alternative TSS usage

Transcription factor binding sites

Based on our newly predicted TSSs, can we make refined genome-wide TFBS predictions? Do these predictions give us any additional insight?

Transcription factor binding site predictions

Strain comparison

Which genes are differentially expressed between the three strains? Which genes are differentially detected between the three strains?

Comparing 3D7, HB3 and IT: