knowledgebase

recipes that save time


Methods for IRFinder

Intron retention analysis was performed with IRFinder version 1.2.3, using the gene annotations for the hg19 reference genome.

For more detailed descriptions, see the IRFinder wiki.

library(tidyverse)
library(annotables)

# Small number of replicates: results of the Audic and Claverie test
irfinder_AC_test <- read.table("/path/to/IRFinder/results.tab", header = TRUE)

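# Split the combined name/ID/flag field on "/" into a three-column character matrix
parsed_rownames <- str_split(irfinder_AC_test$Intron.GeneName.GeneID, "/", simplify = TRUE)

# Append the parsed columns; data.frame() names them X1, X2 and X3
irfinder_AC_test <- data.frame(irfinder_AC_test, parsed_rownames)

irfinder_AC_test <- irfinder_AC_test %>%
  rename(gene = X1,
         ensembl_id = X2,
         splicing = X3)

# Keep one biotype/description row per Ensembl gene ID from the annotables GRCh37 table
grch37_description <- grch37[, c("ensgene", "biotype", "description")]

grch37_description <- grch37_description[which(!(duplicated(grch37_description$ensgene))), ]

# Annotate each intron with the biotype and description of its gene
irfinder_ACtest_merged <- merge(irfinder_AC_test, grch37_description, by.x = "ensembl_id", by.y = "ensgene")

# Drop the fifth column by position (expected to be the now-redundant combined
# Intron.GeneName.GeneID field; check the column order of your own file)
irfinder_ACtest_merged <- irfinder_ACtest_merged[, -5]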

# Adjust p-values for multiple testing (Benjamini-Hochberg)
irfinder_ACtest_merged$padj <- p.adjust(irfinder_ACtest_merged$p.diff, method = "BH")

# Order by padj
irfinder_ACtest_merged <- irfinder_ACtest_merged[order(irfinder_ACtest_merged$padj), ]

# Write all results to file
write.csv(irfinder_ACtest_merged, "results/irfinder_ACtest_all_results_padj.csv")

# Determine significant results with padj < 0.05
sig_irfinder_ACtest <- irfinder_ACtest_merged[which(irfinder_ACtest_merged$padj < 0.05), ]

sig_irfinder_ACtest <- sig_irfinder_ACtest[order(sig_irfinder_ACtest$p.diff),]
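The parsing and multiple-testing steps above can be checked on a small toy table before running them on real output. The sketch below is only an illustration: the gene names, Ensembl IDs, flags and p-values are invented, and it uses base R `strsplit()` in place of `str_split()` so that it runs without extra packages.

```r
# Toy stand-in for IRFinder output; all values are invented for illustration
toy <- data.frame(
  Intron.GeneName.GeneID = c("GENE1/ENSG00000000001/clean",
                             "GENE2/ENSG00000000002/clean",
                             "GENE3/ENSG00000000003/known-exon"),
  p.diff = c(0.001, 0.02, 0.04),
  stringsAsFactors = FALSE
)

# Split "name/id/flag" into three columns
# (base R equivalent of str_split(..., simplify = TRUE))
parts <- do.call(rbind, strsplit(toy$Intron.GeneName.GeneID, "/", fixed = TRUE))
colnames(parts) <- c("gene", "ensembl_id", "splicing")
toy <- cbind(toy, as.data.frame(parts, stringsAsFactors = FALSE))

# Benjamini-Hochberg adjustment of the raw p-values
toy$padj <- p.adjust(toy$p.diff, method = "BH")

print(toy[, c("gene", "ensembl_id", "splicing", "p.diff", "padj")])
```

With three tests, the adjusted values are 0.003, 0.03 and 0.04: each sorted p-value is multiplied by 3 divided by its rank, and the results are then made monotone.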

Results

There were ### significantly retained introns. The output reported for each intron is described in the IRFinder wiki and in an analysis example.

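Similar to the criteria used in the IRFinder paper, significant intron retention events were defined as introns with an adjusted p-value (padj) below 0.05, an IR ratio above 0.1 in condition A or B, and an intron depth of more than three reads in condition A or B after excluding non-measurable intronic regions.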

All results can be accessed using the links below the table. The top 20 most significant retained introns are displayed below:

# Only returning those introns represented in more than 10% of transcripts in A or B
sig_irfinder_filtered <- sig_irfinder_ACtest[which(sig_irfinder_ACtest$A.IRratio > 0.1 | sig_irfinder_ACtest$B.IRratio > 0.1), ]

# Only returning those introns with a depth of more than three reads across the
# entire intron (in A or B) after excluding non-measurable intronic regions
sig_irfinder_filtered2 <- sig_irfinder_filtered[which(sig_irfinder_filtered$A.IntronDepth > 3 | sig_irfinder_filtered$B.IntronDepth > 3), ]

# Order by padj and format the p-values in scientific notation for display
sig_irfinder_filtered2 <- sig_irfinder_filtered2[order(sig_irfinder_filtered2$padj), ]
sig_irfinder_filtered2$p.diff <- formatC(sig_irfinder_filtered2$p.diff, format = "e", digits = 2)
sig_irfinder_filtered2$p.increased <- formatC(sig_irfinder_filtered2$p.increased, format = "e", digits = 2)
sig_irfinder_filtered2$p.decreased <- formatC(sig_irfinder_filtered2$p.decreased, format = "e", digits = 2)
# Show the top 20 introns; head() avoids NA rows when fewer than 20 remain
knitr::kable(head(sig_irfinder_filtered2, 20))
write.csv(sig_irfinder_filtered2, "results/significant_irfinder_ACtest_results_padj.csv")

[Download all results]

[Download significant results]
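The three filtering criteria used in this recipe (padj, IR ratio, intron depth) can also be collected into one helper, which makes the thresholds easier to change. This is a minimal sketch, not part of IRFinder: `filter_ir_events()` and its argument names are hypothetical, the toy values are invented, and the column names are assumed to match those used in the code above.

```r
# Hypothetical helper: keep introns passing all three criteria from this recipe
filter_ir_events <- function(df, padj_cutoff = 0.05, min_ratio = 0.1, min_depth = 3) {
  keep <- df$padj < padj_cutoff &
    (df$A.IRratio > min_ratio | df$B.IRratio > min_ratio) &
    (df$A.IntronDepth > min_depth | df$B.IntronDepth > min_depth)
  # which() drops rows where a criterion evaluates to NA
  df[which(keep), ]
}

# Toy data; all values are invented for illustration
toy <- data.frame(
  gene          = c("GENE1", "GENE2", "GENE3", "GENE4"),
  padj          = c(0.001, 0.20, 0.01, 0.03),
  A.IRratio     = c(0.30, 0.50, 0.02, 0.15),
  B.IRratio     = c(0.05, 0.40, 0.03, 0.02),
  A.IntronDepth = c(12, 20, 15, 1),
  B.IntronDepth = c(2, 18, 10, 2),
  stringsAsFactors = FALSE
)

kept <- filter_ir_events(toy)
print(kept$gene)
```

Only GENE1 passes here: GENE2 fails the padj cutoff, GENE3 has an IR ratio below 0.1 in both conditions, and GENE4 has a depth of three reads or fewer in both conditions.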

The significantly retained introns were explored for several genes of interest.

# Fill c() with the gene symbols of interest; as written, the empty vector selects no rows
interesting_genes <- sig_irfinder_filtered2[sig_irfinder_filtered2$gene %in% c(), ]
knitr::kable(interesting_genes)

sessionInfo()