How to run intron retention analysis

To run any of these commands, need to activate the bioconda IRFinder environment prior to running script.

First script creates reference build required for IRFinder

 #SBATCH -t 24:00:00                    # Runtime in minutes
 #SBATCH -n 4
 #SBATCH -p medium                # Partition (queue) to submit to
 #SBATCH --mem=128G        # 128 GB memory needed (memory PER CORE)
 #SBATCH -o %j.out               # Standard out goes to this file
 #SBATCH -e %j.err               # Standard err goes to this file
 #SBATCH --mail-type=END         # Mail when the job ends

 IRFinder -m BuildRefProcess -r reference_data/

NOTE: The files in the reference_data folder are sym links to the bcbio ref files and need to be named specifically genome.fa and transcripts.gtf:

genome.fa -> /n/app/bcbio/biodata/genomes/Hsapiens/hg19/seq/hg19.fa

transcripts.gtf -> /n/app/bcbio/biodata/genomes/Hsapiens/hg19/rnaseq/ref-transcripts.gtf

Second script (.sh) runs IRFinder and STAR on input file

   #!/bin/bash

   module load star/2.5.4a

   IRFinder -r /path/to/irfinder/reference_data \
   -t 4 -d results \
   $1

Third script (.sh) runs a batch job for each input file in directory

   #!/bin/bash

   for fq in /path/to/*fastq
   do

   sbatch -p medium -t 0-48:00 -n 4 --job-name irfinder --mem=128G -o %j.out -e %j.err --wrap="sh /path/to/irfinder/irfinder_input_file.sh $fq"
   sleep 1 # wait 1 second between each job submission

   done

Fourth script takes output (IRFinder-IR-dir.txt) and uses the replicates to determine differential expression using the Audic and Claverie test (# replicates < 4). analysisWithLowReplicates.pl script comes with the IRFinder github repo clone, so I cloned the repo at https://github.com/williamritchie/IRFinder/. Notes on the Audic and Claverie test can be found at: https://github.com/williamritchie/IRFinder/wiki/Small-Amounts-of-Replicates-via-Audic-and-Claverie-Test.

   #!/bin/bash

   #SBATCH -t 24:00:00                    # Runtime in minutes
   #SBATCH -n 4
   #SBATCH -p medium                # Partition (queue) to submit to
   #SBATCH --mem=128G        # 8 GB memory needed (memory PER CORE)
   #SBATCH -o %j.out               # Standard out goes to this file
   #SBATCH -e %j.err               # Standard err goes to this file
   #SBATCH --mail-type=END         # Mail when the job ends

   analysisWithLowReplicates.pl \
     -A A_ctrl/Pooled/IRFinder-IR-dir.txt A_ctrl/AJ_1/IRFinder-IR-dir.txt A_ctrl/AJ_2/IRFinder-IR-dir.txt A_ctrl/AJ_3/IRFinder-IR-dir.txt \
     -B B_nrde2/Pooled/IRFinder-IR-dir.txt B_nrde2/AJ_4/IRFinder-IR-dir.txt B_nrde2/AJ_5/IRFinder-IR-dir.txt B_nrde2/AJ_6/IRFinder-IR-dir.txt \
     > KD_ctrl-v-nrde2.tab

Output KD_ctrl-v-nrde2.tab file can be read directly into R for filtering and results exploration.
Rmarkdown workflow (included in report): IRFinder_report.md