Trinity and Transcriptome Evaluation

Trinity: http://trinityrnaseq.github.io/

Transrate: http://hibberdlab.com/transrate/installation.html


Step 1: Launch an AMI. For this exercise, we will use a c4.4xlarge instance with a 500 GB EBS volume. Remember to restrict the permissions on your key file (change ????.pem to whatever you named it):

chmod 400 ~/Downloads/????.pem

ssh -i ~/Downloads/?????.pem ubuntu@ec2-???-???-???-???.compute-1.amazonaws.com
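ssh will typically refuse a private key that other users can read, which is why the chmod 400 step matters. A minimal sketch for checking the mode before connecting; key_is_private is a hypothetical helper name, and the demonstration uses a throwaway file rather than your real .pem key:

```shell
# key_is_private (hypothetical helper): succeed only if the file's mode
# string is exactly "-r--------", i.e. mode 400.
key_is_private() {
    # ls -l prints the mode string in the first ten characters
    mode=$(ls -l "$1" | cut -c1-10)
    [ "$mode" = "-r--------" ]
}

# Demonstration on a temporary file standing in for the real key
demo_key=$(mktemp)
chmod 644 "$demo_key"
key_is_private "$demo_key" || echo "permissions too open"
chmod 400 "$demo_key"
key_is_private "$demo_key" && echo "permissions ok"
```

On your own machine you would pass the path to your key, for example key_is_private ~/Downloads/????.pem with the real file name filled in.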

Update Software

sudo apt-get update

Install updates

sudo apt-get -y upgrade

Install other software: Note that you can install a large amount of software from the Ubuntu “App Store” using a single command. Some of this software we will not use for this tutorial, but...

sudo apt-get -y install build-essential tmux git gcc make g++ python-dev unzip default-jre libcurl4-openssl-dev zlib1g-dev python-pip fastqc samtools bowtie ncbi-blast+ hmmer emboss
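After a long install line it can be worth confirming that the programs actually ended up on your PATH. The sketch below assumes the command names that these packages normally provide (for example blastn from ncbi-blast+ and hmmsearch from hmmer); missing_tools is a hypothetical helper name:

```shell
# missing_tools (hypothetical helper): print each named command that is
# not found on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Command names assumed to be provided by the packages installed above
missing=$(missing_tools tmux git gcc make java fastqc samtools bowtie blastn hmmsearch)
if [ -n "$missing" ]; then
    echo "not found on PATH:"
    echo "$missing"
else
    echo "all tools found"
fi
```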

Mount hard drive: The EBS volume we asked for is not automatically mounted, so we need to format and mount it ourselves.

sudo mkfs -t ext4 /dev/xvdb
sudo mount /dev/xvdb /mnt
sudo chown -R ubuntu:ubuntu /mnt
df -h
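The df -h output should now list /mnt with roughly the size you requested. If you want a single number to check in a script, one possible approach is to parse the POSIX-format df output; avail_gb is a hypothetical helper name, and the fallback to / is only there so the sketch also runs on a machine without /mnt:

```shell
# avail_gb (hypothetical helper): whole gigabytes free on the filesystem
# that holds the given path. df -P keeps each filesystem on one line.
avail_gb() {
    df -Pk "$1" | awk 'NR==2 { printf "%d\n", $4 / 1024 / 1024 }'
}

# /mnt is the mount point used above; fall back to / if it does not exist
target=/mnt
[ -d "$target" ] || target=/
echo "free space on $target: $(avail_gb "$target") GB"
```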

Install Transrate

cd $HOME
curl -LO https://bintray.com/artifact/download/blahah/generic/transrate-1.0.1-linux-x86_64.tar.gz
tar -zxf transrate-1.0.1-linux-x86_64.tar.gz
cd transrate-1.0.1-linux-x86_64
PATH=$PATH:$(pwd)
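Note that PATH=$PATH:$(pwd) only affects the current shell session, and running it twice appends the same directory twice. A minimal sketch of an idempotent alternative, assuming the install location used above; add_to_path is a hypothetical helper name:

```shell
# add_to_path (hypothetical helper): append a directory to PATH only if
# it is not already listed.
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present, do nothing
        *) PATH="$PATH:$1" ;;
    esac
}

add_to_path "$HOME/transrate-1.0.1-linux-x86_64"
add_to_path "$HOME/transrate-1.0.1-linux-x86_64"   # second call is a no-op
echo "$PATH"
```

The same helper could be reused for the Augustus, BUSCO, and Trinity directories in the steps that follow.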

Install Augustus

cd $HOME
curl -O http://augustus.gobics.de/binaries/augustus.2.5.5.tar.gz
tar -zxf augustus.2.5.5.tar.gz
cd augustus.2.5.5/
make
PATH=$PATH:$(pwd)/bin
export AUGUSTUS_CONFIG_PATH=$(pwd)/config

Install BUSCO

cd $HOME
curl -O http://busco.ezlab.org/files/BUSCO_v1.1b1.tar.gz
tar -zxf BUSCO_v1.1b1.tar.gz
cd BUSCO_v1.1b1/
PATH=$PATH:$(pwd)

Install Trinity

cd $HOME
git clone https://github.com/trinityrnaseq/trinityrnaseq.git
cd trinityrnaseq
make -j4
PATH=$PATH:$(pwd)

Install Trimmomatic

cd $HOME
wget http://www.usadellab.org/cms/uploads/supplementary/Trimmomatic/Trimmomatic-0.33.zip
unzip Trimmomatic-0.33.zip
cd Trimmomatic-0.33
chmod +x trimmomatic-0.33.jar

Download data: For this lab, we’ll be using a pair of subsampled read files, subsamp.1.fq.gz and subsamp.2.fq.gz.

mkdir /mnt/reads
cd /mnt/reads/

curl -LO https://www.dropbox.com/s/4o6eduzcw11gz53/subsamp.2.fq.gz

curl -LO https://www.dropbox.com/s/i4wst01yz10i9x9/subsamp.1.fq.gz
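Before trimming, it may help to confirm that both downloads are intact gzip files and contain the same number of reads. The sketch below assumes the usual four lines per FASTQ record; count_fastq_reads is a hypothetical helper name, and the demonstration file is made up rather than taken from the real data:

```shell
# count_fastq_reads (hypothetical helper): number of records in a gzipped
# FASTQ file, assuming four lines per record.
count_fastq_reads() {
    gzip -t "$1" || return 1            # fail if the file is truncated or not gzip
    gzip -dc "$1" | awk 'END { print NR / 4 }'
}

# Demonstration on a tiny made-up file (two records), not the real reads
demo=$(mktemp)
printf '@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nIIII\n' | gzip -c > "$demo.gz"
count_fastq_reads "$demo.gz"
```

On the instance you would run count_fastq_reads /mnt/reads/subsamp.1.fq.gz and the same for subsamp.2.fq.gz; for paired-end data the two counts should match.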

Do 2 different trimming levels – Phred=2 and Phred=30: One of these is very harsh, the other is probably more appropriate. Which one is which?

Look at the summary line that each command below prints, which should start with Input Read Pairs:

mkdir /mnt/trimming
cd /mnt/trimming

#paste the below lines together as 1 command

java -Xmx10g -jar $HOME/Trimmomatic-0.33/trimmomatic-0.33.jar PE \
-threads 8 -baseout subsamp.Phred2.fq \
/mnt/reads/subsamp.1.fq.gz \
/mnt/reads/subsamp.2.fq.gz \
ILLUMINACLIP:$HOME/Trimmomatic-0.33/adapters/TruSeq3-PE.fa:2:30:10 \
SLIDINGWINDOW:4:2 \
LEADING:2 \
TRAILING:2 \
MINLEN:25

#and

java -Xmx10g -jar $HOME/Trimmomatic-0.33/trimmomatic-0.33.jar PE \
-threads 8 -baseout subsamp.Phred30.fq \
/mnt/reads/subsamp.1.fq.gz \
/mnt/reads/subsamp.2.fq.gz \
ILLUMINACLIP:$HOME/Trimmomatic-0.33/adapters/TruSeq3-PE.fa:2:30:10 \
SLIDINGWINDOW:4:30 \
LEADING:30 \
TRAILING:30 \
MINLEN:25
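The SLIDINGWINDOW, LEADING, and TRAILING thresholds above are Phred quality scores, where a score Q corresponds to a base-call error probability of 10^(-Q/10). A small sketch of that conversion, which may help when deciding which threshold is the harsh one; phred_to_error is a hypothetical helper name:

```shell
# phred_to_error (hypothetical helper): convert a Phred score Q to the
# error probability 10^(-Q/10), computed as exp(-Q/10 * ln 10).
phred_to_error() {
    awk -v q="$1" 'BEGIN { printf "%.4f\n", exp(-q / 10 * log(10)) }'
}

for q in 2 30; do
    echo "Phred $q -> error probability $(phred_to_error "$q")"
done
```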

Run Trinity

mkdir /mnt/assembly
cd /mnt/assembly

#Open tmux window

tmux new -s trinity

#Phred30 dataset

Trinity --seqType fq --max_memory 10G --left /mnt/trimming/subsamp.Phred30_1P.fq \
--right /mnt/trimming/subsamp.Phred30_2P.fq --CPU 16

#Phred2 dataset

Trinity --seqType fq --max_memory 10G --left /mnt/trimming/subsamp.Phred2_1P.fq \
--right /mnt/trimming/subsamp.Phred2_2P.fq --CPU 16
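Both commands above rely on Trinity's default output directory (trinity_out_dir under the current directory), so running them one after the other from /mnt/assembly would point the second run at the first run's directory. One possible way to keep the assemblies apart is Trinity's --output option; as far as we are aware, Trinity expects the directory name to contain "trinity". The sketch below only prints the commands rather than running them. trinity_cmd and the directory names are hypothetical, and the later steps in this tutorial, which read /mnt/assembly/trinity_out_dir/Trinity.fasta, would need their paths adjusted to match.

```shell
# trinity_cmd (hypothetical helper): print, but do not run, a Trinity
# command for one trimming level with its own output directory.
trinity_cmd() {
    label=$1
    echo "Trinity --seqType fq --max_memory 10G --CPU 16" \
         "--left /mnt/trimming/subsamp.${label}_1P.fq" \
         "--right /mnt/trimming/subsamp.${label}_2P.fq" \
         "--output /mnt/assembly/trinity_${label}"
}

trinity_cmd Phred30
trinity_cmd Phred2
```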

Fix Trinity Headers

sed -i 's_|_-_g' /mnt/assembly/trinity_out_dir/Trinity.fasta

Control-b d #to detach from tmux (reattach later with: tmux attach -t trinity)
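To see what the sed substitution above does, and to get a quick contig count, you can try it on a small file first. The two-contig FASTA below is made up for illustration, and its header layout may differ from what your Trinity version writes:

```shell
# Made-up two-contig FASTA for illustration only
demo_fa=$(mktemp)
printf '>TR1|c0_g1_i1 len=8\nACGTACGT\n>TR2|c0_g1_i1 len=4\nTTGA\n' > "$demo_fa"

# Same substitution as above, written to a new file instead of using -i
sed 's_|_-_g' "$demo_fa" > "$demo_fa.fixed"

grep '^>' "$demo_fa.fixed"              # headers after the substitution
grep -c '^>' "$demo_fa.fixed"           # number of contigs
grep -c '|' "$demo_fa.fixed" || true    # lines still containing a pipe
```

On the real assembly, grep -c '^>' /mnt/assembly/trinity_out_dir/Trinity.fasta gives the number of assembled contigs.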

Run BUSCO for assemblies: BUSCO provides Eukaryote, Metazoa, Arthropod, Vertebrate, and Plant reference sets for use with other genomes; here we use the vertebrate set.

mkdir /mnt/busco
cd /mnt/busco

tmux new -s busco

#Download busco database

curl -LO http://busco.ezlab.org/files/vertebrata_buscos.tar.gz
tar -zxf vertebrata_buscos.tar.gz

python3 /home/ubuntu/BUSCO_v1.1b1/BUSCO_v1.1b1.py \
-m trans -in /mnt/assembly/trinity_out_dir/Trinity.fasta \
--cpu 16 -l vertebrata -o trin.assembly

less run*/short*

Control-b d #to detach from tmux (reattach later with: tmux attach -t busco)

Run Transrate

tmux new -s transrate

mkdir /mnt/transrate
cd /mnt/transrate
$HOME/transrate-1.0.1-linux-x86_64/transrate -a /mnt/assembly/trinity_out_dir/Trinity.fasta -t 16 \
--left /mnt/trimming/subsamp.Phred30_1P.fq \
--right /mnt/trimming/subsamp.Phred30_2P.fq
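Transrate prints its summary to the screen and is also understood to write a comma-separated summary file; the exact file name and column names depend on the version, so treat the ones below as assumptions. A generic sketch for pulling one named column out of such a file follows; csv_field is a hypothetical helper, and the demonstration values are made up, not real Transrate output:

```shell
# csv_field (hypothetical helper): print the value in the named column of
# the first data row of a comma-separated file with a header line.
csv_field() {
    awk -F, -v name="$2" '
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == name) col = i }
        NR == 2 && col { print $col }
    ' "$1"
}

# Made-up values for illustration only
demo_csv=$(mktemp)
printf 'assembly,n_seqs,n50,score\nTrinity.fasta,100,500,0.25\n' > "$demo_csv"
csv_field "$demo_csv" n50
csv_field "$demo_csv" score
```

Reading fields by column name rather than position keeps the helper usable if the column order differs between versions.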

Control-b d #to detach from tmux (reattach later with: tmux attach -t transrate)

CHALLENGE: Talk to me for details...

What Genus/Species did I sequence? What tissue?


Terminate your instance


LICENSE: This documentation and all textual/graphic site content is licensed under the Creative Commons - 0 License (CC0) -- fork @ github. Presentations (PPT/PDF) and PDFs are the property of their respective owners and are under the terms indicated within the presentation.