Difference between revisions of "HWbioinf2"

Revision as of 21:55, 3 April 2024

Task C: Visualizing in IGV

As before, run IGV as follows:

igv -g ref.fasta &

Open additional files using the menu File -> Load from File: annot.gff, augustus-anidulans.gtf, augustus-human.gtf, rnaseq.bam.

Exons are shown as thicker boxes; introns are thinner.
For each of the following questions, select a part of the sequence illustrating the answer and export a figure using File -> Save image.
You can check these images using the eog command.

Questions:

(a) Create an image illustrating differences between Augustus with human parameters and the reference annotation, save as a.png. Briefly describe the differences in words.

(b) Find some differences between Augustus with A. nidulans parameters and the reference annotation. Store an illustrative figure as b.png. Which parameters have yielded a more accurate prediction?

(c) Zoom in to one of the genes with a high expression level and try to find spliced read alignments supporting the annotated intron boundaries. Store the image as c.png.

Submit the files a.png, b.png, and c.png. Write your answers in the protocol.
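Before submitting, it may help to confirm that all three images were actually saved. The following is only a sketch: the function name check_submission is made up for illustration, and it tests nothing more than that each file exists and is not empty.

```bash
# Hypothetical helper: report any listed file that is missing or empty.
# Returns 0 if all files are present and non-empty, 1 otherwise.
check_submission() {
    local missing=0
    local f
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

It could be called as check_submission a.png b.png c.png in the directory where the images were saved. It does not check that the images show the right region.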

@@ Line 44: / Line 44: @@
 ===Task B: Aligning RNA-seq reads===
-* Align RNA-seq reads to the genome
+* Align RNA-seq reads to the genome.
-* We will use a specialized tool <tt>hisat2</tt>, which can recognize introns
+* We will use a specialized tool <tt>STAR</tt>, which can recognize introns.
-* Then we will sort and index the BAM file, similarly as in the [[HWbioinf1|previous lecture]]
+* First, run the following commands:
-<!--
 <syntaxhighlight lang="bash">
-bowtie2-build ref.fasta ref.fasta
+STAR --runMode genomeGenerate --genomeDir ref-index --genomeFastaFiles ref.fasta  --genomeSAindexNbases 6
-tophat2 -i 10 -I 10000 --max-multihits 1 --output-dir rnaseq ref.fasta rnaseq.fastq
+STAR --genomeDir ref-index --alignIntronMax 10000 --readFilesIn rnaseq.fastq --outFileNamePrefix rnaseq
-samtools sort rnaseq/accepted_hits.bam rnaseq
-samtools index rnaseq.bam
 </syntaxhighlight>
+* Then sort the resulting SAM file using samtools, store it as a BAM file, and create its index, as in the [[HWbioinf1|previous homework]].
-In addition to the BAM file, TopHat produced several other files in the <tt>rnaseq</tt> folder. Examine them to find out answers to the following questions (you can do it manually by looking at the files, e.g. by <tt>less</tt> command):
+* In addition to the BAM file, we produced a file containing the positions of detected introns. Examine the files to answer the following questions (you can do it manually by looking at the files, e.g. with the <tt>less</tt> command):
--->
-<syntaxhighlight lang="bash">
-hisat2-build ref.fasta ref.fasta
-hisat2 -x ref.fasta -U rnaseq.fastq -S rnaseq.sam -k 1 --min-intronlen 20 --max-intronlen 10000 --novel-splicesite-outfile introns.txt
-</syntaxhighlight>
-<!-- samtools sort -O BAM -o rnaseq.bam rnaseq.sam
-samtools index rnaseq.bam -->
-After the hisat2 command, sort the resulting SAM file using samtools and store it as a BAM file. Create the index for the BAM file as well.
-In addition to the BAM file, we produced a file containing the position of detected introns. Examine the files to find out answers to the following questions (you can do it manually by looking at the files, e.g. by <tt>less</tt> command):
 (a) How many reads were in the FASTQ file? How many of them were successfully mapped?
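Question (a) can also be answered from the command line instead of by reading the files. The sketch below rests on a few assumptions: the FASTQ file uses exactly four lines per read, the reads are single-end, and the sorted BAM file is named rnaseq.bam as in Task C. The function names are illustrative only.

```bash
# Count reads in a FASTQ file, assuming exactly four lines per record
# (header, sequence, '+' separator, qualities).
count_fastq_reads() {
    local lines
    lines=$(wc -l < "$1")
    echo $(( lines / 4 ))
}

# Count mapped reads in a BAM/SAM file (requires samtools).
# -c prints only the number of matching records;
# -F 2308 skips unmapped (4), secondary (256) and supplementary (2048)
# alignments, so each mapped single-end read is counted once.
count_mapped_reads() {
    samtools view -c -F 2308 "$1"
}
```

For example, count_fastq_reads rnaseq.fastq prints the number of input reads and count_mapped_reads rnaseq.bam prints the number of mapped ones. STAR also writes a summary of input and mapped read counts to its Log.final.out file (prefixed by the value of --outFileNamePrefix), which can be used to cross-check these numbers.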
