Challenge 2: Strand specificity and read length
Hi,
For the RNA-seq data in Challenge 2, was the sample preparation protocol strand specific or not?
That is, can we trust the strand information when trying to infer exon sequences, and is it relevant when posting the results?
I also noticed that the read length stated in the challenge description is 100 nt, but in the *.fastq files it is 101.
Best,
FF
Comments
Strand specificity and read lengths
Hi FF
I am not sure about the strand specificity or the read length issue. We will let you know as soon as we find out.
One way you can determine it yourself is to align the reads to the human reference genome. If about as many reads map to the coding strand as to the opposite strand, then the sample prep is not strand specific. Let us know if you figure it out.
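As a rough illustration of that check, here is a minimal Python sketch. It assumes you have already aligned the reads and, for each read that overlaps an annotated gene, extracted the strand the read mapped to and the strand of the gene. The function names, the (read_strand, gene_strand) input format, and the 0.8 cutoff are all hypothetical choices for this example, not part of the challenge. For paired-end data, tally only one mate (for example the first read of each pair), since the two mates align to opposite strands.

```python
from collections import Counter


def strand_tally(pairs):
    """Count reads on the same vs. the opposite strand of the annotated gene.

    `pairs` is an iterable of (read_strand, gene_strand) tuples,
    each strand being "+" or "-". (Hypothetical input format.)
    """
    counts = Counter()
    for read_strand, gene_strand in pairs:
        key = "same" if read_strand == gene_strand else "opposite"
        counts[key] += 1
    return counts["same"], counts["opposite"]


def looks_strand_specific(same, opposite, min_skew=0.8):
    """Return True if one orientation clearly dominates.

    A roughly 50/50 split suggests an unstranded library. The 0.8
    cutoff is an arbitrary assumption; adjust it to your data.
    """
    total = same + opposite
    if total == 0:
        raise ValueError("no reads to tally")
    return max(same, opposite) / total >= min_skew


# Toy data: 95 reads antisense to the gene, 5 reads sense.
toy = [("+", "-")] * 95 + [("+", "+")] * 5
same, opposite = strand_tally(toy)
print(same, opposite, looks_strand_specific(same, opposite))  # 5 95 True
```

Which orientation dominates (same or opposite) depends on the library protocol and on which mate you count, so the sketch only tests for a strong skew in either direction.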
Gustavo
Strand specificity and read lengths
Here are the answers:
1) The sample preparation protocol used is strand specific. You can trust the strand information when trying to infer exon sequences. More specifically: not only do the two reads in a paired-end read come from the same strand, but there is also a step in the protocol that lets us trace back which strand the RNA was transcribed from in the first place. Since we digest the second strand of the cDNA, we can tell which strand the RNA originated from.
2) The read length is actually 101 bp. The challenge description inaccurately calls it "100".
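If you want to confirm the read length in your own copy of the files, a small sketch like the following tallies the sequence lengths in a FASTQ file. It assumes plain four-line FASTQ records (sequences not wrapped across lines); the function name and the toy input are made up for illustration.

```python
import io
from collections import Counter


def read_length_histogram(handle):
    """Tally sequence lengths from an open FASTQ text handle.

    Assumes each record is exactly four lines (header, sequence,
    '+', qualities), so the sequence is every line with index 1 mod 4.
    """
    lengths = Counter()
    for i, line in enumerate(handle):
        if i % 4 == 1:
            lengths[len(line.rstrip("\r\n"))] += 1
    return lengths


# Toy input with one 5 nt read and one 4 nt read.
toy_fastq = io.StringIO("@r1\nACGTA\n+\nIIIII\n@r2\nACGT\n+\nIIII\n")
toy_lengths = read_length_histogram(toy_fastq)
print(dict(toy_lengths))  # {5: 1, 4: 1}
```

For a real file, pass an open handle instead of the toy string, for example `open("reads.fastq")` (the file name here is a placeholder).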
Thanks for your questions.
Gustavo