RNAnexus: A framework for RNA data mining and visualization — ASN Events

RNAnexus: A framework for RNA data mining and visualization (#6)

Hardip Patel¹, Aaron Chuah¹, Trevor Lamb¹, Gavin Huttley¹
  1. The Australian National University, Acton, ACT, Australia

RNAseq has become the technology of choice for profiling the transcriptome in all organisms. It allows far more accurate measurement of RNA levels and reveals the complexity of RNA content in a given biological system. More importantly, it offers easier access to RNA nucleotide sequences, which can be used in lieu of genome sequences to study patterns of nucleotide evolution. There are ~42,000 raw RNAseq datasets from vertebrates deposited in the NCBI Sequence Read Archive (SRA), almost half of them from non-model organisms. However, these data are not readily accessible, because assembling, annotating and visualizing RNAseq data requires sophisticated analysis. We have used the large compute facility at the National Computational Infrastructure (NCI) to assemble and annotate the vast amount of RNAseq data available for non-model organisms. We have also constructed gene trees to annotate the RNA and to understand species-specific expansion of gene families in vertebrates. In parallel, we have developed a web-accessible database for visualizing the RNAseq data. I will present my work on the development of this database and discuss the opportunities and challenges for a systematic effort to collate such datasets.