VaulteR - A pipeline for vault associated RNA detection from RNA-sequencing
Abstract
Vaults are highly conserved ribonucleoprotein complexes of unknown function. They have so far been found in high numbers among higher eukaryotes, including mammals, amphibians, and avians, as well as lower eukaryotes, including deuterostomes and the slime mold (Dictyostelium discoideum). The aim of this thesis is to design a pipeline for detecting vault-associated RNA (vtRNA) from RNA-sequencing data, and in particular to detect vtRNA in the genome of the Atlantic salmon louse, a major parasite of salmonids that affects the global aquaculture industry. The thesis presents three methods of detecting vtRNA. The first finds peaks in the alignment of reads and searches Rfam with the high-coverage sequences to check for known vtRNA. The second predicts the secondary structures of the high-coverage sequences, builds a dendrogram by hierarchical clustering on the dissimilarity matrix of those structures, and then filters the candidates using key secondary-structure features of known vtRNA. The third detects motifs, such as the A-Box and B-Box, in the candidate sequences with the MEME Suite. The result of this thesis is a pipeline that can effectively detect vtRNA, and a set of novel candidate sequences that may act as vtRNA in the salmon louse genome.
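The first method, finding peaks in the read alignment, can be illustrated with a minimal sketch. It assumes that per-base read depth for one reference sequence is already available as a list of integers; the function name `high_coverage_regions` and the parameters `min_depth` and `min_length` are hypothetical and chosen for illustration only. They are not the names or thresholds used in the pipeline.

```python
from typing import List, Tuple


def high_coverage_regions(
    coverage: List[int], min_depth: int, min_length: int = 1
) -> List[Tuple[int, int]]:
    """Return half-open (start, end) intervals whose depth is >= min_depth.

    Hypothetical helper: the thresholds are illustrative assumptions,
    not values taken from the thesis.
    """
    regions: List[Tuple[int, int]] = []
    start = None  # start of the region currently being extended, if any
    for pos, depth in enumerate(coverage):
        if depth >= min_depth:
            if start is None:
                start = pos  # open a new region
        elif start is not None:
            # depth dropped below the threshold: close the open region
            if pos - start >= min_length:
                regions.append((start, pos))
            start = None
    # close a region that runs to the end of the sequence
    if start is not None and len(coverage) - start >= min_length:
        regions.append((start, len(coverage)))
    return regions


# Toy per-base depth values (made up for the example).
toy_coverage = [0, 1, 5, 9, 8, 2, 0, 7, 7, 7]
toy_regions = high_coverage_regions(toy_coverage, min_depth=5, min_length=3)
```

The sequences under the returned intervals would then be extracted from the reference and passed on as candidates, for example to an Rfam search.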