De novo assembly of circular genomes using Geneious R7. (#226)
Circular chromosomes and genomes, such as those of viruses, bacteria, mitochondria and plasmids, are common in nature. However, most de novo assemblers do not account for circularity and produce linear sequences with an arbitrarily defined start and end. This can result in repeated sections of sequence at the arbitrary start and end points, and an artificial drop in coverage in these regions, which can affect downstream analyses. The Geneious de novo assembler is, to our knowledge, the only de novo assembler that produces circular contigs during the assembly process. Its algorithm uses an overlap-based approach to merge sequence reads and contigs, where at each step the most similar contigs or sequences are merged. The circularize option allows similar sequences and contigs to circularize both during and at the end of the assembly process. In this study we test the Geneious de novo assembler on an Ion Torrent mitochondrial dataset from Panthera leo persica (Asiatic lion) and an Illumina whole genome shotgun dataset from Pan troglodytes (chimpanzee), and compare the results with those obtained from the popular freely available de novo assemblers Velvet, MIRA and SPAdes. With both test datasets the Geneious de novo assembler returned a single circular contig representing the mitochondrial genome, a significant improvement on the assemblies produced by the other assemblers. The circular contigs mapped with high concordance to the published genome and enabled the resolution of difficult-to-assemble repetitive regions.
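The repeated sequence that a linear assembler leaves at the arbitrary start and end of a circular genome can be illustrated with a minimal Python sketch. It is not the Geneious algorithm: it assumes the two ends match exactly, whereas a real assembler must tolerate sequencing errors, and the names `terminal_overlap`, `circularize` and `min_overlap` are hypothetical and chosen only for this example.

```python
from typing import Tuple


def terminal_overlap(contig: str, min_overlap: int = 20) -> int:
    """Length of the longest suffix of `contig` that exactly equals its prefix.

    Only overlaps of at least `min_overlap` bases and at most half the contig
    length are considered. Returns 0 when no such overlap exists.
    """
    max_len = len(contig) // 2
    # Try the longest candidate first so the largest repeated end is found.
    for length in range(max_len, min_overlap - 1, -1):
        if contig[:length] == contig[-length:]:
            return length
    return 0


def circularize(contig: str, min_overlap: int = 20) -> Tuple[str, bool]:
    """Trim the duplicated end of a linear contig if its ends overlap.

    Returns (sequence, is_circular). When the ends overlap, the repeated
    suffix is removed so each base of the circle appears exactly once.
    """
    overlap = terminal_overlap(contig, min_overlap)
    if overlap == 0:
        return contig, False
    return contig[:-overlap], True


if __name__ == "__main__":
    # Toy 40 bp "genome"; the linear contig repeats its first 10 bases at the end.
    genome = "GGGGACTTCAGATCCATAGCTTACGATCAACTGTACCTGA"
    linear_contig = genome + genome[:10]
    sequence, is_circular = circularize(linear_contig, min_overlap=8)
    print(len(sequence), is_circular)
```

Trimming after the fact, as here, only removes the duplicated end. It does not recover the reads that failed to align across the arbitrary break point, which is why circularizing during assembly, as described above, can matter for coverage.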