Joanna Moreton
A consensus approach to vertebrate de novo transcriptome assembly from RNA-seq data: assembly of the duck (Anas platyrhynchos) transcriptome
Moreton, Joanna; Dunham, Stephen P.; Emes, Richard D.
Abstract
For vertebrate organisms where a reference genome is not available, de novo transcriptome assembly offers cost-effective insight into tissue-specific or differentially expressed genes and into variation in the coding part of the genome. However, because a number of different tools and parameters can be used to reconstruct transcripts, it is difficult to determine an optimal method. Here we suggest a pipeline based on (1) assessing the performance of three different assembly tools, (2) using both single and multiple k-mer (MK) approaches, (3) examining the influence of the number of reads used in the assembly, and (4) merging assemblies from different tools. We use an example dataset from the vertebrate Anas platyrhynchos domestica (Pekin duck). We find that taking a subset of data enables a robust assembly to be produced by multiple methods without the need for very high memory capacity. The use of reads mapped back to transcripts (RMBT) and CEGMA (Core Eukaryotic Genes Mapping Approach) provides useful metrics for determining the completeness of the assembly obtained. For this dataset, the use of MK generated a more complete assembly, as measured by a greater number of RMBT and a higher CEGMA score. Merged single k-mer assemblies are generally smaller but consist of longer transcripts, suggesting an assembly with fewer fragmented transcripts. We suggest that the use of a subset of reads during assembly allows the relatively rapid investigation of assembly characteristics and can guide the user to the most appropriate transcriptome for a particular downstream use. Transcriptomes generated by the compared assembly methods and the final merged assembly are freely available for download at http://dx.doi.org/10.6084/m9.figshare.1032613. © 2014 Moreton, Dunham and Emes.
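As an illustration of two ideas in the abstract, taking a subset of reads and scoring an assembly by reads mapped back to transcripts (RMBT), the following is a minimal Python sketch. It is not the authors' pipeline: the function names (`parse_fastq`, `subsample_reads`, `rmbt_percent`), the use of reservoir sampling, and the definition of RMBT as a simple percentage of mapped reads are assumptions made here for illustration only.

```python
import random
from typing import Iterable, Iterator, List, Tuple

# One FASTQ record: header, sequence, separator line, quality string.
FastqRecord = Tuple[str, str, str, str]


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield 4-line FASTQ records from an iterable of text lines."""
    buffer: List[str] = []
    for line in lines:
        buffer.append(line.rstrip("\n"))
        if len(buffer) == 4:
            yield (buffer[0], buffer[1], buffer[2], buffer[3])
            buffer = []


def subsample_reads(
    records: Iterable[FastqRecord], n: int, seed: int = 0
) -> List[FastqRecord]:
    """Draw n records uniformly at random in one pass (reservoir sampling).

    Hypothetical helper: the paper does not state how its subsets were drawn.
    """
    rng = random.Random(seed)
    reservoir: List[FastqRecord] = []
    for i, record in enumerate(records):
        if i < n:
            reservoir.append(record)
        else:
            j = rng.randint(0, i)  # both bounds inclusive
            if j < n:
                reservoir[j] = record
    return reservoir


def rmbt_percent(mapped_reads: int, total_reads: int) -> float:
    """Percentage of reads that map back to the assembled transcripts.

    Assumes the mapped-read count comes from an aligner run separately.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * mapped_reads / total_reads
```

Reservoir sampling is chosen here because it needs only one pass over the file and memory proportional to the subset size, which fits the abstract's point about avoiding very high memory requirements. For paired-end data, the same selection would need to be applied to both read files so that mates stay together.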
Citation
Moreton, J., Dunham, S. P., & Emes, R. D. (2014). A consensus approach to vertebrate de novo transcriptome assembly from RNA-seq data: assembly of the duck (Anas platyrhynchos) transcriptome. Frontiers in Genetics, 5, Article 190. https://doi.org/10.3389/fgene.2014.00190
Field | Value |
---|---|
Journal Article Type | Article |
Acceptance Date | Jun 9, 2014 |
Online Publication Date | Jun 25, 2014 |
Publication Date | Jun 25, 2014 |
Deposit Date | Jun 21, 2016 |
Publicly Available Date | Jun 21, 2016 |
Journal | Frontiers in Genetics |
Electronic ISSN | 1664-8021 |
Publisher | Frontiers Media |
Peer Reviewed | Peer Reviewed |
Volume | 5 |
Article Number | 190 |
DOI | https://doi.org/10.3389/fgene.2014.00190 |
Keywords | RNA-seq, de novo transcriptome, assembly, Illumina, high-throughput sequencing |
Public URL | https://nottingham-repository.worktribe.com/output/730121 |
Publisher URL | http://journal.frontiersin.org/article/10.3389/fgene.2014.00190/full |
Additional Information | This document is protected by copyright and was first published by Frontiers. All rights reserved. It is reproduced with permission. |
Contract Date | Jun 21, 2016 |
Files
fgene-05-00190.pdf (712 Kb, PDF)
Copyright Statement
Copyright information regarding this work can be found at the following address: http://creativecommons.org/licenses/by/4.0