FLASH: fast length adjustment of short reads to improve genome assemblies
- PMID: 21903629
- PMCID: PMC3198573
- DOI: 10.1093/bioinformatics/btr507
Abstract
Motivation: Next-generation sequencing technologies generate very large numbers of short reads. Even with very deep genome coverage, short read lengths cause problems in de novo assemblies. The use of paired-end libraries with a fragment size shorter than twice the read length provides an opportunity to generate much longer reads by overlapping and merging read pairs before assembling a genome.
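To illustrate the idea of overlapping and merging a read pair, here is a minimal Python sketch. It is not FLASH's actual implementation: the function names, the `min_overlap` and `max_mismatch_ratio` parameters and their default values are assumptions made for this example, and the sketch ignores base quality scores. It assumes the second read is given in the reverse-complement orientation, as is usual for paired-end data, and picks the overlap with the lowest mismatch ratio.

```python
from typing import Optional


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq))


def merge_pair(read1: str, read2: str, min_overlap: int = 10,
               max_mismatch_ratio: float = 0.25) -> Optional[str]:
    """Merge a read pair into one longer sequence, or return None.

    Hypothetical helper: tries every overlap length between the end of
    read1 and the start of the reverse-complemented read2, and keeps the
    overlap with the lowest mismatch ratio (longest overlap wins ties).
    """
    r2 = reverse_complement(read2)
    best = None  # (mismatch ratio, overlap length)
    max_overlap = min(len(read1), len(r2))
    for overlap in range(max_overlap, min_overlap - 1, -1):
        tail = read1[-overlap:]
        head = r2[:overlap]
        mismatches = sum(1 for a, b in zip(tail, head) if a != b)
        ratio = mismatches / overlap
        if ratio <= max_mismatch_ratio and (best is None or ratio < best[0]):
            best = (ratio, overlap)
    if best is None:
        return None
    # Simplification: bases in the overlap are taken from read1.
    return read1 + r2[best[1]:]


# A 30 bp fragment sequenced with 20 bp reads: 2 * 20 > 30, so the
# two reads overlap by 2 * 20 - 30 = 10 bases.
fragment = "ATGCCGTAACTGGATCCATTGCGAGTCTAC"
read1 = fragment[:20]
read2 = reverse_complement(fragment[-20:])
merged = merge_pair(read1, read2)
```

In this toy case the merged sequence reconstructs the full 30 bp fragment from two 20 bp reads. A pair whose fragment is longer than twice the read length has no overlap, and the function returns `None`.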
Results: We present FLASH, a fast computational tool to extend the length of short reads by overlapping paired-end reads from fragment libraries that are sufficiently short. We tested the correctness of the tool on one million simulated read pairs, and we then applied it as a pre-processor for genome assemblies of Illumina reads from the bacterium Staphylococcus aureus and human chromosome 14. FLASH correctly extended and merged reads >99% of the time on simulated reads with an error rate of <1%. With adequately set parameters, FLASH correctly merged reads over 90% of the time even when the reads contained up to 5% errors. When FLASH was used to extend reads prior to assembly, the resulting assemblies had substantially greater N50 lengths for both contigs and scaffolds.
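For readers unfamiliar with the N50 statistic mentioned above, the following short Python sketch computes it under the usual definition: the length L such that contigs (or scaffolds) of length at least L together cover at least half of the total assembly length. The function name `n50` and the example lengths are illustrative assumptions, not data from the paper.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("need at least one length")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Stop at the first contig that brings the running sum to
        # at least half of the total assembly length.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


# Illustrative lengths only: total is 20, and 6 + 5 = 11 >= 10.
example_n50 = n50([2, 3, 4, 5, 6])  # 5
```

A larger N50 means that more of the assembly sits in long contigs or scaffolds, which is why it is commonly reported as a measure of assembly contiguity.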
Availability and implementation: The FLASH system is implemented in C and is freely available as open-source code at http://www.cbcb.umd.edu/software/flash.
Contact: t.magoc@gmail.com.