Description
Background
Improvements in sequencing technology now allow easy acquisition of large datasets; however, analyzing these data for phylogenetics can be challenging. We have developed a novel method to rapidly obtain homologous genomic data for phylogenetics directly from next-generation sequencing reads without the use of a reference genome. This software, called SISRS, avoids the time-consuming steps of de novo whole genome assembly, multiple genome alignment, and annotation.
Results
In simulations, SISRS is able to identify large numbers of loci containing variable sites with phylogenetic signal. For genomic data from apes, SISRS identified thousands of variable sites, from which we produced an accurate phylogeny. Finally, we used SISRS to identify phylogenetic markers that we used to estimate the phylogeny of placental mammals. We recovered eight phylogenies that resolved the basal relationships among mammals using datasets with different levels of missing data. The three alternative resolutions of the basal relationships are consistent with the major hypotheses for the relationships among mammals, all of which have been supported previously by different molecular datasets.
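As a rough illustration of what selecting variable sites under different levels of missing data can look like, the sketch below filters the columns of a small alignment. It is not SISRS's implementation: the function name `informative_sites`, the `max_missing` parameter, the parsimony-informative criterion, and the toy sequences are all assumptions made for this example.

```python
from collections import Counter

# Characters treated as "no base called" (an assumption for this sketch).
MISSING = {"N", "-", "?"}


def informative_sites(alignment, max_missing=0):
    """Return 0-based column indices that are parsimony-informative and
    have at most `max_missing` taxa without a called base.

    `alignment` maps taxon name -> aligned sequence (equal lengths).
    Hypothetical helper, not part of SISRS.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same length")
    n_sites = lengths.pop()

    kept = []
    for i in range(n_sites):
        column = [seq[i].upper() for seq in alignment.values()]
        called = [base for base in column if base not in MISSING]
        # Skip columns where too many taxa lack data.
        if len(column) - len(called) > max_missing:
            continue
        counts = Counter(called)
        # Informative: at least two alleles, each seen in two or more taxa.
        if sum(1 for c in counts.values() if c >= 2) >= 2:
            kept.append(i)
    return kept


# Invented toy data; the sequences are not real ape genomes.
toy = {
    "human": "ACGTA",
    "chimp": "ACGTA",
    "gorilla": "ATGCA",
    "orangutan": "ATGCA",
    "gibbon": "ATGNA",
}

strict = informative_sites(toy, max_missing=0)
relaxed = informative_sites(toy, max_missing=1)
print(strict, relaxed)
```

With the toy data, allowing one taxon to be missing retains an additional site compared with requiring complete data, which mirrors the trade-off between dataset size and completeness.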
Conclusions
SISRS has the potential to transform phylogenetic research. This method eliminates the need for expensive marker development in many studies by using whole genome shotgun sequence data directly. SISRS is open source and freely available at https://github.com/rachelss/SISRS/releases.

    Details

    Title
    • A composite genome approach to identify phylogenetically informative data from next-generation sequencing
    Date Created
    2015-06-11
    Resource Type
    • Text
    Identifier
    • Digital object identifier: 10.1186/s12859-015-0632-y
    • International standard serial number: 1471-2105
    Note
    • The electronic version of this article is the complete one and can be found online at: http://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-015-0632-y

    Citation and reuse

    Cite this item

    This is a suggested citation. Consult the appropriate style guide for specific citation guidelines.

    Schwartz, R. S., Harkins, K. M., Stone, A. C., & Cartwright, R. A. (2015). A composite genome approach to identify phylogenetically informative data from next-generation sequencing. BMC Bioinformatics, 16(1). doi:10.1186/s12859-015-0632-y
