An Analysis of the Benchmark Test lzbench for Open-Source Compressors

Description
With the rising data output and falling costs of Next Generation Sequencing technologies, research into data compression is crucial to maintaining storage efficiency and controlling costs. High-throughput sequencers such as the HiSeqX Ten can produce up to 1.8 terabases of data per run, and such large storage demands are even more important to consider for institutions that rely on their own servers rather than large data centers (cloud storage) [1]. Compression algorithms aim to reduce the space taken up by large genomic datasets by encoding the most frequently occurring symbols with the shortest bit codewords and by reordering the data to make it easier to encode. Depending on the probability distribution of the symbols in the dataset or the structure of the data, choosing the wrong algorithm could result in a compressed file larger than the original, or in a poorly compressed file that wastes time and space [2]. To test efficiency among compression algorithms for each file type, 37 open-source compression algorithms were used to compress six types of genomic datasets (FASTA, VCF, BCF, GFF, GTF, and SAM) and were evaluated on compression speed, decompression speed, compression ratio, and file size using the benchmark test lzbench. Compressors that outperformed the popular bioinformatics compressor Gzip (zlib -6) were evaluated against one another by ratio and speed for each file type and across the geometric means of all file types. Compressors that exhibited fast compression and decompression speeds were also evaluated by transmission time through variable-speed internet pipes in scenarios where the file was compressed only once or compressed multiple times.
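The trade-off between compression speed, ratio, and transmission time can be illustrated with a minimal Python sketch. This is not the thesis's code: the function names, the cost model (compression time plus transfer time plus decompression time, with speeds measured against the uncompressed size), and every number below are illustrative assumptions.

```python
from math import prod


def geometric_mean(values):
    """Geometric mean of positive numbers, e.g. ratios across file types."""
    values = list(values)
    if not values or any(v <= 0 for v in values):
        raise ValueError("values must be non-empty and positive")
    return prod(values) ** (1.0 / len(values))


def end_to_end_seconds(size_mb, ratio, comp_mb_s, decomp_mb_s, link_mb_s,
                       include_compression=True):
    """Assumed cost model: compress + send compressed bytes + decompress.

    ratio is original size / compressed size; speeds are in MB/s of
    uncompressed data. Pass include_compression=False when the file was
    already compressed once and is only being sent again.
    """
    compress = size_mb / comp_mb_s if include_compression else 0.0
    transfer = (size_mb / ratio) / link_mb_s
    decompress = size_mb / decomp_mb_s
    return compress + transfer + decompress


# Hypothetical numbers, not measurements from the thesis.
raw_seconds = 1000 / 10  # 1000 MB sent uncompressed over a 10 MB/s link
first_send = end_to_end_seconds(1000, 4.0, 100, 500, 10)
resend = end_to_end_seconds(1000, 4.0, 100, 500, 10,
                            include_compression=False)
```

Under this assumed model, a slow compressor with a high ratio can still win on a slow link, while a fast compressor with a modest ratio tends to win when the link is fast or the file must be recompressed for every transfer.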
Date Created
2017-05
Agent

Building Diverse Resources for Exploratory School of Life Sciences Students

Description
This creative thesis project aimed to create career development resources that School of Life Sciences majors could use to enhance their college experience, expand the breadth of relevant career options for School of Life Sciences majors, and confront and divert career problems through the implementation of these career development resources. Students encounter career problems when their intention and action diverge. These career problems may cause a student to stop their pursuit of a given career, change majors, or even stop schooling completely. It is the objective of this project to help resolve these career problems by introducing a career development resource flyer that educates the student about a given career, provides coursework to guide the student towards this career path, familiarizes the student with the extracurricular efforts necessary for this position, proposes valuable resources that the student can use to learn more about the career, and offers a question and answer portion for further career and professional understanding. To create these career development resource flyers, a variety of professionals, both with and without relationships with Arizona State University, were contacted and interviewed. The answers gathered from these interviews were then used to create the career flyers. The project was successful in creating five distinct career development resource flyers, as well as a blank template with instructions to be used in the future by the School of Life Sciences. The career development resource flyers will be used by the School of Life Sciences advising staff for future exploratory majors, but are not limited to these students. The aspiration is to create an expansive reservoir of these resources for future generations of students to access, in the hope that they will be better able to find a career path that they are passionate about and be better prepared to attain it.
Date Created
2017-05
Agent

The Sonoran Desert Tortoise (Gopherus morafkai) and Insights into Conservation Biology and Policy from the Mohave Desert Tortoise (Gopherus agassizii)

Description
A literature review summarizing the current status of conservation efforts for the Mojave Desert tortoise (Gopherus agassizii), including a brief overview of the Endangered Species Act (ESA) and its applicability to this species' conservation. A genetic and physiological comparison of the morphologically similar Mojave species with the Sonoran (Gopherus morafkai) species is paired with an analysis of whether and how the ESA should apply to the Sonoran population. An analysis of current plans and interagency cooperation is followed by a multi-step proposal on how best to conserve the Sonoran population of Desert tortoise.
Date Created
2015-05
Agent

Genie: A Population Genetics Simulation Built with JavaScript

Description
The modern web presents an opportunity for educators and researchers to create tools that are highly accessible. Because of the near-ubiquity of modern web browsers, developers who hope to create educational and analytical tools can reach a large audience by creating web applications. Using JavaScript, HTML, and other modern web development technologies, Genie was developed as a simulator to help educators in biology, genetics, and evolution classrooms teach their students about population genetics. Because Genie was designed for the modern web, it is highly accessible to both educators and students, who can access the web application using any modern web browser on virtually any device. Genie demonstrates the efficacy of web development technologies for demonstrating and simulating complex processes, and it will be a unique educational tool for educators who teach population genetics.
Date Created
2015-05
Agent

Evolutionary perspective suggests candidate genes for variation in Turner Syndrome phenotype

Description
Tremendous phenotypic variation exists across people with Turner syndrome (45,X). This variation likely stems from differential dosage of genes on the X chromosome. X-inactivation (XCI) is the process whereby all X chromosomes in excess of one are silenced. However, about 15% of the genes on the silenced X chromosome escape this inactivation and are candidates for affecting phenotype in people with Turner syndrome. In this study we take an evolutionary approach to rank candidate genes that may contribute to phenotypic variation among people with Turner syndrome. We incorporate analysis of patterns of DNA methylation from 46,XX and 45,X individuals, and estimates of variable X-inactivation status across 46,XX individuals, with patterns of gene expression conservation on the X chromosomes across five tissues and ten species. We find that genes that escape XCI are possible candidate genes for Turner syndrome phenotype, indicated by the constant levels of expression in escape genes and inactivated genes. Variation in these genes is expected to affect phenotype when dosage is altered from typical levels.
Date Created
2015-12
Agent

Analyzing the Spread of Chikungunya in the Caribbean 2013-2015

Description
This work examines one dimension of the effect that complex human transport systems have on the spread of Chikungunya Virus (CHIKV) in the Caribbean from 2013 to 2015. CHIKV is transmitted by mosquitoes, and its novel spread through the Caribbean islands provided a chance to examine disease transmission through complex human transportation systems. Previous work by Cauchemez et al. had shown that a simple distance-based model successfully predicted CHIKV spread in the Caribbean using Markov chain Monte Carlo (MCMC) statistical methods. An MCMC simulation is used to evaluate different transportation methods (air travel, cruise ships, and local maritime traffic) for the primary transmission patterns through linear regression. Other metrics were also included: population density, to account for island size variation, and dengue fever incidence rates, as a proxy for vector control and health spending. Air travel and cruise travel data were gathered from monthly passenger arrivals by island. Local maritime traffic is approximated with a gravity model proxy incorporating GDP per capita and distance, and historic dengue rates were used to determine existing vector control measures for the islands. Because the Caribbean represents the largest cruise passenger market in the world, cruise ship arrivals were expected to show the strongest signal; however, the gravity model representing local traffic was the best predictor of infection routes. The early infected islands (<30 days) showed a heavy trend towards an alternate primary transmission, but our consensus model was able to predict the time until initial infection reporting with 94.5% accuracy for islands 30 days post initial reporting. This result can assist public health entities in enacting measures to mitigate future epidemics and provide a modelling basis for determining transmission modes in future CHIKV outbreaks.
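The gravity model proxy mentioned above can be sketched in Python. The abstract does not give the functional form, so this sketch assumes the common textbook version, in which the link weight grows with the product of the two islands' GDP per capita and falls with a power of the distance between them. The function names, the exponent of 2, and the island labels and values are hypothetical.

```python
def gravity_weight(gdp_pc_i, gdp_pc_j, distance_km, exponent=2.0, scale=1.0):
    """Assumed gravity form: scale * gdp_i * gdp_j / distance ** exponent."""
    if distance_km <= 0:
        raise ValueError("distance_km must be positive")
    return scale * gdp_pc_i * gdp_pc_j / distance_km ** exponent


def strongest_links(target, gdp_pc, distance_km):
    """Rank the other islands by their gravity weight to the target island."""
    weights = {
        other: gravity_weight(gdp_pc[target], gdp_pc[other],
                              distance_km[frozenset((target, other))])
        for other in gdp_pc
        if other != target
    }
    return sorted(weights, key=weights.get, reverse=True)


# Hypothetical islands and values, for illustration only.
gdp_pc = {"A": 10.0, "B": 20.0, "C": 5.0}
distance_km = {
    frozenset(("A", "B")): 100.0,
    frozenset(("A", "C")): 20.0,
    frozenset(("B", "C")): 50.0,
}
ranking_for_a = strongest_links("A", gdp_pc, distance_km)
```

In this toy example the nearby but poorer island C outranks the richer but distant island B as a link to A, which is the kind of distance-dominated local traffic pattern a gravity proxy is meant to capture.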
Date Created
2015-12
Agent

Variable Autosomal and X Divergence Near and Far From Genes Affects Estimates of Male Mutation Bias in Great Apes

Description

Male mutation bias, when more mutations are passed on via the male germline than via the female germline, is observed across mammals. One common way to infer the magnitude of male mutation bias, α, is to compare levels of neutral sequence divergence between genomic regions that spend different amounts of time in the male and female germline. For great apes, including human, we show that estimates of divergence are reduced in putatively unconstrained regions near genes relative to unconstrained regions far from genes. Divergence increases with increasing distance from genes on both the X chromosome and autosomes, but increases faster on the X chromosome than autosomes. As a result, ratios of X/A divergence increase with increasing distance from genes and corresponding estimates of male mutation bias are significantly higher in intergenic regions near genes versus far from genes. Future studies in other species will need to carefully consider the effect that genomic location will have on estimates of male mutation bias.
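How an X/A divergence ratio translates into an estimate of α can be sketched in Python. The abstract does not state the estimator used, so this sketch assumes the classic relation of Miyata et al., in which the X chromosome spends two thirds of its time in the female germline and one third in the male germline, while autosomes spend half in each, giving X/A = (2/3)(2 + α)/(1 + α). The function names and example ratios are illustrative.

```python
def xa_ratio_from_alpha(alpha):
    """Expected X/A divergence ratio for a given male-to-female mutation ratio."""
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    return (2.0 / 3.0) * (2.0 + alpha) / (1.0 + alpha)


def alpha_from_xa_ratio(ratio):
    """Invert the relation above: alpha = (4 - 3R) / (3R - 2).

    Only defined for 2/3 < R <= 4/3; R near 2/3 implies a very large alpha,
    so small changes in R there produce large changes in the estimate.
    """
    if not (2.0 / 3.0 < ratio <= 4.0 / 3.0):
        raise ValueError("ratio must lie in (2/3, 4/3]")
    return (4.0 - 3.0 * ratio) / (3.0 * ratio - 2.0)


# Hypothetical ratios, not values from the study.
alpha_low_ratio = alpha_from_xa_ratio(0.80)
alpha_high_ratio = alpha_from_xa_ratio(0.90)
```

Under this assumed relation α falls as X/A rises, and the estimate is highly sensitive to small shifts in the ratio, which is why the genomic regions chosen for measuring divergence matter.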

Date Created
2016-11-09
Agent

Evolution of Dosage Compensation in Anolis Carolinensis, a Reptile With XX/XY Chromosomal Sex Determination

Description

In species with highly heteromorphic sex chromosomes, the degradation of one of the sex chromosomes will result in unequal gene expression between the sexes (e.g. between XX females and XY males) and between the sex chromosomes and the autosomes. Dosage compensation is a process whereby genes on the sex chromosomes achieve equal gene expression. We compared genome-wide levels of transcription between males and females, and between the X chromosome and the autosomes in the green anole, Anolis carolinensis. We present evidence for dosage compensation between the sexes, and between the sex chromosomes and the autosomes. When dividing the X chromosome into regions based on linkage groups, we discovered that genes in the first reported X-linked region, anole linkage group b (LGb), exhibit complete dosage compensation, although the rest of the X-linked genes exhibit incomplete dosage compensation. Our data further suggest that the mechanism of this dosage compensation is upregulation of the X chromosome in males. We report that approximately 10% of coding genes, most of which are on the autosomes, are differentially expressed between males and females. In addition, genes on the X chromosome exhibited higher ratios of nonsynonymous to synonymous substitution than autosomal genes, consistent with the fast-X effect. Our results from the green anole add an additional observation of dosage compensation in a species with XX/XY sex determination.

Date Created
2016-11-09
Agent

Fruitful Analysis of Sex Chromosomes Reveals X-Treme Genetic Diversity

Description

A new study on sex chromosome evolution in papaya helps to illuminate sex chromosome biology, including deviations from expected trajectories.

Date Created
2016-11-29
Agent

A Recent Bottleneck of Y Chromosome Diversity Coincides With a Global Change in Culture

Description

It is commonly thought that human genetic diversity in non-African populations was shaped primarily by an out-of-Africa dispersal 50–100 thousand yr ago (kya). Here, we present a study of 456 geographically diverse high-coverage Y chromosome sequences, including 299 newly reported samples. Applying ancient DNA calibration, we date the Y-chromosomal most recent common ancestor (MRCA) in Africa at 254 (95% CI 192–307) kya and detect a cluster of major non-African founder haplogroups in a narrow time interval at 47–52 kya, consistent with a rapid initial colonization model of Eurasia and Oceania after the out-of-Africa bottleneck. In contrast to demographic reconstructions based on mtDNA, we infer a second strong bottleneck in Y-chromosome lineages dating to the last 10 ky. We hypothesize that this bottleneck is caused by cultural changes affecting variance of reproductive success among males.

Date Created
2015-04-01
Agent