A Crowdsourcing Approach to Developing and Assessing Prediction Algorithms for AML Prognosis

Description

Acute Myeloid Leukemia (AML) is a fatal hematological cancer. The genetic abnormalities underlying AML are extremely heterogeneous among patients, making prognosis and treatment selection very difficult. While clinical proteomics data has the potential to improve prognosis accuracy, thus far, the quantitative means to do so have yet to be developed. Here we report the results and insights gained from the DREAM 9 Acute Myeloid Leukemia Outcome Prediction Challenge (AML-OPC), a crowdsourcing effort designed to promote the development of quantitative methods for AML prognosis prediction. We identify the most accurate and robust models in predicting patient response to therapy, remission duration, and overall survival. We further investigate patient response to therapy, a clinically actionable prediction, and find that patients who are classified as resistant to therapy are harder to predict than responsive patients across the 31 models submitted to the challenge. The top two performing models, which held a high sensitivity to these patients, substantially utilized the proteomics data to make predictions. Using these models, we also identify which signaling proteins were useful in predicting patient therapeutic response.

Date Created
2016-06-28

Machine Learning to Predict Rapid Progression of Carotid Atherosclerosis in Patients With Impaired Glucose Tolerance

Description

Objectives: Prediabetes is a major epidemic and is associated with adverse cardio-cerebrovascular outcomes. Early identification of patients who will develop rapid progression of atherosclerosis could be beneficial for improved risk stratification. In this paper, we investigate important factors impacting the prediction, using several machine learning methods, of rapid progression of carotid intima-media thickness in impaired glucose tolerance (IGT) participants.

Methods: In the Actos Now for Prevention of Diabetes (ACT NOW) study, 382 participants with IGT underwent carotid intima-media thickness (CIMT) ultrasound evaluation at baseline and at 15–18 months, and were divided into rapid progressors (RP, n = 39, 58 ± 17.5 μm change) and non-rapid progressors (NRP, n = 343, 5.8 ± 20 μm change, p < 0.001 versus RP). To deal with complex multi-modal data consisting of demographic, clinical, and laboratory variables, we propose a general data-driven framework to investigate the ACT NOW dataset. In particular, we first employed a Fisher Score-based feature selection method to identify the most effective variables and then proposed a probabilistic Bayes-based learning method for the prediction. Comparison of the methods and factors was conducted using area under the receiver operating characteristic curve (AUC) analyses and Brier score.
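As an illustration of Fisher Score-based feature ranking, the following is a minimal sketch using the common textbook definition (between-class scatter of a feature divided by its within-class scatter). The abstract does not state which variant the authors used, so the formula, the function name `fisher_scores`, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import fmean, pvariance


def fisher_scores(rows, labels):
    """Return one Fisher score per feature column (higher = more discriminative).

    Assumed definition (not taken from the abstract):
        score_j = sum_k n_k * (mean_kj - mean_j)^2 / sum_k n_k * var_kj
    where k runs over classes and j over features.
    """
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)

    scores = []
    for j in range(len(rows[0])):
        overall_mean = fmean(row[j] for row in rows)
        between = 0.0
        within = 0.0
        for members in by_class.values():
            values = [m[j] for m in members]
            between += len(values) * (fmean(values) - overall_mean) ** 2
            within += len(values) * pvariance(values)
        if within > 0:
            scores.append(between / within)
        elif between > 0:
            scores.append(float("inf"))  # perfectly separated, zero spread
        else:
            scores.append(0.0)  # constant feature carries no information
    return scores


# Made-up toy data: feature 0 separates the two classes, feature 1 does not.
toy_rows = [[1.0, 5.0], [2.0, 1.0], [9.0, 5.0], [10.0, 1.0]]
toy_labels = [0, 0, 1, 1]
toy_scores = fisher_scores(toy_rows, toy_labels)
```

Features would then be sorted by score and the top-ranked ones passed to the classifier; how many to keep is a tuning choice the abstract does not specify.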

Results: The experimental results show that the proposed learning methods performed well in identifying or predicting RP. Among the methods, the performance of Naïve Bayes was the best (AUC 0.797, Brier score 0.085) compared to multilayer perceptron (0.729, 0.086) and random forest (0.642, 0.10). The results also show that feature selection has a significant positive impact on the data prediction performance.
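For readers unfamiliar with the two metrics reported here, the sketch below computes them from predicted probabilities using their standard definitions. The function names and the example numbers are hypothetical and are not drawn from the ACT NOW data.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between predicted probability and 0/1 outcome.

    Lower is better; 0 means perfect, confident predictions.
    """
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def auc(probs, outcomes):
    """Area under the ROC curve via pairwise comparison.

    Equals the fraction of (positive, negative) pairs in which the positive
    case receives the higher predicted probability; ties count as half.
    """
    pos = [p for p, y in zip(probs, outcomes) if y == 1]
    neg = [p for p, y in zip(probs, outcomes) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up predictions for four participants (1 = rapid progressor).
example_probs = [0.9, 0.8, 0.3, 0.1]
example_outcomes = [1, 0, 1, 0]
example_auc = auc(example_probs, example_outcomes)
example_brier = brier_score(example_probs, example_outcomes)
```

The two metrics answer different questions: AUC measures how well the model ranks rapid progressors above non-rapid progressors, while the Brier score also penalizes poorly calibrated probabilities.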

Conclusions: By dealing with multi-modal data, the proposed learning methods show effectiveness in predicting prediabetics at risk for rapid atherosclerosis progression. The proposed framework demonstrated utility in outcome prediction in a typical multidimensional clinical dataset with a relatively small number of subjects, extending the potential utility of machine learning approaches beyond extremely large-scale datasets.

Date Created
2016-09-05

Reflections of the Social Environment in Chimpanzee Memory: Applying Rational Analysis Beyond Humans

Description

In cognitive science, the rational analysis framework allows modelling of how physical and social environments impose information-processing demands onto cognitive systems. In humans, for example, past social contact among individuals predicts their future contact with linear and power functions. These features of the human environment constrain the optimal way to remember information and probably shape how memory records are retained and retrieved. We offer a primer on how biologists can apply rational analysis to study animal behaviour. Using chimpanzees (Pan troglodytes) as a case study, we modelled 19 years of observational data on their social contact patterns. Much like humans, the frequency of past encounters in chimpanzees linearly predicted future encounters, and the recency of past encounters predicted future encounters with a power function. Consistent with the rational analyses carried out for human memory, these findings suggest that chimpanzee memory performance should reflect those environmental regularities. In re-analysing existing chimpanzee memory data, we found that chimpanzee memory patterns mirrored their social contact patterns. Our findings hint that human and chimpanzee memory systems may have evolved to solve similar information-processing problems. Overall, rational analysis offers novel theoretical and methodological avenues for the comparative study of cognition.
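To make the "power function" relationship concrete, the sketch below fits y = a * x^b by ordinary least squares on log-transformed values, which is one common way to estimate such a curve. The abstract does not describe the authors' fitting procedure, so this method, the function name `fit_power_law`, and the synthetic data are assumptions for illustration only.

```python
import math


def fit_power_law(xs, ys):
    """Fit y = a * x**b by linear regression of log(y) on log(x).

    Requires strictly positive xs and ys. Returns (a, b).
    """
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((u - mean_x) ** 2 for u in log_x)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))
    b = sxy / sxx                      # slope in log-log space = exponent
    a = math.exp(mean_y - b * mean_x)  # intercept in log-log space = log(a)
    return a, b


# Synthetic example: probability of a future encounter decaying with the
# number of days since the last encounter (values generated, not observed).
days_since_contact = [1, 2, 4, 8, 16]
encounter_prob = [0.5 * d ** -1.5 for d in days_since_contact]
a_hat, b_hat = fit_power_law(days_since_contact, encounter_prob)
```

A negative exponent corresponds to the recency effect described above: the longer since the last encounter, the less likely a future one, with the drop steepest at short delays.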

Date Created
2016-08-03

A Spatial Control for Correct Timing of Gene Expression During the Escherichia coli Cell Cycle

Description

Temporal transcriptions of genes are achieved by different mechanisms such as dynamic interaction of activator and repressor proteins with promoters, and accumulation and/or degradation of key regulators as a function of cell cycle. We find that the TorR protein localizes to the old poles of the Escherichia coli cells, forming a functional focus. The TorR focus co-localizes with the nucleoid in a cell-cycle-dependent manner, and consequently regulates transcription of a number of genes. Formation of one TorR focus at the old poles of cells requires interaction with the MreB and DnaK proteins, and ATP, suggesting that TorR delivery requires cytoskeleton organization and ATP. Further, absence of the protein–protein interactions and ATP leads to loss in function of TorR as a transcription factor. We propose a mechanism for timing of cell-cycle-dependent gene transcription, where a transcription factor interacts with its target genes during a specific period of the cell cycle by limiting its own spatial distribution.

Date Created
2016-12-23

Genetic Candidate Variants in Two Multigenerational Families with Childhood Apraxia of Speech

Description

Childhood apraxia of speech (CAS) is a severe and socially debilitating form of speech sound disorder with suspected genetic involvement, but the genetic etiology is not yet well understood. Very few known or putative causal genes have been identified to date, e.g., FOXP2 and BCL11A. Building a knowledge base of the genetic etiology of CAS will make it possible to identify infants at genetic risk and motivate the development of effective very early intervention programs. We investigated the genetic etiology of CAS in two large multigenerational families with familial CAS. Complementary genomic methods included Markov chain Monte Carlo linkage analysis, copy-number analysis, identity-by-descent sharing, and exome sequencing with variant filtering. No overlaps in regions with positive evidence of linkage between the two families were found. In one family, linkage analysis detected two chromosomal regions of interest, 5p15.1-p14.1, and 17p13.1-q11.1, inherited separately from the two founders. Single-point linkage analysis of selected variants identified CDH18 as a primary gene of interest and additionally, MYO10, NIPBL, GLP2R, NCOR1, FLCN, SMCR8, NEK8, and ANKRD12, possibly with additive effects. Linkage analysis in the second family detected five regions with LOD scores approaching the highest values possible in the family. A gene of interest was C4orf21 (ZGRF1) on 4q25-q28.2. Evidence for previously described causal copy-number variations and validated or suspected genes was not found. Results are consistent with a heterogeneous CAS etiology, as is expected in many neurogenic disorders. Future studies will investigate genome variants in these and other families with CAS.

Date Created
2016-04-27

Geomorphology and Structural Geology of Saturnalia Fossae and Adjacent Structures in the Northern Hemisphere of Vesta

Description

Vesta is a unique, intermediate class of rocky body in the Solar System, between terrestrial planets and small asteroids, because of its size (average radius of ∼263 km) and differentiation, with a crust, mantle and core. Vesta’s low surface gravity (0.25 m/s²) has led to the continual absence of a protective atmosphere and consequently impact cratering and impact-related processes are prevalent. Previous work has shown that the formation of the Rheasilvia impact basin induced the equatorial Divalia Fossae, whereas the formation of the Veneneia impact basin induced the northern Saturnalia Fossae. Expanding upon this earlier work, we conducted photogeologic mapping of the Saturnalia Fossae, adjacent structures and geomorphic units in two of Vesta’s northern quadrangles: Caparronia and Domitia. Our work indicates that impact processes created and/or modified all mapped structures and geomorphic units. The mapped units, ordered from oldest to youngest age based mainly on cross-cutting relationships, are: (1) Vestalia Terra unit, (2) cratered highlands unit, (3) Saturnalia Fossae trough unit, (4) Saturnalia Fossae cratered unit, (5) undifferentiated ejecta unit, (6) dark lobate unit, (7) dark crater ray unit and (8) lobate crater unit. The Saturnalia Fossae consist of five separate structures: Saturnalia Fossa A is the largest (maximum width of ∼43 km) and is interpreted as a graben, whereas Saturnalia Fossa B-E are smaller (maximum width of ∼15 km) and are interpreted as half grabens formed by synthetic faults. Smaller, second-order structures (maximum width of <1 km) are distinguished from the Saturnalia Fossae, a first-order structure, by the use of the general descriptive term ‘adjacent structures’, which encompasses minor ridges, grooves and crater chains. For classification purposes, the general descriptive term ‘minor ridges’ characterizes ridges that are not part of the Saturnalia Fossae and are an order of magnitude smaller (maximum width of <1 km vs. maximum width of ∼43 km). Shear deformation resulting from the large-scale (diameter of >100 km) Rheasilvia impact is proposed to form minor ridges (∼2 km to ∼25 km in length), which are interpreted as the surface expression of thrust faults, as well as grooves (∼3 km to ∼25 km in length) and pit crater chains (∼1 km to ∼25 km in length), which are interpreted as the surface expression of extension fractures and/or dilational normal faults. Secondary crater material, ejected from small-scale and medium-scale impacts (diameters of <100 km), is interpreted to form ejecta ray systems of grooves and crater chains by bouncing and scouring across the surface. Furthermore, seismic shaking, also resulting from small-scale and medium-scale impacts, is interpreted to form minor ridges because seismic shaking induces flow of regolith, which subsequently accumulates as minor ridges that are roughly parallel to the regional slope. In this work we expand upon the link between impact processes and structural features on Vesta by presenting findings of a photogeologic, structural mapping study which highlights how impact cratering and impact-related processes are expressed on this unique, intermediate Solar System body.

Date Created
2014-01-29