Distributed databases, such as Log-Structured Merge-Tree Key-Value Stores (LSM-KVS), are widely used in modern infrastructure. One of the primary challenges in these databases is ensuring consistency, meaning that all nodes have the same view of data at any given time. However, maintaining consistency requires a trade-off: the stronger the consistency, the more resources are needed to replicate data across replicas, which decreases database performance. Addressing this trade-off poses two challenges: first, developing and managing multiple consistency levels within a single system, and second, assigning consistency levels to effectively balance the consistency-performance trade-off. This thesis introduces Self-configuring Consistency In Distributed LSM-KVS (SCID), a service that leverages unique properties of LSM-KVS to manage consistency levels and automates level assignment with machine learning (ML). To address the first challenge, SCID combines dynamic read-only instances and logical KV-based partitions to enable on-demand updates of read-only instances and to facilitate the logical separation of groups of key-value pairs. SCID uses logical partitions as consistency levels and on-demand updates in dynamic read-only instances to allow for multiple consistency levels. To address the second challenge, the thesis presents an ML-based solution, SCID-ML, to manage the consistency-performance trade-off more effectively. We evaluate SCID and find that it improves write throughput by up to 50% and achieves 62% accuracy for consistency-level predictions.
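The partition-to-level mapping described above can be conveyed in a few lines. This is a hypothetical sketch, not SCID's actual code: the class, the hash-bucket partitioning, and the refresh-on-strong-read policy are all assumptions made for the example.

```python
import zlib

STRONG, EVENTUAL = "strong", "eventual"

class PartitionedStore:
    def __init__(self):
        self.primary = {}          # authoritative copy of the data
        self.read_replica = {}     # read-only instance, updated on demand
        self.partition_level = {}  # logical partition id -> consistency level

    def partition_of(self, key):
        # logical KV-based partition: here, a simple hash bucket over the key
        return zlib.crc32(key.encode()) % 4

    def put(self, key, value):
        # writes land on the primary; the replica is deliberately left stale
        self.primary[key] = value

    def refresh(self, key):
        # on-demand update of the read-only instance for a single key
        self.read_replica[key] = self.primary[key]

    def get(self, key):
        level = self.partition_level.get(self.partition_of(key), EVENTUAL)
        if level == STRONG:
            self.refresh(key)  # pay the synchronization cost only when required
        return self.read_replica.get(key)
```

Keys in partitions left at the default eventual level read whatever the stale replica holds; only partitions marked strong pay the refresh cost, which is the performance side of the trade-off.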
Date Created
The date the item was originally created (prior to any relationship with the ASU Digital Repositories).
Component-based models are commonly employed to simulate discrete dynamical systems. These models lend themselves to formalizing the structures of systems at multiple levels of granularity. Visual development of component-based models serves to simplify the iterative and incremental model specification activities. The Parallel Discrete Event System Specification (DEVS) formalism offers a flexible yet rigorous approach for decomposing a whole model into its components or, alternatively, composing a whole model from components. While different concepts, frameworks, and tools offer a variety of visual modeling capabilities, most have limitations, such as the inability to visualize multiple model hierarchies at any level with arbitrary depths. Ideally, the visual and persistent layout of any number of hierarchy levels of models can be maintained and navigated seamlessly. Persistent storage is another capability needed for the modeling, simulating, verifying, and validating lifecycle. These are important features for improving the demanding task of creating and changing modular, hierarchical simulation models. This thesis proposes a new approach and develops a tool for the visual development of models. The tool supports storing and reconstructing graphical models using a NoSQL database. It offers unique capabilities important for developing the increasingly larger and more complex models essential for analyzing, designing, and building Digital Twins.
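As a rough illustration of the storing-and-reconstructing idea, a hierarchical component model can be serialized as one nested document, the shape a document-oriented NoSQL database would hold. The component names and schema below are invented for the sketch, not taken from the thesis or its tool.

```python
import json

def make_component(name, children=None):
    # a coupled model is a component with children; an atomic model has none
    return {"name": name, "children": children or []}

# a two-level hierarchy: a coupled model containing two atomic components
root = make_component("Processor", [
    make_component("Queue"),
    make_component("Server"),
])

def store(doc):
    # stand-in for a NoSQL insert: the whole hierarchy persists as one document
    return json.dumps(doc)

def reconstruct(blob):
    # rebuild the in-memory model hierarchy from the stored document
    return json.loads(blob)
```

Keeping the whole hierarchy in one document makes round-tripping a model at any depth a single read, at the cost of rewriting the document when a deeply nested component changes.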
As people begin to live longer and the population shifts to having more older adults on Earth than young children, radical solutions will be needed to ease the
burden on society. It will be essential to develop technology that can age with the
individual. One solution is to keep older adults in their homes longer through smart
home and smart living technology, allowing them to age in place. People have many
choices when choosing where to age in place, including their own homes, assisted
living facilities, nursing homes, or with family members. No matter where people choose to
age, they may face isolation and financial hardships. It is crucial to keep finances in
mind when developing Smart Home technology.
Smart home technologies seek to allow individuals to stay inside their homes for
as long as possible, yet little work looks at how we can use technology in different
life stages. Robots are poised to impact society and ease burdens at home and in the
workforce. Special attention has been given to social robots to ease isolation. As
social robots become accepted into society, researchers need to understand how these
robots should mimic natural conversation. My work attempts to answer this question
within social robotics by investigating how to make conversational robots natural and
reciprocal.
I investigated this through a 2x2 Wizard of Oz between-subjects user study. The
study lasted four months, testing four different levels of interactivity with the robot.
None of the levels were significantly different from the others, an unexpected result. I
then investigated the robot’s personality, the participant’s trust, and the participant’s
acceptance of the robot and how that influenced the study.
Molecular Dynamics (MD) simulations are ubiquitous throughout the physical sciences; they are critical in understanding how particle structures evolve over time
given a particular energy function. A software package called ParSplice introduced a
new method to generate these simulations in parallel that has significantly extended
their length. Typically, simulations are short discrete Markov chains, only capturing
a few microseconds of a particle’s behavior and containing tens of thousands of
transitions between states; in contrast, a typical ParSplice simulation can be as long
as a few milliseconds, containing tens of millions of transitions. Naturally, sifting
through data of this size is impossible by hand, and there are a number of visualization
systems that provide comprehensive and intuitive analyses of particle structures
throughout MD simulations. However, no visual analytics systems have been built
that can manage the simulations that ParSplice produces. To analyze these large
data-sets, I built a visual analytics system that provides multiple coordinated views
that simultaneously describe the data temporally, within its structural context, and
based on its properties. The system provides fluid and powerful user interactions
regardless of the size of the data, allowing the user to drill down into the data-set to
get detailed insights, as well as run and save various calculations, most notably the
Nudged Elastic Band method. The system also allows the comparison of multiple
trajectories, revealing more information about the general behavior of particles at different temperatures, energy states, etc.
Data integration involves the reconciliation of data from diverse data sources in order to obtain a unified data repository, upon which an end user such as a data analyst can run analytics sessions to explore the data and obtain useful insights. Supervised Machine Learning (ML) for data integration tasks such as ontology (schema) or entity (instance) matching requires several training examples in the form of manually curated, pre-labeled matching and non-matching schema concept or entity pairs, which are hard to obtain. Along similar lines, an analytics system without predictive capabilities about the impending workload can incur huge querying latencies, while leaving the onus of understanding the underlying database schema and writing a meaningful query at every step of a data exploration session on the user. In this dissertation, I will describe the human-in-the-loop Machine Learning (ML) systems that I have built for data integration and predictive analytics. I alleviate the need for extensive prior labeling by utilizing active learning (AL) for data integration. In each AL iteration, I detect the unlabeled entity or schema concept pairs that would strengthen the ML classifier and selectively query the human oracle for such labels in a budgeted fashion. Thus, I make use of human assistance for ML-based data integration. On the other hand, when the human is an end user exploring data through Online Analytical Processing (OLAP) queries, my goal is to proactively assist the human by predicting the top-K next queries that s/he is likely to be interested in. I will describe my proposed SQL predictor, a Business Intelligence (BI) query predictor, and a geospatial query cardinality estimator, with an emphasis on schema abstraction, query representation, and how I adapt the ML models for these tasks. For each system, I will discuss the evaluation metrics and how the proposed systems compare to state-of-the-art baselines on multiple datasets and query workloads.
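The budgeted AL iteration described above reduces, at its core, to uncertainty sampling. The sketch below is illustrative, assuming a classifier that exposes a match probability per pair; the names and the toy pool are invented, not the dissertation's implementation.

```python
def uncertainty(prob_match):
    # 1.0 when the classifier is maximally unsure (p = 0.5), 0.0 when certain
    return 1 - abs(prob_match - 0.5) * 2

def select_for_labeling(pairs, predict_proba, budget):
    # pick the entity pairs the current classifier is least sure about,
    # up to the labeling budget for this iteration
    scored = sorted(pairs, key=lambda p: uncertainty(predict_proba(p)), reverse=True)
    return scored[:budget]

# toy usage: pretend the classifier's match probability is attached to each pair
pool = [("a1", "b1", 0.95), ("a2", "b2", 0.52), ("a3", "b3", 0.10), ("a4", "b4", 0.48)]
picked = select_for_labeling(pool, predict_proba=lambda p: p[2], budget=2)
```

The pairs near p = 0.5 go to the human oracle; confidently matched or non-matched pairs are left unlabeled, which is how the budget is spent where it strengthens the classifier most.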
The drone industry is worth nearly 50 billion dollars in the public sector, and drone flight anomalies can cost up to 12 million dollars per drone. The project's objective is to explore various machine-learning techniques to identify anomalies in drone flight and to express these anomalies effectively through relevant visualizations. The research goal is to detect anomalies in drone flight data and determine their severity levels. The solution combines statistical models with visualization; the contributions are the visualizations, the identified patterns, the models, and the interface.
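As a hedged illustration of the statistical side, one of the simplest anomaly flags over a telemetry series is a z-score threshold; the function below is a generic stand-in, not the project's actual model.

```python
def zscore_anomalies(series, threshold=3.0):
    # flag indices whose values lie more than `threshold` standard
    # deviations from the series mean (e.g. a sudden altitude spike)
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series) / n
    std = var ** 0.5
    return [i for i, x in enumerate(series) if std and abs(x - mean) / std > threshold]
```

Flagged indices could then feed the severity ranking and the visualizations the project describes.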
The impact of Artificial Intelligence (AI) has increased significantly in daily life. AI is taking big strides into areas of life that are critical, such as healthcare, but also into areas such as entertainment and leisure. Deep neural
networks have been pivotal in making all these advancements possible. However, a well-known problem with deep neural networks is the lack of explanations for the choices
they make. To combat this, several methods have been explored in the research community.
One example of this is assigning rankings to the individual features and how influential
they are in the decision-making process. In contrast, a newer class of methods focuses
on Concept Activation Vectors (CAVs), which extract higher-level concepts
from the trained model to capture more information as a mixture of several features
and not just one. The goal of this thesis is to employ concepts in a novel domain: to
explain how a deep learning model uses computer vision to classify music into different
genres. Due to the advances in the field of computer vision with deep learning for
classification tasks, it is now standard practice to convert an audio clip into
corresponding spectrograms and use those spectrograms as image inputs to the deep
learning model. Thus, a pre-trained model can classify the spectrogram images
(representing songs) into musical genres. The proposed explanation system called
“Why Pop?” tries to answer certain questions about the classification process such as
what parts of the spectrogram influence the model the most, what concepts were
extracted, and how they differ across classes. These explanations help the
user gain insights into the model’s learnings, biases, and the decision-making process.
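The audio-to-spectrogram step the abstract describes can be sketched directly. The window size, hop, and naive per-frame DFT below are illustrative choices (real pipelines use an FFT library); nothing here is taken from the thesis's implementation.

```python
import cmath, math

def spectrogram(audio, n_fft=64, hop=32):
    # slide a window over the clip; each windowed frame becomes one column
    # of the time-frequency image the vision model will classify
    spec = []
    for start in range(0, len(audio) - n_fft + 1, hop):
        frame = [audio[start + n] * (0.5 - 0.5 * math.cos(2 * math.pi * n / n_fft))
                 for n in range(n_fft)]  # Hann window tapers frame edges
        row = []
        for k in range(n_fft // 2 + 1):  # magnitude of each frequency bin
            row.append(abs(sum(x * cmath.exp(-2j * math.pi * k * n / n_fft)
                               for n, x in enumerate(frame))))
        spec.append(row)
    return spec  # list of frames, each a list of frequency-bin magnitudes
```

Rendered as a grayscale image (time on one axis, frequency on the other), this is the input a pre-trained image classifier can consume.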
This thesis serves as an experimental investigation into the potential of machine learning through attempting to predict the future price of a cryptocurrency. Through the use of web scraping, short-interval data was collected on both Bitcoin and Dogecoin. Dogecoin was the dataset eventually used in this thesis due to its relative stability compared to Bitcoin; at the time of data collection, Bitcoin had become a much more frequent topic in the media and fluctuated more significantly as a result. The data was processed into three separate, consistent timesteps and used to generate predictive models. The models were able to accurately predict test data given all the preceding test data, but were unable to autoregressively predict future data given only the first set of test data points. Ultimately, this project helps illustrate the complexities of extended future price prediction when using simple models like linear regression.
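The gap the abstract reports, accurate one-step-ahead prediction versus failed autoregressive rollout, can be made concrete with a tiny lag-one linear model. The ordinary-least-squares fit below is a generic stand-in for the thesis's models, run on synthetic data.

```python
def fit_ar1(series):
    # least-squares fit of next = a * current + b over consecutive pairs
    xs, ys = series[:-1], series[1:]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

def one_step(series, a, b):
    # each prediction sees the real previous value (the easy setting)
    return [a * x + b for x in series[:-1]]

def rollout(start, steps, a, b):
    # each prediction sees only the previous *prediction* (the hard setting),
    # so any model error compounds at every step
    preds = [start]
    for _ in range(steps):
        preds.append(a * preds[-1] + b)
    return preds[1:]
```

On real price data the fitted model is imperfect, so the rollout drifts away from the truth even when one-step predictions look accurate, which is exactly the behavior the thesis observed.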
Oftentimes, patients struggle to accurately describe their symptoms to medical professionals, which produces erroneous diagnoses, delaying or preventing treatment. My app, Augnosis, will streamline constructive communication between patient and doctor and allow for more accurate diagnoses. The goal of this project was to create an app capable of gathering data on visual symptoms of facial acne and categorizing it to differentiate between diagnoses using image recognition and identification. “Augnosis” is a combination of the words “Augmented Reality” and “Self-Diagnosis”: the former is the medium in which the app is immersed, and the latter details its functionality.
Working memory plays an important role in human activities across academic, professional, and social settings. Working memory is defined as the memory extensively
involved in goal-directed behaviors in which information must be retained and
manipulated to ensure successful task execution. The aim of this research is to understand
the effect of image captioning with image description on an individual's
working memory. A study was conducted with eight neutral images comprising situations
relatable to daily life such that each image could have a positive or negative
description associated with the outcome of the situation in the image. The study
consisted of three rounds where the first and second round involved two parts and
the third round consisted of one part. The image was captioned a total of five times
across the entire study. The findings highlighted that only 25% of participants were
able to recall the captions they had written for an image after a span of 9-15
days; when comparing the recall rate of the captions, 50% of participants were able
to recall the image caption from the previous round in the present round; and out of
the positive and negative description associated with the image, 65% of participants
recalled the former description rather than the latter. The conclusions drawn from the
study are that participants tend to retain information for longer periods than the expected
duration of working memory, which may be because they were able to relate
the images to their everyday life situations, and that, given a situation with positive and
negative information, the human brain is inclined toward positive information over
negative information.