Analytical Methods for High Dimensional Physiological Sensors

155361-Thumbnail Image.png
Description
This dissertation proposes a new set of analytical methods for high dimensional physiological sensors. The methodologies developed in this work were motivated by problems in learning science, but also apply to numerous disciplines where high dimensional signals are present. In

This dissertation proposes a new set of analytical methods for high dimensional physiological sensors. The methodologies developed in this work were motivated by problems in learning science, but also apply to numerous disciplines where high dimensional signals are present. In the education field, more data is now available from traditional sources and there is an important need for analytical methods to translate this data into improved learning. Affecting Computing which is the study of new techniques that develop systems to recognize and model human emotions is integrating different physiological signals such as electroencephalogram (EEG) and electromyogram (EMG) to detect and model emotions which later can be used to improve these learning systems.

The first contribution proposes an event-crossover (ECO) methodology to analyze performance in learning environments. The methodology is relevant to studies where it is desired to evaluate the relationships between sentinel events in a learning environment and a physiological measurement which is provided in real time.

The second contribution introduces analytical methods to study relationships between multi-dimensional physiological signals and sentinel events in a learning environment. The methodology proposed learns physiological patterns in the form of node activations near time of events using different statistical techniques.

The third contribution addresses the challenge of performance prediction from physiological signals. Features from the sensors which could be computed early in the learning activity were developed for input to a machine learning model. The objective is to predict success or failure of the student in the learning environment early in the activity. EEG was used as the physiological signal to train a pattern recognition algorithm in order to derive meta affective states.

The last contribution introduced a methodology to predict a learner's performance using Bayes Belief Networks (BBNs). Posterior probabilities of latent nodes were used as inputs to a predictive model in real-time as evidence was accumulated in the BBN.

The methodology was applied to data streams from a video game and from a Damage Control Simulator which were used to predict and quantify performance. The proposed methods provide cognitive scientists with new tools to analyze subjects in learning environments.
Date Created
2017
Agent

Anomaly detection in categorical datasets with artificial contrasts

155102-Thumbnail Image.png
Description
Anomaly is a deviation from the normal behavior of the system and anomaly detection techniques try to identify unusual instances based on deviation from the normal data. In this work, I propose a machine-learning algorithm, referred to as Artificial Contrasts,

Anomaly is a deviation from the normal behavior of the system and anomaly detection techniques try to identify unusual instances based on deviation from the normal data. In this work, I propose a machine-learning algorithm, referred to as Artificial Contrasts, for anomaly detection in categorical data in which neither the dimension, the specific attributes involved, nor the form of the pattern is known a priori. I use RandomForest (RF) technique as an effective learner for artificial contrast. RF is a powerful algorithm that can handle relations of attributes in high dimensional data and detect anomalies while providing probability estimates for risk decisions.

I apply the model to two simulated data sets and one real data set. The model was able to detect anomalies with a very high accuracy. Finally, by comparing the proposed model with other models in the literature, I demonstrate superior performance of the proposed model.
Date Created
2016
Agent

Distinct feature learning and nonlinear variation pattern discovery using regularized autoencoders

154558-Thumbnail Image.png
Description
Feature learning and the discovery of nonlinear variation patterns in high-dimensional data is an important task in many problem domains, such as imaging, streaming data from sensors, and manufacturing. This dissertation presents several methods for learning and visualizing nonlinear variation

Feature learning and the discovery of nonlinear variation patterns in high-dimensional data is an important task in many problem domains, such as imaging, streaming data from sensors, and manufacturing. This dissertation presents several methods for learning and visualizing nonlinear variation in high-dimensional data. First, an automated method for discovering nonlinear variation patterns using deep learning autoencoders is proposed. The approach provides a functional mapping from a low-dimensional representation to the original spatially-dense data that is both interpretable and efficient with respect to preserving information. Experimental results indicate that deep learning autoencoders outperform manifold learning and principal component analysis in reproducing the original data from the learned variation sources.

A key issue in using autoencoders for nonlinear variation pattern discovery is to encourage the learning of solutions where each feature represents a unique variation source, which we define as distinct features. This problem of learning distinct features is also referred to as disentangling factors of variation in the representation learning literature. The remainder of this dissertation highlights and provides solutions for this important problem.

An alternating autoencoder training method is presented and a new measure motivated by orthogonal loadings in linear models is proposed to quantify feature distinctness in the nonlinear models. Simulated point cloud data and handwritten digit images illustrate that standard training methods for autoencoders consistently mix the true variation sources in the learned low-dimensional representation, whereas the alternating method produces solutions with more distinct patterns.

Finally, a new regularization method for learning distinct nonlinear features using autoencoders is proposed. Motivated in-part by the properties of linear solutions, a series of learning constraints are implemented via regularization penalties during stochastic gradient descent training. These include the orthogonality of tangent vectors to the manifold, the correlation between learned features, and the distributions of the learned features. This regularized learning approach yields low-dimensional representations which can be better interpreted and used to identify the true sources of variation impacting a high-dimensional feature space. Experimental results demonstrate the effectiveness of this method for nonlinear variation pattern discovery on both simulated and real data sets.
Date Created
2016
Agent

Statistical and dynamical modeling of Riemannian trajectories with application to human movement analysis

154471-Thumbnail Image.png
Description
The data explosion in the past decade is in part due to the widespread use of rich sensors that measure various physical phenomenon -- gyroscopes that measure orientation in phones and fitness devices, the Microsoft Kinect which measures depth information,

The data explosion in the past decade is in part due to the widespread use of rich sensors that measure various physical phenomenon -- gyroscopes that measure orientation in phones and fitness devices, the Microsoft Kinect which measures depth information, etc. A typical application requires inferring the underlying physical phenomenon from data, which is done using machine learning. A fundamental assumption in training models is that the data is Euclidean, i.e. the metric is the standard Euclidean distance governed by the L-2 norm. However in many cases this assumption is violated, when the data lies on non Euclidean spaces such as Riemannian manifolds. While the underlying geometry accounts for the non-linearity, accurate analysis of human activity also requires temporal information to be taken into account. Human movement has a natural interpretation as a trajectory on the underlying feature manifold, as it evolves smoothly in time. A commonly occurring theme in many emerging problems is the need to \emph{represent, compare, and manipulate} such trajectories in a manner that respects the geometric constraints. This dissertation is a comprehensive treatise on modeling Riemannian trajectories to understand and exploit their statistical and dynamical properties. Such properties allow us to formulate novel representations for Riemannian trajectories. For example, the physical constraints on human movement are rarely considered, which results in an unnecessarily large space of features, making search, classification and other applications more complicated. Exploiting statistical properties can help us understand the \emph{true} space of such trajectories. In applications such as stroke rehabilitation where there is a need to differentiate between very similar kinds of movement, dynamical properties can be much more effective. In this regard, we propose a generalization to the Lyapunov exponent to Riemannian manifolds and show its effectiveness for human activity analysis. The theory developed in this thesis naturally leads to several benefits in areas such as data mining, compression, dimensionality reduction, classification, and regression.
Date Created
2016
Agent

Optimal design of experiments for functional responses

154115-Thumbnail Image.png
Description
Functional or dynamic responses are prevalent in experiments in the fields of engineering, medicine, and the sciences, but proposals for optimal designs are still sparse for this type of response. Experiments with dynamic responses result in multiple responses taken over

Functional or dynamic responses are prevalent in experiments in the fields of engineering, medicine, and the sciences, but proposals for optimal designs are still sparse for this type of response. Experiments with dynamic responses result in multiple responses taken over a spectrum variable, so the design matrix for a dynamic response have more complicated structures. In the literature, the optimal design problem for some functional responses has been solved using genetic algorithm (GA) and approximate design methods. The goal of this dissertation is to develop fast computer algorithms for calculating exact D-optimal designs.



First, we demonstrated how the traditional exchange methods could be improved to generate a computationally efficient algorithm for finding G-optimal designs. The proposed two-stage algorithm, which is called the cCEA, uses a clustering-based approach to restrict the set of possible candidates for PEA, and then improves the G-efficiency using CEA.



The second major contribution of this dissertation is the development of fast algorithms for constructing D-optimal designs that determine the optimal sequence of stimuli in fMRI studies. The update formula for the determinant of the information matrix was improved by exploiting the sparseness of the information matrix, leading to faster computation times. The proposed algorithm outperforms genetic algorithm with respect to computational efficiency and D-efficiency.



The third contribution is a study of optimal experimental designs for more general functional response models. First, the B-spline system is proposed to be used as the non-parametric smoother of response function and an algorithm is developed to determine D-optimal sampling points of a spectrum variable. Second, we proposed a two-step algorithm for finding the optimal design for both sampling points and experimental settings. In the first step, the matrix of experimental settings is held fixed while the algorithm optimizes the determinant of the information matrix for a mixed effects model to find the optimal sampling times. In the second step, the optimal sampling times obtained from the first step is held fixed while the algorithm iterates on the information matrix to find the optimal experimental settings. The designs constructed by this approach yield superior performance over other designs found in literature.
Date Created
2015
Agent

A multi-sensor data fusion approach for real-time lane-based traffic estimation

153677-Thumbnail Image.png
Description
Modern intelligent transportation systems (ITS) make driving more efficient, easier, and safer. Knowledge of real-time traffic conditions is a critical input for operating ITS. Real-time freeway traffic state estimation approaches have been used to quantify traffic conditions given limited amount

Modern intelligent transportation systems (ITS) make driving more efficient, easier, and safer. Knowledge of real-time traffic conditions is a critical input for operating ITS. Real-time freeway traffic state estimation approaches have been used to quantify traffic conditions given limited amount of data collected by traffic sensors. Currently, almost all real-time estimation methods have been developed for estimating laterally aggregated traffic conditions in a roadway segment using link-based models which assume homogeneous conditions across multiple lanes. However, with new advances and applications of ITS, knowledge of lane-based traffic conditions is becoming important, where the traffic condition differences among lanes are recognized. In addition, most of the current real-time freeway traffic estimators consider only data from loop detectors. This dissertation develops a bi-level data fusion approach using heterogeneous multi-sensor measurements to estimate real-time lane-based freeway traffic conditions, which integrates a link-level model-based estimator and a lane-level data-driven estimator.

Macroscopic traffic flow models describe the evolution of aggregated traffic characteristics over time and space, which are required by model-based traffic estimation approaches. Since current first-order Lagrangian macroscopic traffic flow model has some unrealistic implicit assumptions (e.g., infinite acceleration), a second-order Lagrangian macroscopic traffic flow model has been developed by incorporating drivers’ anticipation and reaction delay. A multi-sensor extended Kalman filter (MEKF) algorithm has been developed to combine heterogeneous measurements from multiple sources. A MEKF-based traffic estimator, explicitly using the developed second-order traffic flow model and measurements from loop detectors as well as GPS trajectories for given fractions of vehicles, has been proposed which gives real-time link-level traffic estimates in the bi-level estimation system.

The lane-level estimation in the bi-level data fusion system uses the link-level estimates as priors and adopts a data-driven approach to obtain lane-based estimates, where now heterogeneous multi-sensor measurements are combined using parallel spatial-temporal filters.

Experimental analysis shows that the second-order model can more realistically reproduce real world traffic flow patterns (e.g., stop-and-go waves). The MEKF-based link-level estimator exhibits more accurate results than the estimator that uses only a single data source. Evaluation of the lane-level estimator demonstrates that the proposed new bi-level multi-sensor data fusion system can provide very good estimates of real-time lane-based traffic conditions.
Date Created
2015
Agent

Monitoring complex supply chains

153604-Thumbnail Image.png
Description
The complexity of supply chains (SC) has grown rapidly in recent years, resulting in an increased difficulty to evaluate and visualize performance. Consequently, analytical approaches to evaluate SC performance in near real time relative to targets and plans are important

The complexity of supply chains (SC) has grown rapidly in recent years, resulting in an increased difficulty to evaluate and visualize performance. Consequently, analytical approaches to evaluate SC performance in near real time relative to targets and plans are important to detect and react to deviations in order to prevent major disruptions.

Manufacturing anomalies, inaccurate forecasts, and other problems can lead to SC disruptions. Traditional monitoring methods are not sufficient in this respect, because com- plex SCs feature changes in manufacturing tasks (dynamic complexity) and carry a large number of stock keeping units (detail complexity). Problems are easily confounded with normal system variations.

Motivated by these real challenges faced by modern SC, new surveillance solutions are proposed to detect system deviations that could lead to disruptions in a complex SC. To address supply-side deviations, the fitness of different statistics that can be extracted from the enterprise resource planning system is evaluated. A monitoring strategy is first proposed for SCs featuring high levels of dynamic complexity. This presents an opportunity for monitoring methods to be applied in a new, rich domain of SC management. Then a monitoring strategy, called Heat Map Contrasts (HMC), which converts monitoring into a series of classification problems, is used to monitor SCs with both high levels of dynamic and detail complexities. Data from a semiconductor SC simulator are used to compare the methods with other alternatives under various failure cases, and the results illustrate the viability of our methods.

To address demand-side deviations, a new method of quantifying forecast uncer- tainties using the progression of forecast updates is presented. It is illustrated that a rich amount of information is available in rolling horizon forecasts. Two proactive indicators of future forecast errors are extracted from the forecast stream. This quantitative method re- quires no knowledge of the forecasting model itself and has shown promising results when applied to two datasets consisting of real forecast updates.
Date Created
2015
Agent

Holistic learning for multi-target and network monitoring problems

153063-Thumbnail Image.png
Description
Technological advances have enabled the generation and collection of various data from complex systems, thus, creating ample opportunity to integrate knowledge in many decision making applications. This dissertation introduces holistic learning as the integration of a comprehensive set of relationships

Technological advances have enabled the generation and collection of various data from complex systems, thus, creating ample opportunity to integrate knowledge in many decision making applications. This dissertation introduces holistic learning as the integration of a comprehensive set of relationships that are used towards the learning objective. The holistic view of the problem allows for richer learning from data and, thereby, improves decision making.

The first topic of this dissertation is the prediction of several target attributes using a common set of predictor attributes. In a holistic learning approach, the relationships between target attributes are embedded into the learning algorithm created in this dissertation. Specifically, a novel tree based ensemble that leverages the relationships between target attributes towards constructing a diverse, yet strong, model is proposed. The method is justified through its connection to existing methods and experimental evaluations on synthetic and real data.

The second topic pertains to monitoring complex systems that are modeled as networks. Such systems present a rich set of attributes and relationships for which holistic learning is important. In social networks, for example, in addition to friendship ties, various attributes concerning the users' gender, age, topic of messages, time of messages, etc. are collected. A restricted form of monitoring fails to take the relationships of multiple attributes into account, whereas the holistic view embeds such relationships in the monitoring methods. The focus is on the difficult task to detect a change that might only impact a small subset of the network and only occur in a sub-region of the high-dimensional space of the network attributes. One contribution is a monitoring algorithm based on a network statistical model. Another contribution is a transactional model that transforms the task into an expedient structure for machine learning, along with a generalizable algorithm to monitor the attributed network. A learning step in this algorithm adapts to changes that may only be local to sub-regions (with a broader potential for other learning tasks). Diagnostic tools to interpret the change are provided. This robust, generalizable, holistic monitoring method is elaborated on synthetic and real networks.
Date Created
2014
Agent

Cascading curtainmap: an interactive visualization for depicting large and flexible hierarchies

152992-Thumbnail Image.png
Description
In visualizing information hierarchies, icicle plots are efficient diagrams in that they provide the user a straightforward layout for different levels of data in a hierarchy and enable the user to compare items based on the item width. However, as

In visualizing information hierarchies, icicle plots are efficient diagrams in that they provide the user a straightforward layout for different levels of data in a hierarchy and enable the user to compare items based on the item width. However, as the size of the hierarchy grows large, the items in an icicle plot end up being small and indistinguishable. In this thesis, by maintaining the positive characteristics of traditional

icicle plots and incorporating new features such as dynamic diagram and active layer, we developed an interactive visualization that allows the user to selectively drill down or roll up to review different levels of data in a large hierarchy, to change the hierarchical

structure to detect potential patterns, and to maintain an overall understanding of the

current hierarchical structure.
Date Created
2014
Agent

Simulation-based Bayesian optimal accelerated life test design and model discrimination

152902-Thumbnail Image.png
Description
Accelerated life testing (ALT) is the process of subjecting a product to stress conditions (temperatures, voltage, pressure etc.) in excess of its normal operating levels to accelerate failures. Product failure typically results from multiple stresses acting on it simultaneously. Multi-stress

Accelerated life testing (ALT) is the process of subjecting a product to stress conditions (temperatures, voltage, pressure etc.) in excess of its normal operating levels to accelerate failures. Product failure typically results from multiple stresses acting on it simultaneously. Multi-stress factor ALTs are challenging as they increase the number of experiments due to the stress factor-level combinations resulting from the increased number of factors. Chapter 2 provides an approach for designing ALT plans with multiple stresses utilizing Latin hypercube designs that reduces the simulation cost without loss of statistical efficiency. A comparison to full grid and large-sample approximation methods illustrates the approach computational cost gain and flexibility in determining optimal stress settings with less assumptions and more intuitive unit allocations.

Implicit in the design criteria of current ALT designs is the assumption that the form of the acceleration model is correct. This is unrealistic assumption in many real-world problems. Chapter 3 provides an approach for ALT optimum design for model discrimination. We utilize the Hellinger distance measure between predictive distributions. The optimal ALT plan at three stress levels was determined and its performance was compared to good compromise plan, best traditional plan and well-known 4:2:1 compromise test plans. In the case of linear versus quadratic ALT models, the proposed method increased the test plan's ability to distinguish among competing models and provided better guidance as to which model is appropriate for the experiment.

Chapter 4 extends the approach of Chapter 3 to ALT sequential model discrimination. An initial experiment is conducted to provide maximum possible information with respect to model discrimination. The follow-on experiment is planned by leveraging the most current information to allow for Bayesian model comparison through posterior model probability ratios. Results showed that performance of plan is adversely impacted by the amount of censoring in the data, in the case of linear vs. quadratic model form at three levels of constant stress, sequential testing can improve model recovery rate by approximately 8% when data is complete, but no apparent advantage in adopting sequential testing was found in the case of right-censored data when censoring is in excess of a certain amount.
Date Created
2014
Agent