Solving SPDEs for Multi-Dimensional Shape Analysis

Description
Statistical shape modeling is widely used to study the morphometrics of deformable objects in computer vision and biomedical studies. There are two main ways to view a shape. On one hand, the outer surface of the shape can be taken as a two-dimensional embedding in space. On the other hand, the outer surface together with its enclosed internal volume can be taken as a three-dimensional embedding of interest. Most studies take the surface-based perspective, leveraging intrinsic features on the tangent plane. But a two-dimensional model may fail to fully represent the realistic properties of shapes with both intrinsic and extrinsic properties. In this thesis, several Stochastic Partial Differential Equations (SPDEs) are thoroughly investigated, and several methods derived from these SPDEs are proposed for both two-dimensional and three-dimensional shape analysis. The physical meanings of these SPDEs inspired the features, shape descriptors, metrics, and kernels found in this series of works. First, data generation for high-dimensional shapes, here tetrahedral meshes, is introduced. The cerebral cortex is taken as the study target, and an automatic pipeline for generating the gray matter tetrahedral mesh is introduced. Then, a discretized Laplace-Beltrami operator (LBO) and a Hamiltonian operator (HO) on the tetrahedral domain are derived with the Finite Element Method (FEM). Two high-dimensional shape descriptors are defined based on the solutions of the heat equation and Schrödinger’s equation. Because high-dimensional shape models usually contain massive redundancy, and many applications demand effective landmarks, Gaussian process landmarking on tetrahedral meshes is further studied. A SIWKS-based metric space is used to define a geometry-aware Gaussian process. The study of the periodic potential diffusion process further inspired a new kernel called the geometry-aware convolutional kernel. A series of Bayesian learning methods is then introduced to tackle shape retrieval and classification. Experiments for each item are presented. From popular SPDEs such as the heat equation and Schrödinger’s equation to the general potential diffusion equation and the specific periodic potential diffusion equation, the work shows that classical SPDEs play an important role in discovering new features, metrics, shape descriptors, and kernels. I hope this thesis can serve as an example of using interdisciplinary knowledge to solve problems.
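For orientation, descriptors built from the heat equation and Schrödinger's equation are usually written in terms of the eigenpairs of the discretized operator. The abstract does not give the thesis's own tetrahedral definitions, so the block below only restates the standard heat kernel signature and wave kernel signature. The symbols are assumptions for illustration: stiffness and mass matrices A and M, eigenpairs (λ_i, φ_i), diffusion time t, log-energy e, and bandwidth σ.

```latex
% Generalized eigenproblem from an FEM discretization (stiffness A, mass M)
A \phi_i = \lambda_i M \phi_i, \qquad 0 \le \lambda_1 \le \lambda_2 \le \dots

% Heat kernel signature at point x and diffusion time t
\mathrm{HKS}(x, t) = \sum_{i} e^{-\lambda_i t}\, \phi_i(x)^2

% Wave kernel signature at point x and log-energy e (sum over \lambda_i > 0)
\mathrm{WKS}(x, e) = C_e \sum_{i} \phi_i(x)^2
  \exp\!\left(-\frac{(e - \log \lambda_i)^2}{2\sigma^2}\right),
\qquad
C_e = \left(\sum_{i} \exp\!\left(-\frac{(e - \log \lambda_i)^2}{2\sigma^2}\right)\right)^{-1}
```

Both descriptors depend on the mesh only through the spectrum, which is why deriving the discretized operator on the tetrahedral domain is the prerequisite step.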
Date Created
2021
Agent

Learning Complex Behaviors from Simple Ones: An analysis of Behavior-based Modular Design for RL Agents

Description
Traditional Reinforcement Learning (RL) learns policies from the reward available from the environment, but learning in a complex domain sometimes requires wisdom that comes from a wide range of experience. In behavior-based robotics, it is observed that a complex behavior can be described by a combination of simpler behaviors. It is tempting to apply a similar idea so that simpler behaviors can be combined in a meaningful way to tailor the complex combination. Such an approach would enable faster learning and modular design of behaviors. Complex behaviors can be combined with other behaviors to create even more advanced behaviors, resulting in a rich set of possibilities. As in RL, the combined behavior can keep evolving by interacting with the environment. The method requires a reasonable set of simple behaviors to be specified. In this research, I present an algorithm that combines behaviors so that the resulting behavior has the characteristics of each individual behavior. This approach is inspired by behavior-based robotics, such as the subsumption architecture and motor schema-based design. The combination algorithm outputs n weights to combine behaviors linearly. The weights are state dependent and change dynamically at every step in an episode. This idea is tested on discrete and continuous environments such as OpenAI’s “Lunar Lander” and “Biped Walker”. Results are compared with related domains such as multi-objective RL, hierarchical RL, transfer learning, and basic RL. It is observed that the combination of behaviors is a novel way of learning that helps the agent achieve the required characteristics. Because a combination is learned for each given state, the agent learns faster than with other similar approaches. The agent demonstrates characteristics of multiple behaviors, which helps it learn and adapt to the environment. Future directions are also suggested as possible extensions to this research.
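The linear, state-dependent combination described above can be sketched as follows. This is a minimal illustration, not the thesis's algorithm: the softmax normalization of the weights, the function names `softmax` and `combine_behaviors`, and the idea of passing a `weight_scores` function are all assumptions, since the abstract only states that n state-dependent weights combine the behaviors linearly.

```python
import math
from typing import Callable, List, Sequence

State = Sequence[float]
Action = List[float]
Behavior = Callable[[State], Action]


def softmax(scores: Sequence[float]) -> List[float]:
    """Turn raw scores into positive weights that sum to 1 (assumed normalization)."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def combine_behaviors(
    behaviors: Sequence[Behavior],
    weight_scores: Callable[[State], Sequence[float]],
    state: State,
) -> Action:
    """Return the weighted sum of each behavior's action for this state.

    `weight_scores` stands in for the learned component: it maps a state to
    one raw score per behavior, so the weights change at every step.
    """
    weights = softmax(weight_scores(state))
    actions = [behavior(state) for behavior in behaviors]
    dim = len(actions[0])
    return [
        sum(w * action[d] for w, action in zip(weights, actions))
        for d in range(dim)
    ]
```

Calling `combine_behaviors` once per environment step with the current state yields a blended action; in a learning setup, only the parameters behind `weight_scores` would need to be trained while the simple behaviors stay fixed.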
Date Created
2021
Agent

Neuro Symbolic Artificial Intelligence Pioneer to Overcome the Limits of Machine Learn

Description
With the recent boom in artificial intelligence, a flood of learning methods and information has appeared. However, the many abbreviations and much of the jargon are hard to follow without knowing the history and development trends of artificial intelligence, which creates a barrier to entry. This study predicts the future direction of development by synthesizing the concept of neuro-symbolic AI, a new direction in artificial intelligence, the history of artificial intelligence from which the concept emerged, and applied studies, and by summarizing the limitations of current research projects. It is a guide for those who want to study neuro-symbolic AI. The paper describes the history of artificial intelligence and the historical background to the emergence of neuro-symbolic AI. For the development trend, it describes the challenges neuro-symbolic AI faces, measures to overcome them, and neuro-symbolic AI as applied in various fields (knowledge-based question answering, visual question answering (VQA), image retrieval, etc.). It predicts the future development direction of neuro-symbolic artificial intelligence based on findings from previous studies.
Date Created
2021
Agent

Weakly-Supervised Visual-Retriever-Reader Pipeline for Knowledge-Based VQA Tasks

Description
Visual question answering (VQA) is the task of answering questions about a given image. It therefore involves both language and vision methods, which makes VQA a frontier interdisciplinary field. In recent years, with the great progress made on simple question tasks (e.g., object recognition), researchers have started to shift their interest to questions that require knowledge and reasoning. Knowledge-based VQA requires answering questions with external knowledge in addition to the content of images. One dataset widely used to evaluate knowledge-based VQA is OK-VQA, but it lacks a gold-standard knowledge corpus for retrieval. Existing work leverages different knowledge bases (e.g., ConceptNet and Wikipedia) to obtain external knowledge. Because the knowledge bases vary, it is hard to compare models' performance fairly. To address this issue, this paper collects a natural language knowledge base that can be used for any question answering (QA) system. Moreover, a Visual Retriever-Reader pipeline is proposed for knowledge-based VQA, where the visual retriever retrieves relevant knowledge and the visual reader predicts answers based on the given knowledge. The retriever comes in two versions: a term-based retriever that uses Best Matching 25 (BM25), and a neural retriever built on the dense passage retriever (DPR). To encode the visual information, the image and caption are encoded separately in two kinds of neural retriever: Image-DPR and Caption-DPR. There are also two styles of reader: a classification reader and an extraction reader. Both the retriever and the reader are trained with weak supervision. The experimental results show that a good retriever can significantly improve the reader's performance on the OK-VQA challenge.
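To make the term-based retriever concrete, here is a minimal sketch of Okapi BM25 scoring over a small passage collection. It is not the paper's pipeline: the function name `bm25_scores`, the whitespace tokenization, the defaults `k1=1.5` and `b=0.75`, and the non-negative IDF variant are all assumptions chosen for illustration.

```python
import math
from collections import Counter
from typing import List


def bm25_scores(query: str, docs: List[str], k1: float = 1.5, b: float = 0.75) -> List[float]:
    """Score every document in `docs` against `query` with Okapi BM25."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in term_freq:
                continue
            # IDF with +1 inside the log so it never goes negative.
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0)
            # Term frequency saturates with k1 and is normalized by document length via b.
            denom = term_freq[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * term_freq[term] * (k1 + 1) / denom
        scores.append(score)
    return scores
```

In a retriever-reader setup, the query would typically be the question (possibly extended with an image caption), and the top-scoring passages would be handed to the reader.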
Date Created
2021
Agent

Safe and Robust Cooperative Algorithm for Connected Autonomous Vehicles

Description
Autonomous Vehicles (AVs) have the potential to significantly transform transportation. AVs are expected to make transportation safer by avoiding accidents caused by human error. When AVs become connected, they can exchange information with the infrastructure or with other Connected Autonomous Vehicles (CAVs) to plan their future motion efficiently and thereby increase road throughput and reduce energy consumption. Cooperative algorithms for CAVs will not be deployed in real life unless they are proven to be safe, robust, and resilient to different failure models. Since intersections are crucial areas where most accidents happen, this dissertation first focuses on making existing intersection management algorithms safe and resilient against network and computation time, bounded model mismatches and external disturbances, and the presence of a rogue vehicle. Then, a generic algorithm for conflict resolution and cooperation of CAVs is proposed that ensures the safety of vehicles even when other vehicles suddenly change their plans. The proposed approach can also detect deadlock situations among CAVs and resolve them through a negotiation process. A testbed consisting of 1/10th-scale model CAVs is built to evaluate the proposed algorithms. In addition, a simulator is developed to perform tests at a large scale. Results from the experiments indicate the robustness and resilience of the proposed approaches.
Date Created
2021
Agent

Computer Vision: Improving Detection and Tracking for Occluded and Blurry Settings

Description
Computer vision and tracking have become an area of great interest for many reasons, including self-driving cars, identification of vehicles and drivers on roads, and security camera monitoring, all of which are expanding in the modern digital era. When working with practical systems that are constrained in multiple ways, such as video quality or viewing angle, algorithms that work well in theory can have a high error rate in practice. This thesis studies several ways in which that error can be minimized. It describes an application in a practical system: detecting, tracking, and counting people entering different lanes at an airport security checkpoint, using CCTV videos as the primary source. The thesis improves an existing algorithm that is not optimized for this particular problem and has a high error rate when its counts are compared with the true volume of users. The high error rate is caused by many people crowding into security lanes at the same time. The camera that captured the footage is located at a poor angle, so many of the people occlude each other and cause the existing algorithm to miss people. One solution is to count only heads: since a head is smaller than a full body, heads occlude each other less, and since the camera is angled from above, the heads in back appear higher and are not occluded by people in front. One of the primary improvements to the algorithm is to combine person detections and head detections to improve accuracy. The proposed algorithm also improves the accuracy of the detections themselves. The existing algorithm used the COCO training dataset, which works well in scenarios where people are visible and not occluded. However, the video quality available in this project was poor, with people often blocking each other from the camera’s view. Thus, a different training set was needed that could detect people even in poor-quality frames and with occlusion. The new training set is the first algorithmic improvement and, although it occasionally performed worse, it reduced the error by 7.25% on average.
Date Created
2021
Agent

Modeling Human Adaptation with Game-theoretic Intention Decoding in Human-Robot Interactions

Description
With the substantial development of intelligent robots, human-robot interaction (HRI) has become ubiquitous in applications such as collaborative manufacturing, surgical robotic operations, and autonomous driving. In all these applications, a human behavior model that can predict human actions is a useful reference that helps robots interact intelligently with humans. This requirement raises the essential problem of how to properly model human behavior, especially when individuals are interacting or cooperating with each other. The major objective of this thesis is to use human intention decoding to help robots improve their performance while interacting with humans. Preliminary work on integrating human intention estimation into an HRI scenario is shown to demonstrate the benefit. To achieve this goal, the research is divided into three phases. First, a novel online measure of the human's reliance on the robot, which can be estimated from human actions through the intention decoding process, is described. An experiment that required human participants to complete an object-moving task with a robot manipulator was conducted under different distraction conditions. A relationship between human intention and trust is discovered when participants performed a familiar task with no distraction. This finding suggests a relationship between the psychological construct of trust and joint physical coordination, which bridges the human's actions and mental states. Second, a novel human collaborative dynamic model based on game theory and bounded rationality is introduced to describe human dyadic behavior. The mutual intention decoding process was also considered to inform this model. The model indicates the connection between the mental states of the individuals and their cooperative actions. A haptic interface is developed with a virtual environment, and experiments are conducted with 30 human subjects. The results suggest the existence of mutual intention decoding during human dyadic cooperative behaviors. Last, the empirical results show that allowing agents to have empathy in inference, which lets the agents understand that others might have a false understanding of their intentions, can help achieve correct intention inference. It has been verified that knowledge about vehicle dynamics is also important for correctly inferring intentions. A new courteous policy is proposed that bounds the courteous motion using its inferred set of equilibrium motions. A simulation that reproduces an intersection-passing case between an autonomous car and a human-driven car is conducted to demonstrate the benefit of the novel courteous control policy.
Date Created
2021
Agent

Language Conditioned Self-Driving Cars Using Environmental Object Descriptions For Controlling Cars

Description
Self-driving cars are a long-standing ambition for many AI scientists and engineers. In the last decade alone, many self-driving cars, such as those from Google Waymo, Tesla Autopilot, and Uber, have been roaming the streets of many cities. In this rapidly expanding field, researchers all over the world are attempting to develop safer and more efficient AI agents that can navigate through our cities. However, driving is a very complex task to master even for a human, let alone for a robot. It requires attention and inputs from the surroundings of the car, and it is nearly impossible to program all the possible factors affecting this complex task. As a solution, imitation learning was introduced, wherein agents learn a policy that maps observations to actions through demonstrations given by humans. Through imitation learning, one can teach self-driving cars the expected behavior in many scenarios. Despite the autonomous nature of these cars, humans undeniably play a vital role in the development and execution of safe and trustworthy self-driving cars and hence form the strongest link in this application of Human-Robot Interaction. Several approaches have been taken to incorporate this link between humans and self-driving cars, one of which involves communicating a human's navigational instructions to the car. The communication channel gives humans control over the agent’s decisions as well as the ability to guide it in real time. This work explores the ability of imitation learning to create a self-driving agent that can follow natural language instructions given by humans based on descriptions of environmental objects. The proposed model architecture can handle latent temporal context in these instructions, making the agent capable of taking multiple decisions along its course. The work shows promising results that push the boundaries of natural language instructions and their complexity in navigating self-driving cars through towns.
Date Created
2021
Agent

Asymmetric Error Control for Classification in Medical Disease Diagnosis

Description
In classification applications such as medical disease diagnosis, the cost of one type of error (false negative) can greatly outweigh the cost of the other (false positive), creating the need for asymmetric error control. Because of this, traditional machine learning techniques, even with much improved accuracy, may not be ideal, as they do not provide a way to keep the false negatives below a certain threshold. To address this need, a classification algorithm that provides asymmetric error control is proposed. The theoretical foundation for this algorithm is the Neyman-Pearson (NP) Lemma, complemented with sample splitting and order statistics to pick a threshold that enables an upper bound on the number of false negatives. Additionally, this classifier addresses the imbalance of the data, which is common in medical datasets, by using Hellinger distance as the splitting criterion. This eliminates the need for sampling methods, which add complexity and require parameter selection. This approach is used to create a novel tree-based classifier that enables asymmetric error control. Applications such as prediction of the severity of cardiac arrhythmia require classification over multiple classes. The NP oracle inequalities for binary classes are not immediately applicable to multiclass NP classification, so this dissertation proposes a multi-step procedure to extend the algorithm to multiple classes. This classifier is used to predict various forms of cardiac disease in both binary and multi-class classification problems, with not only comparable accuracy metrics but also full control over the number of false negatives. Moreover, this research allows the threshold for the classifier to be picked in a data-adaptive way. This dissertation also shows that this methodology can be extended to non-medical applications that require classification with asymmetric error control.
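The threshold selection by sample splitting and order statistics can be illustrated with a short sketch. This is not the dissertation's procedure: it is a generic order-statistic rule in the spirit of Neyman-Pearson classification, adapted here to bound the false negative rate. The names `violation_prob` and `np_threshold`, the target rate `alpha`, and the tolerance `delta` are assumptions. The scores are assumed to come from a held-out split of positive cases that was not used to train the scoring model, and a case is predicted positive when its score is at least the threshold.

```python
import math
from typing import List, Sequence


def violation_prob(k: int, n: int, alpha: float) -> float:
    """Upper bound on P(false negative rate > alpha) when thresholding at the
    k-th smallest of n held-out positive scores: P(Binomial(n, alpha) <= k - 1)."""
    return sum(
        math.comb(n, j) * alpha**j * (1 - alpha) ** (n - j)
        for j in range(k)
    )


def np_threshold(pos_scores: Sequence[float], alpha: float, delta: float) -> float:
    """Pick the largest order statistic whose violation probability is <= delta."""
    scores: List[float] = sorted(pos_scores)
    n = len(scores)
    if violation_prob(1, n, alpha) > delta:
        raise ValueError("too few held-out positives for this alpha and delta")
    k = 1
    # A larger k gives a higher threshold (fewer false positives) but a
    # larger chance of violating the false negative bound.
    while k < n and violation_prob(k + 1, n, alpha) <= delta:
        k += 1
    return scores[k - 1]
```

The rule is data adaptive in the sense that the threshold is read off the held-out scores themselves; with too few held-out positives no order statistic can satisfy the bound, which the sketch reports as an error rather than guessing.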
Date Created
2021
Agent

Physically Realizable Targeted Adversarial Attacks on Autonomous Driving

Description
Autonomous Driving (AD) systems are being actively researched and developed to solve the task of controlling vehicles safely without human intervention. One approach to this task is deep Reinforcement Learning (RL). In deep RL, the main objective is to find an optimal control behavior, often called a policy, performed by an agent, which is the AD system in this case. This policy is usually learned through Deep Neural Networks (DNNs) based on the observations that the agent perceives along with reward feedback received from the environment. However, recent studies have demonstrated the vulnerability of control policies learned through deep RL to adversarial attacks. This raises concerns about applying such policies to risk-sensitive tasks like AD. Previous adversarial attacks assume that the threats can be broadly realized in two ways: targeted attacks through manipulation of the agent’s complete observation in real time, and untargeted attacks through manipulation of objects in the environment. The former assumes full access to the agent’s observations at almost all times, while the latter has no control over the outcome of the attack. This research investigates the feasibility of targeted attacks through physical adversarial objects in the environment, a threat that combines effectiveness and practicality. Through simulations on a popular AD system, it is demonstrated that a fixed optimal policy can be made to malfunction over time by an attacker, e.g., performing an unintended self-parking, when an adversarial object is present. The proposed approach is formulated so that the attacker can learn the dynamics of the environment and also uses common knowledge of the agent’s dynamics to realize the attack. Further, several experiments are conducted to show empirically the effectiveness of the proposed attack in different driving scenarios. Lastly, this work also studies the robustness of object location, and the trade-off between attack strength and attack length, based on proposed evaluation metrics.
Date Created
2021
Agent