Towards Fine-Grained Control of Visual Data in Mobile Systems

Description

With the rapid development of both hardware and software, mobile devices, with their advantages in mobility, interactivity, and privacy, have enabled various applications, including social networking, mixed reality, entertainment, and authentication. In diverse forms such as smartphones, glasses, and watches, the number of mobile devices is expected to increase by 1 billion per year in the future. These devices not only generate and exchange small data such as GPS readings, but also large data including videos and point clouds. Such massive visual data presents many challenges for processing on mobile devices. First, continuously capturing and processing high-resolution visual data is energy-intensive and can drain a mobile device's battery very quickly. Second, offloading data for edge or cloud computing helps, but users fear that their privacy can be exposed to malicious developers. Third, interactivity and user experience are degraded if mobile devices cannot process large-scale visual data, such as off-device high-precision point clouds, in real time. To address these challenges, this work presents three solutions towards fine-grained control of visual data in mobile systems, revolving around two core ideas: enabling resolution-based tradeoffs and adopting split-process designs to protect visual data. In particular, this work introduces: (1) the Banner media framework, which removes resolution reconfiguration latency in the operating system to enable seamless dynamic resolution-based tradeoffs; (2) the LesnCap split-process application development framework, which protects users' visual privacy against malicious data collection in cloud-based Augmented Reality (AR) applications by isolating visual processing in a distinct process; and (3) a novel voxel grid schema that enables adaptive sampling at the edge device, sampling point clouds flexibly for interactive 3D vision use cases across mobile devices and mobile networks. The evaluation in several mobile environments demonstrates that, by controlling visual data at a fine granularity, energy efficiency can be improved by 49% by switching between resolutions, visual privacy can be protected through split-process design with negligible overhead, and point clouds can be delivered at high throughput while meeting various requirements. Thus, this work can enable more continuous mobile vision applications for the future of a new reality.
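To make the voxel-grid idea concrete, below is a minimal sketch of voxel-grid downsampling of a point cloud, assuming NumPy and a single fixed voxel size; the thesis's adaptive schema varies the sampling per use case and network condition, and the function name here is illustrative.

```python
# Minimal voxel-grid downsampling sketch: one centroid per occupied voxel.
# A coarser voxel_size yields fewer points, trading precision for throughput.
import numpy as np

def voxel_downsample(points: np.ndarray, voxel_size: float) -> np.ndarray:
    """points: (N, 3) XYZ array; returns one centroid per occupied voxel."""
    # Quantize each point to the integer index of its voxel.
    voxel_idx = np.floor(points / voxel_size).astype(np.int64)
    # Group points by voxel, then average each group.
    _, inverse, counts = np.unique(voxel_idx, axis=0,
                                   return_inverse=True, return_counts=True)
    centroids = np.zeros((counts.size, 3))
    np.add.at(centroids, inverse, points)  # sum points per voxel
    return centroids / counts[:, None]

# Example: downsample a synthetic 100k-point cloud.
cloud = np.random.rand(100_000, 3) * 10.0
print(voxel_downsample(cloud, voxel_size=0.5).shape)
```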
Date Created
2022

Analyzing Multi-viewpoint Capabilities of Light Estimation Frameworks for Augmented Reality Using TCP/IP and UDP

Description

Realistic lighting is important to improve immersion and make mixed reality applications seem more plausible. To properly blend AR objects into the real scene, it is important to study the lighting of the environment. The existing illumination frameworks provided by Google's ARCore (Google's Augmented Reality Software Development Kit) and Apple's ARKit (Apple's Augmented Reality Software Development Kit) are computationally expensive and have very slow refresh rates, which makes them unsuitable for dynamic environments and low-end mobile devices. Recently, other illumination estimation frameworks such as GLEAM and Xihe have aimed at providing better illumination with faster refresh rates. GLEAM is an illumination estimation framework that understands the real scene by collecting pixel data from a reflective spherical light probe. GLEAM uses this data to form environment cubemaps, which are later mapped onto a reflection probe to generate illumination for AR objects. From a single viewpoint, only one half of the light probe can be observed at a time, which does not give complete information about the environment. This leads to the idea of multi-viewpoint estimation for better performance. This thesis analyzes the multi-viewpoint capabilities of AR illumination frameworks that use physical light probes to understand the environment. The current work builds networking into GLEAM using the TCP and UDP protocols. This thesis also documents how processor load is shared across the networked devices and how that benefits the performance of GLEAM on mobile devices. Some enhancements using multi-threading have also been made to the existing GLEAM model to improve its performance.
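As a rough illustration of the transport tradeoff, the sketch below sends probe pixel data over TCP and UDP with Python sockets; the host, port, and length-prefixed framing are assumptions for illustration, not GLEAM's actual wire format.

```python
# Two transport options for shipping light-probe pixel data between viewpoints.
import socket
import struct

HOST, PORT = "192.168.1.10", 5005  # hypothetical receiver address

def send_tcp(payload: bytes) -> None:
    """Reliable, ordered delivery: suited to complete cubemap updates."""
    with socket.create_connection((HOST, PORT)) as sock:
        # Length-prefix the frame so the receiver knows where it ends.
        sock.sendall(struct.pack("!I", len(payload)) + payload)

def send_udp(payload: bytes) -> None:
    """Low-latency, lossy delivery: suited to frequent partial probe samples."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (HOST, PORT))  # keep under ~64 KB per datagram
```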
Date Created
2022

Augmented Coach: An Augmented Reality Tool for Immersive Sports Coaching

Description

Video playback is currently the primary method coaches and athletes use in sports training to give feedback on the athlete's form and timing. Athletes commonly record themselves with a phone or camera when practicing a sports movement, such as shooting a basketball, and then send the recording to their coach for feedback on how to improve. In this work, we present Augmented Coach, an augmented reality tool for coaches to give spatiotemporal feedback through a 3-dimensional point cloud of the athlete. The system allows coaches to view a pre-recorded video of their athlete in point cloud form, and provides them with the tools to go frame by frame to both analyze the athlete's form and correct it. The result is a fundamentally new concept of an interactive video player, where the coach can remotely view the athlete in 3-dimensional form and create annotations to help improve their form. We then conduct a user study with subject matter experts to evaluate the usability and capabilities of our system. As the results indicate, Augmented Coach successfully acts as a supplement to in-person coaching, since it allows coaches to break down the recording in a 3-dimensional space and provide feedback spatiotemporally. The results also indicate that Augmented Coach can be a complete coaching solution in a remote setting. This technology will be highly relevant as coaches look for new ways to improve their feedback methods, especially remotely.
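A minimal sketch of the frame-indexed player concept follows, assuming frames are stored as NumPy point arrays with per-frame annotation lists; the class and field names are hypothetical, not Augmented Coach's actual API.

```python
# Sketch of a frame-by-frame point-cloud player with per-frame annotations.
from dataclasses import dataclass, field
import numpy as np

@dataclass
class CloudFrame:
    points: np.ndarray                                      # (N, 3) athlete point cloud
    annotations: list[dict] = field(default_factory=list)   # coach markup for this frame

class PointCloudPlayer:
    def __init__(self, frames: list[CloudFrame]):
        self.frames, self.cursor = frames, 0

    def step(self, delta: int = 1) -> CloudFrame:
        """Move frame by frame, clamping to the recording's bounds."""
        self.cursor = max(0, min(len(self.frames) - 1, self.cursor + delta))
        return self.frames[self.cursor]

    def annotate(self, note: dict) -> None:
        """Attach spatiotemporal feedback to the current frame."""
        self.frames[self.cursor].annotations.append(note)
```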

Date Created
2022-05

Haptic Feedback Fashion Wear That Detects the Presence of Objects Using A Sweeping LiDAR Sensor

Description

This paper discusses the process of creating and testing a haptic feedback wearable that utilizes a sweeping Light Detection and Ranging (LiDAR) sensor. The design comes as an extension to the capstone project for electrical engineers. It works by attaching a LiDAR sensor to a sweeping servo motor; whenever the sensor detects an object, a motor vibrates to notify the user that an object is nearby. The design incorporates four motors so that the user has a sense of where an obstacle is coming from and can navigate around it. The design was tested for its accuracy in distance and angle measurement, its efficiency in processing the data, and the uncertainty of the sensor due to beam spreading. Plots of the distance and angle results showed that the design is capable of accurate measurements. The implementation of the code was also very efficient and had no latency issues when processing the sensor data. However, beam spreading introduced measurement uncertainty at the sensor's longer ranges.
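A minimal sketch of the detection-to-haptics mapping described above, assuming a 0-180 degree servo sweep split into four quadrants; the pin numbers, threshold, and driver call are hypothetical stand-ins for the capstone hardware.

```python
# Map a sweeping LiDAR reading to one of four directional vibration motors.
DETECT_RANGE_CM = 150                   # vibrate only for obstacles closer than this
MOTOR_FOR_QUADRANT = [17, 27, 22, 23]   # hypothetical GPIO pins, left to right

def motor_for_angle(sweep_angle_deg: float) -> int:
    """Map the servo's 0-180 degree sweep onto four 45-degree quadrants."""
    quadrant = min(3, int(sweep_angle_deg // 45))
    return MOTOR_FOR_QUADRANT[quadrant]

def on_reading(sweep_angle_deg: float, distance_cm: float) -> None:
    """Called for each (angle, distance) sample from the sweeping sensor."""
    pin = motor_for_angle(sweep_angle_deg)
    vibrate(pin, on=distance_cm < DETECT_RANGE_CM)

def vibrate(pin: int, on: bool) -> None:
    # Placeholder for a platform-specific GPIO write (e.g., gpiozero).
    print(f"motor on pin {pin}: {'on' if on else 'off'}")

on_reading(sweep_angle_deg=30.0, distance_cm=90.0)  # left quadrant, obstacle near
```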

Date Created
2022-05

Machine Learning and Vision Using Edge Devices for Multimodal Chatbots and Bio-meteorological Sensing

Description

Machine learning (ML) and deep learning (DL) have become an intrinsic part of multiple fields, and the ability to solve complex problems makes machine learning broadly applicable. In the last few years, there has been an explosion of data generation, which has greatly improved machine learning models. But this comes at the cost of heavy computation, which invariably increases power usage and hardware cost. In this thesis I explore applications of ML techniques, applied to two completely different fields, arts, media, and theater on the one hand and urban climate research on the other, using low-cost and low-powered edge devices. The multi-modal chatbot uses different machine learning techniques, natural language processing (NLP) and computer vision (CV), to understand the user's inputs and accordingly perform in the play and interact with the audience. This system is also equipped with other interactive hardware such as movable LED systems; together they provide an experiential theatrical play tailored to each user. I discuss how I used edge devices to achieve this AI system, which has created a new genre of theatrical play. I then discuss MaRTiny, an AI-based bio-meteorological system that calculates mean radiant temperature (MRT), an important parameter in urban climate research. It is also equipped with a vision system that performs machine learning tasks such as pedestrian and shade detection. The entire system costs around $200, which can potentially replace an existing setup worth $20,000. I further discuss how I used machine learning methods to overcome the inaccuracies in the MRT values produced by the system. Although these projects belong to two very different fields, both are implemented on edge devices and use similar ML techniques. In this thesis I detail the techniques shared between the two projects and how they can be used in several other applications on edge devices.
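For context on the MRT calculation, here is the standard globe-thermometer estimate (ISO 7726, forced convection) as a sketch; whether MaRTiny uses exactly this formulation is an assumption, but it shows the kind of computation involved.

```python
# Standard globe-thermometer MRT estimate under forced convection (ISO 7726).
def mean_radiant_temperature(t_globe_c: float, t_air_c: float,
                             wind_ms: float, diameter_m: float = 0.15,
                             emissivity: float = 0.95) -> float:
    """Return MRT in deg C from globe temperature, air temperature, wind speed."""
    tg_k = t_globe_c + 273.0
    # Forced-convection correction term for a globe of given size/emissivity.
    convective = (1.1e8 * wind_ms**0.6) / (emissivity * diameter_m**0.4)
    return (tg_k**4 + convective * (t_globe_c - t_air_c)) ** 0.25 - 273.0

# Example: a sunlit globe well above air temperature implies a high MRT (~63 C).
print(mean_radiant_temperature(t_globe_c=45.0, t_air_c=35.0, wind_ms=1.0))
```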
Date Created
2021

Characterizing Atmospheric Turbulence and Removing Distortion in Long-range Imaging

Description

Atmospheric turbulence distorts the path of light passing through the air. When capturing images at long range, the effects of this turbulence can cause substantial geometric distortion and blur in images and videos, degrading image quality. These become more pronounced with greater turbulence, scaling with the refractive index structure constant, Cn2. Removing effects of atmospheric turbulence in images has a range of applications from astronomical imaging to surveillance. Thus, there is great utility in transforming a turbulent image into a "clean image" undegraded by turbulence. However, as the turbulence is space- and time-variant and statistically random, no closed-form solution exists for a function that performs this transformation. Prior attempts to approximate the solution include spatio-temporal models and lucky frames models, which require many images to provide a good approximation, and supervised neural networks, which rely on large amounts of simulated or difficult-to-acquire real training data and can struggle to generalize. The first contribution in this thesis is an unsupervised neural-network-based model to perform image restoration for atmospheric turbulence with state-of-the-art performance. The model consists of a grid deformer, which produces an estimated distortion field, and an image generator, which estimates the distortion-free image. This model is transferable across different datasets; its efficacy is demonstrated across multiple datasets and on both air and water turbulence. The second contribution is a supervised neural network to predict Cn2 directly from the warp field. This network was trained on a wide range of Cn2 values and estimates Cn2 with relatively good accuracy. When used on the warp field produced by the unsupervised model, this allows for a Cn2 estimate requiring only a few images without any prior knowledge of ground truth or information about the turbulence.
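A minimal sketch of the warping step such a model relies on, assuming PyTorch: an estimated distortion field resamples the image via grid_sample. The grid deformer network that predicts the field is elided here.

```python
# Apply an estimated warp (distortion) field to an image via grid_sample.
import torch
import torch.nn.functional as F

def apply_warp(image: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """image: (B, C, H, W); flow: (B, H, W, 2) offsets in normalized [-1, 1] units."""
    b, _, h, w = image.shape
    # Identity sampling grid in normalized (x, y) coordinates.
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, h),
                            torch.linspace(-1, 1, w), indexing="ij")
    base = torch.stack((xs, ys), dim=-1).expand(b, h, w, 2)
    # Offsetting the grid by the flow field resamples (warps) the image.
    return F.grid_sample(image, base + flow, align_corners=True)

# Example: a zero flow field leaves the image (approximately) unchanged.
img = torch.rand(1, 3, 64, 64)
print(apply_warp(img, torch.zeros(1, 64, 64, 2)).shape)
```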
Date Created
2021

Distributed Consensus Algorithms for Wireless Sensor Networks

Description

A distributed wireless sensor network (WSN) is a network of a large number of low-cost, multi-functional sensors with power, bandwidth, and memory constraints, operating in remote environments with sensing and communication capabilities. WSNs are a source of a large amount of data, and due to the inherent communication and resource constraints, developing distributed algorithms to perform statistical parameter estimation and data analysis is necessary. In this work, consensus-based distributed algorithms are developed for distributed estimation and processing over WSNs. First, a distributed spectral clustering algorithm to group the sensors based on their location attributes is developed. Next, a distributed max consensus algorithm robust to additive noise in the network is designed. Furthermore, distributed spectral radius estimation algorithms for both analog and digital communication models are developed. The proposed algorithms work for any connected graph topology. Theoretical bounds are derived, and simulation results supporting the theory are presented.
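For reference, the noiseless max consensus skeleton that such robust variants build on can be sketched in a few lines; node values, neighbor lists, and round count here are illustrative.

```python
# Noiseless distributed max consensus: each node repeatedly takes the max over
# its neighborhood; on a connected graph all nodes converge to the global max
# within diameter-many rounds.
def max_consensus(values: list[float], neighbors: list[list[int]],
                  rounds: int) -> list[float]:
    x = list(values)
    for _ in range(rounds):
        x = [max([x[i]] + [x[j] for j in neighbors[i]]) for i in range(len(x))]
    return x

# Example: a 4-node path graph 0-1-2-3 (diameter 3) converges in 3 rounds.
print(max_consensus([2.0, 9.0, 4.0, 7.0], [[1], [0, 2], [1, 3], [2]], rounds=3))
```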
Date Created
2021

Data Representation for Predicting Harmonic Clusters with LSTM

Description

The purpose of this project is to create a useful tool for musicians that utilizes the harmonic content of their playing to recommend new, relevant chords to play. This is done by training various Long Short-Term Memory (LSTM) Recurrent Neural Networks (RNNs) on the lead sheets of 100 different jazz standards. A total of 200 unique datasets were produced and tested, resulting in the prediction of nearly 51 million chords. A note-prediction accuracy of 82.1% and a chord-prediction accuracy of 34.5% were achieved across all datasets. Methods of data representation that were rooted in valid music theory frameworks were found to increase the efficacy of harmonic prediction by up to 6%. Optimal LSTM input sizes were also determined for each method of data representation.
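A minimal sketch of one such data representation, assuming chords are tokenized into a fixed vocabulary and Keras is available; the vocabulary size, window length, and layer sizes are illustrative, not the thesis's tuned values.

```python
# Next-chord prediction: a window of past chord tokens is the LSTM input,
# and the following chord is the training target.
import numpy as np
from tensorflow import keras

VOCAB, WINDOW = 96, 8  # hypothetical chord vocabulary and input window size

model = keras.Sequential([
    keras.layers.Embedding(VOCAB, 32),                     # chord token -> vector
    keras.layers.LSTM(128),                                # summarize the window
    keras.layers.Dense(VOCAB, activation="softmax"),       # next-chord distribution
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Train on (window of chords) -> (next chord) pairs drawn from lead sheets.
X = np.random.randint(0, VOCAB, size=(512, WINDOW))        # placeholder data
y = np.random.randint(0, VOCAB, size=(512,))
model.fit(X, y, epochs=2, verbose=0)
```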

Date Created
2021-05