Investigating the Role of Silent Users on Social Media

Description
Social media platforms provide a rich environment for analyzing user behavior. Recently, deep learning-based methods have become the mainstream approach for social media analysis models involving complex patterns. However, these methods are susceptible to biases in the training data, such as participation inequality: a mere 1% of users generate the majority of the content on social networking sites, while the remaining users, though engaged to varying degrees, tend to be far less active in content creation and largely silent. These silent users consume and listen to the information propagated on the platform. However, their voices, attitudes, and interests are not reflected in the online content, predisposing the decisions of current methods toward the opinions of active users; in effect, models can mistake the loudest users for the majority. Making the silent majority heard reveals the true landscape of the platform. In this dissertation, to compensate for this bias, which is rooted in user-level data scarcity, I introduce three pieces of research work. Two of the proposed solutions work with the data on hand, while the third augments the current data. Specifically, the first approach modifies the weight of users' activity/interactions in the input space, while the second re-weights the loss based on users' activity levels during downstream-task training. Lastly, the third approach uses large language models (LLMs) to learn users' writing behavior and expand the current data. In other words, by utilizing LLMs as a sophisticated knowledge base, this method aims to augment the silent users' data.
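As a rough illustration of the second approach, the sketch below re-weights a per-example loss by user activity level. The function name, the inverse-log weighting scheme, and the `alpha` parameter are all hypothetical choices made here; the abstract does not specify the dissertation's actual weighting.

```python
import torch
import torch.nn as nn

def activity_reweighted_loss(logits, labels, activity_counts, alpha=1.0):
    """Per-example cross-entropy re-weighted by inverse user activity
    (hypothetical scheme): highly active users are down-weighted so that
    silent, low-activity users contribute more to the gradient."""
    per_example = nn.functional.cross_entropy(logits, labels, reduction="none")
    weights = 1.0 / (1.0 + alpha * activity_counts.float().log1p())
    weights = weights / weights.mean()  # keep the overall loss scale stable
    return (weights * per_example).mean()

# Toy batch: 4 posts from users with very different activity levels.
logits = torch.randn(4, 3, requires_grad=True)   # 3-class toy task
labels = torch.tensor([0, 2, 1, 0])
activity = torch.tensor([5000, 3, 120, 1])       # posts authored by each user
activity_reweighted_loss(logits, labels, activity).backward()
```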
Date Created
2023

Nexus Modeling and Distributed Simulation: A RESTful Framework for Understanding and Predicting Dynamics of Interacting Water and Energy Systems

Description
Water, energy, and food are essential resources for sustaining the development of society. The Food-Energy-Water Nexus (FEW-Nexus) must account for synergies and trade-offs among these resources. The nexus concept highlights the importance of integrative solutions that secure supplies to meet demands sustainably. Existing frameworks and tools do not focus on formal model composability, a key capability for creating simulations composed of separately developed models. The Knowledge Interchange Broker (KIB) approach is used to model the interactions among models to achieve composition flexibility for the FEW-Nexus. Domain experts generally use the Water Evaluation and Planning (WEAP) and Low Emissions Analysis Platform (LEAP) systems to study water and energy systems, respectively. The food part of FEW systems can be modeled inside the WEAP system. An internal linkage mechanism is available for combining and simulating WEAP and LEAP models; this mechanism is used for the validation and performance evaluation of the independent modeling and simulation proposed in this research. The Componentized WEAP and LEAP RESTful frameworks are component-based representations of the legacy, closed-source WEAP and LEAP systems. These modularized systems simplify their use with other simulation frameworks. This research proposes two interaction-model frameworks based on the Knowledge Interchange Broker approach. First, an Algorithmic Interaction Model (Algorithmic-IM) was developed to integrate the WEAP and LEAP models. The Algorithmic-IM can be defined via a programming language and has a fixed cyclic execution protocol. However, this approach tightly interweaves the interaction model with its execution and has limited support for flexibly creating model hierarchies. To overcome these restrictions, the system-theoretic Parallel DEVS formalism is used to develop a DEVS-Based Interaction Model (DEVS-IM). As with the Algorithmic-IM, the DEVS-IM is implemented as a RESTful framework, uses MongoDB for defining structural DEVS models, and supports automatic code generation for the DEVS-Suite simulator. The DEVS-IM offers modular, hierarchical structural modeling, reusability, flexibility, and maintainability for integrating disparate systems. The Phoenix Active Management Area (AMA) is used to demonstrate the real-world application of the proposed research. Furthermore, the correctness and performance of the presented frameworks are evaluated using the Phoenix-AMA model.
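To make the interaction-model idea concrete, here is a minimal sketch of a KIB-style broker with a fixed cyclic execution protocol, in the spirit of the Algorithmic-IM. The REST endpoint paths, payload shapes, and transformation functions are assumptions for illustration, not the actual Componentized WEAP/LEAP APIs.

```python
import requests  # endpoints below are hypothetical

class RestModel:
    """Thin client for a componentized model exposed over REST (assumed API)."""
    def __init__(self, base_url):
        self.base_url = base_url

    def step(self, inputs):
        # POST one simulation step; path and payload shape are assumptions.
        resp = requests.post(f"{self.base_url}/step", json=inputs, timeout=30)
        resp.raise_for_status()
        return resp.json()

class AlgorithmicIM:
    """KIB-style interaction model with a fixed cyclic protocol:
    water -> transform -> energy -> transform -> water, once per period."""
    def __init__(self, water, energy):
        self.water, self.energy = water, energy

    def transform_w2e(self, w_out):
        # Mediate names/units between the two models (illustrative mapping).
        return {"water_delivered": w_out.get("supply", 0.0)}

    def transform_e2w(self, e_out):
        return {"energy_available": e_out.get("generation", 0.0)}

    def run(self, periods):
        w_in = {}
        for _ in range(periods):
            w_out = self.water.step(w_in)
            e_out = self.energy.step(self.transform_w2e(w_out))
            w_in = self.transform_e2w(e_out)

# Usage sketch (endpoints are hypothetical):
# im = AlgorithmicIM(RestModel("http://localhost:8001/weap"),
#                    RestModel("http://localhost:8002/leap"))
# im.run(periods=12)
```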
Date Created
2023

Evaluating the Efficiency of Quantum Simulators using Practical Application Benchmarks

Description
Quantum computing holds the potential to revolutionize various industries by solving problems that classical computers cannot solve efficiently. However, the building of quantum computers is still in its infancy, and simulators are currently the best available option for exploring the potential of quantum computing. Therefore, developing comprehensive benchmarking suites for quantum computing simulators is essential to evaluate their performance and to guide the development of future quantum algorithms and hardware. This study presents a systematic evaluation of quantum computing simulators' performance using a benchmarking suite. The suite is designed to meet the industry-standard performance benchmarks established by the Defense Advanced Research Projects Agency (DARPA) and includes standardized test data and comparison metrics that encompass a wide range of applications, deep neural network models, and optimization techniques. The thesis is divided into two parts, covering basic quantum algorithms and variational quantum algorithms for practical machine learning tasks. In the first part, the run-time and memory performance of quantum computing simulators are analyzed using basic quantum algorithms, evaluated with standardized test data and comparison metrics that cover fundamental quantum algorithms, including the Quantum Fourier Transform (QFT), Inverse Quantum Fourier Transform (IQFT), Quantum Adder, and Variational Quantum Eigensolver (VQE). The analysis provides valuable insights into the simulators' strengths and weaknesses and highlights the need for further development to enhance their performance. In the second part, benchmarks are developed using variational quantum algorithms for practical machine learning tasks such as image classification, natural language processing, and recommendation. These benchmarks address several unique challenges posed by benchmarking quantum machine learning (QML), including the effect of optimizations on time-to-solution, the stochastic nature of training, the inclusion of hybrid quantum-classical layers, and the diversity of software and hardware systems. The findings offer valuable insights into the simulators' ability to solve practical machine learning tasks and pinpoint areas for future research and enhancement. In conclusion, this study provides a rigorous evaluation of quantum computing simulators' performance using a benchmarking suite that meets industry-standard performance benchmarks.
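A run-time and memory benchmark of this kind reduces to repeatedly timing and memory-profiling a simulator as the circuit width grows. The harness below is a generic sketch, not the study's suite: `run_circuit` is a hypothetical adapter that builds and simulates an n-qubit circuit (e.g., a QFT) on whichever simulator is under test.

```python
import time
import tracemalloc

def benchmark(run_circuit, qubit_counts, repeats=3):
    """Time and peak-memory benchmark for a simulator under test.
    Note: tracemalloc only tracks Python-level allocations; memory used
    by native simulator backends needs an external profiler."""
    results = []
    for n in qubit_counts:
        times, peaks = [], []
        for _ in range(repeats):
            tracemalloc.start()
            t0 = time.perf_counter()
            run_circuit(n)  # hypothetical adapter: build + simulate n qubits
            times.append(time.perf_counter() - t0)
            _, peak = tracemalloc.get_traced_memory()
            peaks.append(peak)
            tracemalloc.stop()
        results.append({"qubits": n, "best_s": min(times),
                        "peak_bytes": max(peaks)})
    return results
```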
Date Created
2023

Identifying Sources of Anomalies in Complex Networks

Description
The problem of monitoring complex networks for the detection of anomalous behavior is well known. Sensors are usually deployed to monitor these networks for anomalies, and Sensor Placement Optimization (SPO) is the problem of determining where these sensors should be placed (deployed) in the network. Prior works have utilized the well-known Set Cover formulation to determine the locations where sensors should be placed so that anomalies can be effectively detected. However, such works cannot address the problem when the objective is not only to detect the presence of anomalies but also to distinguish the source(s) of the detected anomalies, i.e., to uniquely monitor the network. In this dissertation, I attempt to fill this gap by utilizing the mathematical concept of Identifying Codes, illustrating not only how it overcomes the aforementioned limitation but also how it and its variants can be utilized to monitor complex networks modeled from multiple domains. Over the course of this dissertation, I make key contributions that further enhance the efficacy and applicability of Identifying Codes as a monitoring strategy. First, I show how Identifying Codes are superior not only to the Set Cover formulation but also to standard graph centrality metrics for the purpose of uniquely monitoring complex networks. Second, I study novel problems such as the budget-constrained Identifying Code, scalable Identifying Code, and robust Identifying Code, and present algorithms and results for the respective problems. Third, I present useful Identifying Code results for restricted graph classes such as Unit Interval Bigraphs and Unit Disc Bigraphs. Finally, I show the universality of Identifying Codes by applying them to multiple domains.
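The defining property is directly checkable: a vertex subset C is an identifying code if every vertex's closed neighborhood intersects C in a nonempty set, and those intersections are pairwise distinct, so an anomaly's source is uniquely determined by which sensors fire. A small brute-force sketch, for intuition only (the dissertation's algorithms are far more scalable):

```python
from itertools import combinations

def closed_neighborhood(adj, v):
    return {v} | set(adj[v])

def is_identifying_code(adj, code):
    """Check whether `code` is an identifying code of the graph `adj`
    (dict: vertex -> neighbors): each closed neighborhood must meet the
    code in a nonempty set, and all such sets must be distinct."""
    signatures = {}
    for v in adj:
        sig = frozenset(closed_neighborhood(adj, v) & set(code))
        if not sig or sig in signatures.values():
            return False
        signatures[v] = sig
    return True

def minimum_identifying_code(adj):
    """Brute-force smallest identifying code (exponential; tiny graphs only)."""
    vertices = list(adj)
    for k in range(1, len(vertices) + 1):
        for subset in combinations(vertices, k):
            if is_identifying_code(adj, subset):
                return set(subset)
    return None  # twin vertices exist, so no identifying code exists

# Example: path on 4 vertices 0-1-2-3; the minimum code is {0, 1, 2}.
path4 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(minimum_identifying_code(path4))
```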
Date Created
2022

Heuristics for Arc Routing Problems and Their Applications

Description
Arc Routing Problems (ARPs) are a class of routing problems that seek routes of minimum total cost covering the edges or arcs in a graph representing a street or road network. They find application in many essential services such as residential waste collection and winter gritting. Because ARPs are NP-hard, solutions are usually found using heuristic methods. This dissertation contributes to heuristics for ARPs, with a focus on the Capacitated Arc Routing Problem (CARP) with additional constraints. In operations such as residential waste collection, vehicle breakdown disruptions occur frequently. A new variant, the Capacitated Arc Re-routing Problem for Vehicle Breakdown (CARP-VB), is introduced to address the need to re-route using only the remaining vehicles so that no services are missed. A new heuristic, Probe, is developed to solve CARP-VB. Experiments on benchmark instances show that Probe reduces the makespan better than alternatives and is hence effective in reducing delays and avoiding missed services. In addition to total cost, operators are interested in solutions that are attractive, that is, routes that are contiguous, compact, and non-overlapping, which makes the work easier to manage. Operators may not adopt a solution that is not attractive even if it is optimal. They are also interested in solutions that are balanced in workload to meet equity requirements. A new multi-objective memetic algorithm, MA-ABC, is developed that optimizes three objectives: attractiveness, makespan, and total cost. On benchmark instances, MA-ABC was found to be effective in providing attractive and balanced route solutions without sacrificing total cost. Changes in the problem specification, such as demand and topology, occur frequently in business operations. Machine learning can be applied to learn the distribution behind these changes and generate solutions quickly at inference time. Splice is a machine learning framework for CARP that quickly generates near-optimal solutions using a graph neural network and deep Q-learning. Splice can solve several variants of node and arc routing problems using the same architecture without any modification. Splice was trained and tested on randomly generated instances; it generated solutions faster than popular metaheuristics, and the solutions were also better.
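Two of the objectives above are easy to state precisely. The sketch below evaluates the total cost and makespan of a candidate multi-vehicle solution; the encoding is a simplification for illustration (it ignores demands, capacities, and deadheading distinctions) and is not the dissertation's evaluation code.

```python
def route_cost(route, cost):
    """Traversal cost of one route: sum of costs of consecutive arcs.
    `route` is a vertex sequence; `cost[(u, v)]` is the arc cost."""
    return sum(cost[(u, v)] for u, v in zip(route, route[1:]))

def evaluate(routes, cost):
    """Two CARP objectives: total cost (sum over vehicles) and
    makespan (cost of the longest single route)."""
    per_vehicle = [route_cost(r, cost) for r in routes]
    return {"total_cost": sum(per_vehicle), "makespan": max(per_vehicle)}

# Toy example: two vehicles on a 4-node network.
cost = {(0, 1): 2, (1, 2): 3, (2, 0): 4, (0, 3): 1, (3, 0): 1}
routes = [[0, 1, 2, 0], [0, 3, 0]]
print(evaluate(routes, cost))  # {'total_cost': 11, 'makespan': 9}
```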
Date Created
2022

An Effective Approach to Protecting Low-Power and Lossy IoT Networks Against Blackhole Attacks

Description
Realizing the applications of the Internet of Things (IoT), with the goal of achieving a more efficient and automated world, requires billions of connected smart devices and the minimization of hardware cost in these devices. As a result, many IoT devices do not have sufficient resources to support the various protocols required by many IoT applications. Because of this, new protocols have been introduced to support the integration of these devices. One of these is the increasingly popular routing protocol for low-power and lossy networks (RPL). However, this protocol is well known to attract blackhole and sinkhole attacks, and the resource constraints of RPL devices cause serious difficulties when more computationally intensive techniques, such as intrusion detection systems and rank authentication schemes, are used to protect against these attacks. In this paper, an effective approach is presented to protect RPL networks against blackhole attacks. The approach does not address sinkhole attacks because they cause little damage on their own, are often used alongside blackhole attacks, and can be detected when the accompanying blackhole attacks are detected. The approach exploits RPL's support for multiple parents per node, together with a parent evaluation system that enables nodes to select more reliable routes. Simulations indicate that, compared to existing approaches, this approach provides better protection against blackhole attacks with much lower overhead for small RPL networks.
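A parent evaluation system of this kind can be sketched as a per-node reliability score over candidate parents. The class below is a hypothetical illustration; the names, the delivery-ratio metric, and the threshold are assumptions made here, not the thesis's mechanism.

```python
from collections import defaultdict

class ParentEvaluator:
    """Hypothetical sketch: score each candidate parent by its observed
    end-to-end delivery ratio and prefer the most reliable one, so that
    a blackhole (which drops forwarded packets) scores poorly."""

    def __init__(self, min_reliability=0.7):
        self.sent = defaultdict(int)   # packets forwarded via each parent
        self.acked = defaultdict(int)  # packets confirmed delivered
        self.min_reliability = min_reliability

    def record(self, parent, delivered):
        self.sent[parent] += 1
        if delivered:
            self.acked[parent] += 1

    def score(self, parent):
        if self.sent[parent] == 0:
            return 1.0  # optimistic prior for untried parents
        return self.acked[parent] / self.sent[parent]

    def select_parent(self, candidates):
        """Pick the best-scoring parent; parents below the threshold
        (likely blackholes) are avoided when any trusted parent exists."""
        trusted = [p for p in candidates if self.score(p) >= self.min_reliability]
        pool = trusted or candidates  # fall back if all look unreliable
        return max(pool, key=self.score)
```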
Date Created
2021

Bounds on the Defective, Multifold, Paint Number of Planar Graphs

Description
A $k$-list assignment for a graph $G=(V, E)$ is a function $L$ that assigns a $k$-set $L(v)$ of "available colors" to each vertex $v \in V$. A $d$-defective, $m$-fold, $L$-coloring is a function $\phi$ that assigns an $m$-subset $\phi(v) \subseteq L(v)$ to each vertex $v$ so that each color class $V_{i}=\{v \in V: i \in \phi(v)\}$ induces a subgraph of $G$ with maximum degree at most $d$. An edge $xy$ is an $i$-flaw of $\phi$ if $i \in \phi(x) \cap \phi(y)$. An online list-coloring algorithm $\mathcal{A}$ works on a known graph $G$ and an unknown $k$-list assignment $L$ to produce a coloring $\phi$ as follows. At step $r$ the set of vertices $v$ with $r \in L(v)$ is revealed to $\mathcal{A}$. For each vertex $v$, $\mathcal{A}$ must decide irrevocably whether to add $r$ to $\phi(v)$. The online choice number $\mathrm{pt}_{m}^{d}(G)$ of $G$ is the least $k$ for which some such algorithm produces a $d$-defective, $m$-fold, $L$-coloring $\phi$ of $G$ for all $k$-list assignments $L$. Online list coloring was introduced independently by Uwe Schauz and Xuding Zhu. It was known that if $G$ is planar then $\mathrm{pt}_{1}^{0}(G) \leq 5$ and $\mathrm{pt}_{1}^{1}(G) \leq 4$ are sharp bounds; here it is proved that $\mathrm{pt}_{1}^{3}(G) \leq 3$ is sharp, but there is a planar graph $H$ with $\mathrm{pt}_{1}^{2}(H) \geq 4$. Zhu conjectured that for some integer $m$, every planar graph $G$ satisfies $\mathrm{pt}_{m}^{0}(G) \leq 5m-1$, and even that this is true for $m=2$. This dissertation proves that $\mathrm{pt}_{2}^{1}(G) \leq 9$, so the conjecture is "nearly" true, and the proof extends to $\mathrm{pt}_{m}^{1}(G) \leq \left\lceil\frac{9}{2} m\right\rceil$. Using Alon's Combinatorial Nullstellensatz, this is strengthened by showing that $G$ contains a linear forest $(V, F)$ such that there is an online algorithm that witnesses $\mathrm{pt}_{2}^{1}(G) \leq 9$ while producing a coloring whose flaws are in $F$, and such that no edge is an $i$-flaw and a $j$-flaw for distinct colors $i$ and $j$.
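The online game is easy to simulate. Below is a small sketch of the revelation protocol with a standard greedy Painter for the simplest case ($m=1$, $d=0$): at each step it colors a maximal independent set of the revealed, still-uncolored vertices. This baseline is for intuition only and is not the dissertation's algorithm.

```python
def greedy_painter(adj, list_assignment):
    """Online greedy strategy for 0-defective, 1-fold painting.
    At step r, the set {v : r in L(v)} is revealed; the Painter
    irrevocably colors a maximal independent subset of it with r."""
    colors = {}  # vertex -> chosen color
    all_colors = sorted({c for lst in list_assignment.values() for c in lst})
    for r in all_colors:
        revealed = [v for v in list_assignment
                    if r in list_assignment[v] and v not in colors]
        chosen = set()
        for v in revealed:  # greedy maximal independent set among revealed
            if not chosen & set(adj[v]):
                chosen.add(v)
                colors[v] = r
    return colors  # any unassigned vertex means greedy failed on these lists

# Example: the 4-cycle with identical 2-lists; greedy succeeds here.
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
L = {0: {1, 2}, 1: {1, 2}, 2: {1, 2}, 3: {1, 2}}
print(greedy_painter(adj, L))
```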
Date Created
2021

A Machine Learning based High-Speed State Estimator for Partially Observed Electric Transmission Systems

Description
The accurate monitoring of the bulk transmission system of the electric power grid by sensors, such as Remote Terminal Units (RTUs) and Phasor Measurement Units (PMUs), is essential for maintaining the reliability of the modern power system. One of the primary objectives of power system monitoring is the identification of snapshots of the system at regular intervals by performing state estimation using the available measurements from the sensors; state estimation corresponds to estimating the complex voltages at all buses of the system. PMU measurements play an important role in this regard because of their time-synchronized nature and the faster rates at which they are produced. However, a model-based linear state estimator created using PMU-only data requires complete observability of the system by PMUs for its continuous functioning. Conventional model-based techniques also make certain assumptions in modeling the physical system, such as constant values of the line parameters, and their measurement error models are assumed to follow a Gaussian distribution. In this research, a data mining technique using Deep Neural Networks (DNNs) is proposed for performing high-speed, time-synchronized state estimation of the transmission system. The proposed technique uses historical data to identify the correlation between the measurements and the system states, as opposed to directly using the physical model of the system. Therefore, the highlight of the proposed technique is its ability to provide an accurate, fast, time-synchronized estimate of the system states even in the absence of complete system observability by PMUs.
The state estimator is formulated for the IEEE 118-bus system, and its reliable performance is demonstrated under redundant observability, complete observability, and incomplete observability. The robustness of the state estimator is also demonstrated by performing the estimation in the presence of non-Gaussian measurement errors and varying line parameters, and its consistency is demonstrated by performing state estimation over an entire day.
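The core of such an estimator is a supervised regressor from measurement snapshots to bus voltages. The PyTorch sketch below is a minimal illustration under assumed dimensions; the measurement count, layer sizes, and the [real; imaginary] stacking of the 118 bus voltages are choices made here, not taken from the dissertation.

```python
import torch
import torch.nn as nn

# Illustrative dimensions: with partial PMU coverage, the input is the
# available measurement vector; the output is the full complex state.
N_MEAS, N_BUS = 200, 118

model = nn.Sequential(
    nn.Linear(N_MEAS, 512), nn.ReLU(),
    nn.Linear(512, 512), nn.ReLU(),
    nn.Linear(512, 2 * N_BUS),  # real and imaginary parts of bus voltages
)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

def train_step(measurements, true_states):
    """One supervised step on historical (measurement, state) pairs."""
    optimizer.zero_grad()
    loss = loss_fn(model(measurements), true_states)
    loss.backward()
    optimizer.step()
    return loss.item()

# Dummy batch standing in for historical PMU snapshots.
x = torch.randn(32, N_MEAS)
y = torch.randn(32, 2 * N_BUS)
print(train_step(x, y))
```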
Date Created
2020

Improved Bi-criteria Approximation for the All-or-Nothing Multicommodity Flow Problem in Arbitrary Networks

Description
This thesis addresses the following fundamental maximum throughput routing problem: Given an arbitrary edge-capacitated n-node directed network and a set of k commodities, with source-destination pairs (s_i, t_i) and demands d_i > 0, admit and route the largest possible number of commodities -- i.e., the maximum throughput -- to satisfy their demands.

The main contributions of this thesis are three-fold. First, a bi-criteria approximation algorithm is presented for this all-or-nothing multicommodity flow (ANF) problem. This algorithm is the first to achieve a constant approximation of the maximum throughput with an edge capacity violation ratio that is at most logarithmic in n, with high probability. The approach is based on a version of randomized rounding that keeps flows splittable, rather than approximating them via a non-splittable path for each commodity; this allows it to work for arbitrary directed edge-capacitated graphs, unlike most prior work on the ANF problem. The algorithm also works for weighted throughput, where the benefit gained by fully satisfying the demand for commodity i is determined by a given weight w_i > 0. Second, a derandomization of the algorithm is presented that maintains the same approximation bounds, using novel pessimistic estimators for Bernstein's inequality. In addition, it is shown how the framework can be adapted to achieve a polylogarithmic fraction of the maximum throughput while maintaining a constant edge capacity violation, provided the network capacity is large enough. Lastly, one important aspect of the randomized and derandomized algorithms is their simplicity, which lends itself to efficient implementations in practice. Implementations of both the randomized rounding and derandomized algorithms for the ANF problem are presented and demonstrate their efficiency in practice.
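The admission step of such a rounding scheme can be sketched in a few lines: given fractional admission values from an LP relaxation, each commodity is admitted independently with a scaled-down probability, keeping its splittable flow if admitted. The scaling constant and the omission of the flow-scaling and capacity-analysis details are simplifications for illustration, not the thesis's algorithm.

```python
import random

def randomized_admission(fractional_x, scale=0.5, seed=None):
    """Round an LP relaxation's fractional admissions x_i in [0, 1] into
    an all-or-nothing subset: admit commodity i independently with
    probability scale * x_i. The scale factor trades expected throughput
    against the probability of violating edge capacities."""
    rng = random.Random(seed)
    return [i for i, x in enumerate(fractional_x) if rng.random() < scale * x]

# Toy LP solution: per-commodity admission fractions.
x = [0.9, 0.2, 0.6, 1.0, 0.05]
print(randomized_admission(x, seed=7))
```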
Date Created
2020

Cooperative Driving of Connected Autonomous Vehicles Using Responsibility Sensitive Safety Rules

Description
In recent times, traffic congestion and motor accidents have been major problems for transportation in large cities, and Intelligent Transportation Systems have the potential to be an effective solution. Connected autonomous vehicles can cooperate at intersections, ramp merges, lane changes, and other conflicting scenarios to resolve conflicts and avoid collisions with other vehicles. Many works have been proposed for specific scenarios such as intersections, ramp merging, or lane changes, which only partially solve the conflict resolution problem. Moreover, deadlocks, one of the major issues in autonomous decision making, have not been considered in some of these works: the existing works either do not consider deadlocks or lack a safety proof. This thesis proposes a cooperative driving solution that provides complete navigation, conflict resolution, and deadlock resolution for connected autonomous vehicles. A graph-based model is used to resolve deadlocks between vehicles, and responsibility-sensitive safety (RSS) rules are used to ensure the safety of the autonomous vehicles during conflict detection and resolution. The algorithm provides a complete navigation solution for an autonomous vehicle from its source to its destination, ensures that accidents do not occur even in the worst-case scenario, and keeps the decision making deadlock-free.
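Graph-based deadlock resolution typically starts from detecting a cycle in a wait-for graph over the vehicles. The sketch below is a generic illustration of that detection step; the graph encoding and names are assumptions, not the thesis's implementation.

```python
def find_deadlock_cycle(wait_for):
    """Detect a deadlock as a cycle in a wait-for graph, where an edge
    u -> v means vehicle u is waiting on vehicle v (e.g., for a conflict
    zone). Returns one cycle as a list of vehicles, or None."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {v: WHITE for v in wait_for}
    stack = []

    def dfs(u):
        color[u] = GRAY
        stack.append(u)
        for v in wait_for.get(u, ()):
            if color.get(v, WHITE) == GRAY:
                return stack[stack.index(v):] + [v]  # cycle found
            if color.get(v, WHITE) == WHITE:
                cycle = dfs(v)
                if cycle:
                    return cycle
        color[u] = BLACK
        stack.pop()
        return None

    for u in list(wait_for):
        if color[u] == WHITE:
            cycle = dfs(u)
            if cycle:
                return cycle
    return None

# Three vehicles blocked on each other at an intersection: A -> B -> C -> A.
print(find_deadlock_cycle({"A": ["B"], "B": ["C"], "C": ["A"]}))
```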
Date Created
2020