Sen, Arunabha

Smart resource allocation in internet-of-things: perspectives of network, security, and economics

Description

Emerging from years of research and development, the Internet-of-Things (IoT) has finally paved its way into our daily lives. From smart home to Industry 4.0, IoT has been fundamentally transforming numerous domains with its unique superpower of interconnecting world-wide devices. However, the capability of IoT is largely constrained by the limited resources it can employ in various application scenarios, including computing power, network resource, dedicated hardware, etc. The situation is further exacerbated by the stringent quality-of-service (QoS) requirements of many IoT applications, such as delay, bandwidth, security, reliability, and more. This mismatch in resources and demands has greatly hindered the deployment and utilization of IoT services in many resource-intense and QoS-sensitive scenarios like autonomous driving and virtual reality.

I believe that the resource issue in IoT will persist in the near future due to technological, economic and environmental factors. In this dissertation, I seek to address this issue by means of smart resource allocation. I propose mathematical models to formally describe various resource constraints and application scenarios in IoT. Based on these, I design smart resource allocation algorithms and protocols to maximize the system performance in face of resource restrictions. Different aspects are tackled, including networking, security, and economics of the entire IoT ecosystem. For different problems, different algorithmic solutions are devised, including optimal algorithms, provable approximation algorithms, and distributed protocols. The solutions are validated with rigorous theoretical analysis and/or extensive simulation experiments.

Date Created

2019

Agent

Author (aut): Yu, Ruozhou, Ph.D
Thesis advisor (ths): Xue, Guoliang
Committee member: Huang, Dijiang
Committee member: Sen, Arunabha
Committee member: Zhang, Yanchao
Publisher (pbl): Arizona State University

A Model for Calculating Damage Potential in Computer Systems

Description

For systems having computers as a significant component, it becomes a critical task to identify the potential threats that the users of the system can present, while being both inside and outside the system. One of the most important factors that differentiate an insider from an outsider is the fact that the insider being a part of the system, owns privileges that enable him/her access to the resources and processes of the system through valid capabilities. An insider with malicious intent can potentially be more damaging compared to outsiders. The above differences help to understand the notion and scope of an insider.

The significant loss to organizations due to the failure to detect and mitigate the insider threat has resulted in an increased interest in insider threat detection. The well-studied effective techniques proposed for defending against attacks by outsiders have not been proven successful against insider attacks. Although a number of security policies and models to deal with the insider threat have been developed, the approach taken by most organizations is the use of audit logs after the attack has taken place. Such approaches are inspired by academic research proposals to address the problem by tracking activities of the insider in the system. Although tracking and logging are important, it is argued that they are not sufficient. Thus, the necessity to predict the potential damage of an insider is considered to help build a stronger evaluation and mitigation strategy for the insider attack. In this thesis, the question that seeks to be answered is the following: `Considering the relationships that exist between the insiders and their role, their access to the resources and the resource set, what is the potential damage that an insider can cause?'

A general system model is introduced that can capture general insider attacks including those documented by Computer Emergency Response Team (CERT) for the Software Engineering Institute (SEI). Further, initial formulations of the damage potential for leakage and availability in the model is introduced. The model usefulness is shown by expressing 14 of actual attacks in the model and show how for each case the attack could have been mitigated.

Date Created

2019

Agent

Author (aut): Nolastname, Sharad
Thesis advisor (ths): Bazzi, Rida
Committee member: Sen, Arunabha
Committee member: Doupe, Adam
Publisher (pbl): Arizona State University

Multiobjective Optimization Based Approach for Truth Discovery

Description

There are many applications where the truth is unknown. The truth values are

guessed by different sources. The values of different properties can be obtained from

various sources. These will lead to the disagreement in sources. An important task

is to obtain the truth from these sometimes contradictory sources. In the extension

of computing the truth, the reliability of sources needs to be computed. There are

models which compute the precision values. In those earlier models Banerjee et al.

(2005) Dong and Naumann (2009) Kasneci et al. (2011) Li et al. (2012) Marian and

Wu (2011) Zhao and Han (2012) Zhao et al. (2012), multiple properties are modeled

individually. In one of the existing works, the heterogeneous properties are modeled in

a joined way. In that work, the framework i.e. Conflict Resolution on Heterogeneous

Data (CRH) framework is based on the single objective optimization. Due to the

single objective optimization and non-convex optimization problem, only one local

optimal solution is found. As this is a non-convex optimization problem, the optimal

point depends upon the initial point. This single objective optimization problem is

converted into a multi-objective optimization problem. Due to the multi-objective

optimization problem, the Pareto optimal points are computed. In an extension of

that, the single objective optimization problem is solved with numerous initial points.

The above two approaches are used for finding the solution better than the solution

obtained in the CRH with median as the initial point for the continuous variables and

majority voting as the initial point for the categorical variables. In the experiments,

the solution, coming from the CRH, lies in the Pareto optimal points of the multiobjective

optimization and the solution coming from the CRH is the optimum solution

in these experiments.

Date Created

2019

Agent

Author (aut): Jain, Karan
Thesis advisor (ths): Xue, Guoliang
Committee member: Sen, Arunabha
Committee member: Sarwat, Mohamed
Publisher (pbl): Arizona State University

Design, Analysis and Computation in Wireless and Optical Networks

Description

In the realm of network science, many topics can be abstracted as graph problems, such as routing, connectivity enhancement, resource/frequency allocation and so on. Though most of them are NP-hard to solve, heuristics as well as approximation algorithms are proposed to achieve reasonably good results. Accordingly, this dissertation studies graph related problems encountered in real applications. Two problems studied in this dissertation are derived from wireless network, two more problems studied are under scenarios of FIWI and optical network, one more problem is in Radio- Frequency Identification (RFID) domain and the last problem is inspired by satellite deployment.

The objective of most of relay nodes placement problems, is to place the fewest number of relay nodes in the deployment area so that the network, formed by the sensors and the relay nodes, is connected. Under the fixed budget scenario, the expense involved in procuring the minimum number of relay nodes to make the network connected, may exceed the budget. In this dissertation, we study a family of problems whose goal is to design a network with “maximal connectedness” or “minimal disconnectedness”, subject to a fixed budget constraint. Apart from “connectivity”, we also study relay node problem in which degree constraint is considered. The balance of reducing the degree of the network while maximizing communication forms the basis of our d-degree minimum arrangement(d-MA) problem. In this dissertation, we look at several approaches to solving the generalized d-MA problem where we embed a graph onto a subgraph of a given degree.

In recent years, considerable research has been conducted on optical and FIWI networks. Utilizing a recently proposed concept “candidate trees” in optical network, this dissertation studies counting problem on complete graphs. Closed form expressions are given for certain cases and a polynomial counting algorithm for general cases is also presented. Routing plays a major role in FiWi networks. Accordingly to a novel path length metric which emphasizes on “heaviest edge”, this dissertation proposes a polynomial algorithm on single path computation. NP-completeness proof as well as approximation algorithm are presented for multi-path routing.

Radio-frequency identification (RFID) technology is extensively used at present for identification and tracking of a multitude of objects. In many configurations, simultaneous activation of two readers may cause a “reader collision” when tags are present in the intersection of the sensing ranges of both readers. This dissertation ad- dresses slotted time access for Readers and tries to provide a collision-free scheduling scheme while minimizing total reading time.

Finally, this dissertation studies a monitoring problem on the surface of the earth for significant environmental, social/political and extreme events using satellites as sensors. It is assumed that the impact of a significant event spills into neighboring regions and there will be corresponding indicators. Careful deployment of sensors, utilizing “Identifying Codes”, can ensure that even though the number of deployed sensors is fewer than the number of regions, it may be possible to uniquely identify the region where the event has taken place.

Date Created

2019

Agent

Author (aut): Zhou, Chenyang
Thesis advisor (ths): Richa, Andrea
Thesis advisor (ths): Sen, Arunabha
Committee member: Xue, Guoliang
Committee member: Walkowiak, Krzysztof
Publisher (pbl): Arizona State University

Connectivity in Complex Networks: Measures, Inference and Optimization

Description

Networks naturally appear in many high-impact applications. The simplest model of networks is single-layered networks, where the nodes are from the same domain and the links are of the same type. However, as the world is highly coupled, nodes from different application domains tend to be interdependent on each other, forming a more complex network model called multi-layered networks.

Among the various aspects of network studies, network connectivity plays an important role in a myriad of applications. The diversified application areas have spurred numerous connectivity measures, each designed for some specific tasks. Although effective in their own fields, none of the connectivity measures is generally applicable to all the tasks. Moreover, existing connectivity measures are predominantly based on single-layered networks, with few attempts made on multi-layered networks.

Most connectivity analyzing methods assume that the input network is static and accurate, which is not realistic in many applications. As real-world networks are evolving, their connectivity scores would vary by time as well, making it imperative to keep track of those changing parameters in a timely manner. Furthermore, as the observed links in the input network may be inaccurate due to noise and incomplete data sources, it is crucial to infer a more accurate network structure to better approximate its connectivity scores.

The ultimate goal of connectivity studies is to optimize the connectivity scores via manipulating the network structures. For most complex measures, the hardness of the optimization problem still remains unknown. Meanwhile, current optimization methods are mainly ad-hoc solutions for specific types of connectivity measures on single-layered networks. No optimization framework has ever been proposed to tackle a wider range of connectivity measures on complex networks.

In this thesis, an in-depth study of connectivity measures, inference, and optimization problems will be proposed. Specifically, a unified connectivity measure model will be introduced to unveil the commonality among existing connectivity measures. For the connectivity inference aspect, an effective network inference method and connectivity tracking framework will be described. Last, a generalized optimization framework will be built to address the connectivity minimization/maximization problems on both single-layered and multi-layered networks.

Date Created

2019

Agent

Author (aut): Chen, Chen
Thesis advisor (ths): Tong, Hanghang
Committee member: Davulcu, Hasan
Committee member: Sen, Arunabha
Committee member: Subrahmanian, V.S.
Committee member: Ying, Lei
Publisher (pbl): Arizona State University

Network Representation Learning in Social Media

Description

The popularity of social media has generated abundant large-scale social networks, which advances research on network analytics. Good representations of nodes in a network can facilitate many network mining tasks. The goal of network representation learning (network embedding) is to learn low-dimensional vector representations of social network nodes that capture certain properties of the networks. With the learned node representations, machine learning and data mining algorithms can be applied for network mining tasks such as link prediction and node classification. Because of its ability to learn good node representations, network representation learning is attracting increasing attention and various network embedding algorithms are proposed.

Despite the success of these network embedding methods, the majority of them are dedicated to static plain networks, i.e., networks with fixed nodes and links only; while in social media, networks can present in various formats, such as attributed networks, signed networks, dynamic networks and heterogeneous networks. These social networks contain abundant rich information to alleviate the network sparsity problem and can help learn a better network representation; while plain network embedding approaches cannot tackle such networks. For example, signed social networks can have both positive and negative links. Recent study on signed networks shows that negative links have added value in addition to positive links for many tasks such as link prediction and node classification. However, the existence of negative links challenges the principles used for plain network embedding. Thus, it is important to study signed network embedding. Furthermore, social networks can be dynamic, where new nodes and links can be introduced anytime. Dynamic networks can reveal the concept drift of a user and require efficiently updating the representation when new links or users are introduced. However, static network embedding algorithms cannot deal with dynamic networks. Therefore, it is important and challenging to propose novel algorithms for tackling different types of social networks.

In this dissertation, we investigate network representation learning in social media. In particular, we study representative social networks, which includes attributed network, signed networks, dynamic networks and document networks. We propose novel frameworks to tackle the challenges of these networks and learn representations that not only capture the network structure but also the unique properties of these social networks.

Date Created

2018

Agent

Author (aut): Wang, Suhang
Thesis advisor (ths): Liu, Huan
Committee member: Aggarwal, Charu
Committee member: Sen, Arunabha
Committee member: Tong, Hanghang
Publisher (pbl): Arizona State University

An Investigation of Flow-based Algorithms for Sybil Defense

Description

Distributed systems are prone to attacks, called Sybil attacks, wherein an adversary may generate an unbounded number of bogus identities to gain control over the system. In this thesis, an algorithm, DownhillFlow, for mitigating such attacks is presented and

tested experimentally. The trust rankings produced by the algorithm are significantly better than those of the distributed SybilGuard protocol and only slightly worse than those of the best-known Sybil defense algorithm, ACL. The results obtained for ACL are

consistent with those obtained in previous studies. The running times of the algorithms are also tested and two results are obtained: first, DownhillFlow’s running time is found to be significantly faster than any existing algorithm including ACL, terminating in

slightly over one second on the 300,000-node DBLP graph. This allows it to be used in settings such as dynamic networks as-is with no additional functionality needed. Second, when ACL is configured such that it matches DownhillFlow’s speed, it fails to recognize

large portions of the input graphs and its accuracy among the portion of the graphs it does recognize becomes lower than that of DownhillFlow.

Date Created

2018

Agent

Author (aut): Bradley, Michael
Thesis advisor (ths): Bazzi, Rida
Committee member: Richa, Andrea
Committee member: Sen, Arunabha
Publisher (pbl): Arizona State University

Perspective Scaling and Trait Detection on Social Media Data

Description

This research start utilizing an efficient sparse inverse covariance matrix (precision matrix) estimation technique to identify a set of highly correlated discriminative perspectives between radical and counter-radical groups. A ranking system has been developed that utilizes ranked perspectives to map Islamic organizations on a set of socio-cultural, political and behavioral scales based on their web site corpus. Simultaneously, a gold standard ranking of these organizations was created through domain experts and compute expert-to-expert agreements and present experimental results comparing the performance of the QUIC based scaling system to another baseline method for organizations. The QUIC based algorithm not only outperforms the baseline methods, but it is also the only system that consistently performs at area expert-level accuracies for all scales. Also, a multi-scale ideological model has been developed and it investigates the correlates of Islamic extremism in Indonesia, Nigeria and UK. This analysis demonstrate that violence does not correlate strongly with broad Muslim theological or sectarian orientations; it shows that religious diversity intolerance is the only consistent and statistically significant ideological correlate of Islamic extremism in these countries, alongside desire for political change in UK and Indonesia, and social change in Nigeria. Next, dynamic issues and communities tracking system based on NMF(Non-negative Matrix Factorization) co-clustering algorithm has been built to better understand the dynamics of virtual communities. The system used between Iran and Saudi Arabia to build and apply a multi-party agent-based model that can demonstrate the role of wedges and spoilers in a complex environment where coalitions are dynamic. Lastly, a visual intelligence platform for tracking the diffusion of online social movements has been developed called LookingGlass to track the geographical footprint, shifting positions and flows of individuals, topics and perspectives between groups. The algorithm utilize large amounts of text collected from a wide variety of organizations’ media outlets to discover their hotly debated topics, and their discriminative perspectives voiced by opposing camps organized into multiple scales. Discriminating perspectives is utilized to classify and map individual Tweeter’s message content to social movements based on the perspectives expressed in their tweets.

Date Created

2018

Agent

Author (aut): Kim, Nyunsu
Thesis advisor (ths): Davulcu, Hasan
Committee member: Sen, Arunabha
Committee member: Hsiao, Sharon
Committee member: Corman, Steven
Publisher (pbl): Arizona State University

An Algorithm for Merging Identities

Description

In online social networks the identities of users are concealed, often by design. This anonymity makes it possible for a single person to have multiple accounts and to engage in malicious activity such as defrauding a service providers, leveraging social influence, or hiding activities that would otherwise be detected. There are various methods for detecting whether two online users in a network are the same people in reality and the simplest way to utilize this information is to simply merge their identities and treat the two users as a single user. However, this then raises the issue of how we deal with these composite identities. To solve this problem, we introduce a mathematical abstraction for representing users and their identities as partitions on a set. We then define a similarity function, SIM, between two partitions, a set of properties that SIM must have, and a threshold that SIM must exceed for two users to be considered the same person. The main theoretical result of our work is a proof that for any given partition and similarity threshold, there is only a single unique way to merge the identities of similar users such that no two identities are similar. We also present two algorithms, COLLAPSE and SIM_MERGE, that merge the identities of users to find this unique set of identities. We prove that both algorithms execute in polynomial time and we also perform an experiment on dark web social network data from over 6000 users that demonstrates the runtime of SIM_MERGE.

Date Created

2018-05

Agent

Author (aut): Polican, Andrew Dominic
Thesis director: Shakarian, Paulo
Committee member: Sen, Arunabha
Contributor (ctb): Computer Science and Engineering Program
Contributor (ctb): Barrett, The Honors College

Vulnerability and Protection Analysis of Critical Infrastructure Systems

Description

The power and communication networks are highly interdependent and form a part of the critical infrastructure of a country. Similarly, dependencies exist within the networks itself. Owing to cascading failures, interdependent and intradependent networks are extremely susceptible to widespread vulnerabilities. In recent times the research community has shown significant interest in modeling to capture these dependencies. However, many of them are simplistic in nature which limits their applicability to real world systems. This dissertation presents a Boolean logic based model termed as Implicative Interdependency Model (IIM) to capture the complex dependencies and cascading failures resulting from an initial failure of one or more entities of either network.

Utilizing the IIM, four pertinent problems encompassing vulnerability and protection of critical infrastructures are formulated and solved. For protection analysis, the Entity Hardening Problem, Targeted Entity Hardening Problem and Auxiliary Entity Allocation Problem are formulated. Qualitatively, under a resource budget, the problems maximize the number of entities protected from failure from an initial failure of a set of entities. Additionally, the model is also used to come up with a metric to analyze the Robustness of critical infrastructure systems. The computational complexity of all these problems is NP-complete. Accordingly, Integer Linear Program solutions (to obtain the optimal solution) and polynomial time sub-optimal Heuristic solutions are proposed for these problems. To analyze the efficacy of the Heuristic solution, comparative studies are performed on real-world and test system data.

Date Created

2017

Agent

Author (aut): Banerjee, Joydeep
Thesis advisor (ths): Sen, Arunabha
Committee member: Dasgupta, Partha
Committee member: Xue, Guoliang
Committee member: Raravi, Gurulingesh
Publisher (pbl): Arizona State University

Subscribe to Sen, Arunabha