Detecting Prominent Features and Classifying Network Traffic for Securing Internet of Things Based on Ensemble Methods

157413-Thumbnail Image.png
Description
Rapid growth of internet and connected devices ranging from cloud systems to internet of things have raised critical concerns for securing these systems. In the recent past, security attacks on different kinds of devices have evolved in terms of complexity

Rapid growth of internet and connected devices ranging from cloud systems to internet of things have raised critical concerns for securing these systems. In the recent past, security attacks on different kinds of devices have evolved in terms of complexity and diversity. One of the challenges is establishing secure communication in the network among various devices and systems. Despite being protected with authentication and encryption, the network still needs to be protected against cyber-attacks. For this, the network traffic has to be closely monitored and should detect anomalies and intrusions. Intrusion detection can be categorized as a network traffic classification problem in machine learning. Existing network traffic classification methods require a lot of training and data preprocessing, and this problem is more serious if the dataset size is huge. In addition, the machine learning and deep learning methods that have been used so far were trained on datasets that contain obsolete attacks. In this thesis, these problems are addressed by using ensemble methods applied on an up to date network attacks dataset. Ensemble methods use multiple learning algorithms to get better classification accuracy that could be obtained when the corresponding learning algorithm is applied alone. This dataset for network traffic classification has recent attack scenarios and contains over fifteen attacks. This approach shows that ensemble methods can be used to classify network traffic and detect intrusions with less training times of the model, and lesser pre-processing without feature selection. In addition, this thesis also shows that only with less than ten percent of the total features of input dataset will lead to similar accuracy that is achieved on whole dataset. This can heavily reduce the training times and classification duration in real-time scenarios.
Date Created
2019
Agent

Design, Analysis and Computation in Wireless and Optical Networks

157245-Thumbnail Image.png
Description
In the realm of network science, many topics can be abstracted as graph problems, such as routing, connectivity enhancement, resource/frequency allocation and so on. Though most of them are NP-hard to solve, heuristics as well as approximation algorithms are proposed

In the realm of network science, many topics can be abstracted as graph problems, such as routing, connectivity enhancement, resource/frequency allocation and so on. Though most of them are NP-hard to solve, heuristics as well as approximation algorithms are proposed to achieve reasonably good results. Accordingly, this dissertation studies graph related problems encountered in real applications. Two problems studied in this dissertation are derived from wireless network, two more problems studied are under scenarios of FIWI and optical network, one more problem is in Radio- Frequency Identification (RFID) domain and the last problem is inspired by satellite deployment.

The objective of most of relay nodes placement problems, is to place the fewest number of relay nodes in the deployment area so that the network, formed by the sensors and the relay nodes, is connected. Under the fixed budget scenario, the expense involved in procuring the minimum number of relay nodes to make the network connected, may exceed the budget. In this dissertation, we study a family of problems whose goal is to design a network with “maximal connectedness” or “minimal disconnectedness”, subject to a fixed budget constraint. Apart from “connectivity”, we also study relay node problem in which degree constraint is considered. The balance of reducing the degree of the network while maximizing communication forms the basis of our d-degree minimum arrangement(d-MA) problem. In this dissertation, we look at several approaches to solving the generalized d-MA problem where we embed a graph onto a subgraph of a given degree.

In recent years, considerable research has been conducted on optical and FIWI networks. Utilizing a recently proposed concept “candidate trees” in optical network, this dissertation studies counting problem on complete graphs. Closed form expressions are given for certain cases and a polynomial counting algorithm for general cases is also presented. Routing plays a major role in FiWi networks. Accordingly to a novel path length metric which emphasizes on “heaviest edge”, this dissertation proposes a polynomial algorithm on single path computation. NP-completeness proof as well as approximation algorithm are presented for multi-path routing.

Radio-frequency identification (RFID) technology is extensively used at present for identification and tracking of a multitude of objects. In many configurations, simultaneous activation of two readers may cause a “reader collision” when tags are present in the intersection of the sensing ranges of both readers. This dissertation ad- dresses slotted time access for Readers and tries to provide a collision-free scheduling scheme while minimizing total reading time.

Finally, this dissertation studies a monitoring problem on the surface of the earth for significant environmental, social/political and extreme events using satellites as sensors. It is assumed that the impact of a significant event spills into neighboring regions and there will be corresponding indicators. Careful deployment of sensors, utilizing “Identifying Codes”, can ensure that even though the number of deployed sensors is fewer than the number of regions, it may be possible to uniquely identify the region where the event has taken place.
Date Created
2019
Agent

Enumeration Methods and Series Analysis of Self-Avoiding Polygons on the Hexagonal Lattice, with Applications to Self-organizing Particle Systems

132360-Thumbnail Image.png
Description
We consider programmable matter as a collection of simple computational elements (or particles) that self-organize to solve system-wide problems of movement, configuration, and coordination. Here, we focus on the compression problem, in which the particle system gathers as tightly together

We consider programmable matter as a collection of simple computational elements (or particles) that self-organize to solve system-wide problems of movement, configuration, and coordination. Here, we focus on the compression problem, in which the particle system gathers as tightly together as possible, as in a sphere or its equivalent in the presence of some underlying geometry. Within this model a configuration of particles can be represented as a unique closed self-avoiding walk on the triangular lattice. In this paper we will examine the bias parameter of a Markov chain based algorithm that solves the compression problem under the geometric amoebot model, for particle systems that begin in a connected configuration with no holes. This bias parameter, $\lambda$, determines the behavior of the algorithm. It has been shown that for $\lambda > 2+\sqrt{2}$, with all but exponentially small probability, the algorithm achieves compression. Additionally the same algorithm can be used for expansion for small values of $\lambda$; in particular, for all $0 < \lambda < \sqrt{\tau}$, where $\lim_{n\to\infty} {(p_n)^{1
}}=\tau$. This research will focus on improving approximations on the lower bound of $\tau$. Toward this end we will examine algorithmic enumeration, and series analysis for self-avoiding polygons.
Date Created
2019-05
Agent

An Investigation of Flow-based Algorithms for Sybil Defense

156582-Thumbnail Image.png
Description
Distributed systems are prone to attacks, called Sybil attacks, wherein an adversary may generate an unbounded number of bogus identities to gain control over the system. In this thesis, an algorithm, DownhillFlow, for mitigating such attacks is presented and

tested experimentally.

Distributed systems are prone to attacks, called Sybil attacks, wherein an adversary may generate an unbounded number of bogus identities to gain control over the system. In this thesis, an algorithm, DownhillFlow, for mitigating such attacks is presented and

tested experimentally. The trust rankings produced by the algorithm are significantly better than those of the distributed SybilGuard protocol and only slightly worse than those of the best-known Sybil defense algorithm, ACL. The results obtained for ACL are

consistent with those obtained in previous studies. The running times of the algorithms are also tested and two results are obtained: first, DownhillFlow’s running time is found to be significantly faster than any existing algorithm including ACL, terminating in

slightly over one second on the 300,000-node DBLP graph. This allows it to be used in settings such as dynamic networks as-is with no additional functionality needed. Second, when ACL is configured such that it matches DownhillFlow’s speed, it fails to recognize

large portions of the input graphs and its accuracy among the portion of the graphs it does recognize becomes lower than that of DownhillFlow.
Date Created
2018
Agent

A Study of Some Edge-Deletion Algorithms for Reducing Disease Spread on Networks

133381-Thumbnail Image.png
Description
This thesis discusses three recent optimization problems that seek to reduce disease spread on arbitrary graphs by deleting edges, and it discusses three approximation algorithms developed for these problems. Important definitions are presented including the Linear Threshold and Triggering Set

This thesis discusses three recent optimization problems that seek to reduce disease spread on arbitrary graphs by deleting edges, and it discusses three approximation algorithms developed for these problems. Important definitions are presented including the Linear Threshold and Triggering Set models and the set function properties of submodularity and monotonicity. Also, important results regarding the Linear Threshold model and computation of the influence function are presented along with proof sketches. The three main problems are formally presented, and NP-hardness results along with proof sketches are presented where applicable. The first problem seeks to reduce spread of infection over the Linear Threshold process by making use of an efficient tree data structure. The second problem seeks to reduce the spread of infection over the Linear Threshold process while preserving the PageRank distribution of the input graph. The third problem seeks to minimize the spectral radius of the input graph. The algorithms designed for these problems are described in writing and with pseudocode, and their approximation bounds are stated along with time complexities. Discussion of these algorithms considers how these algorithms could see real-world use. Challenges and the ways in which these algorithms do or do not overcome them are noted. Two related works, one which presents an edge-deletion disease spread reduction problem over a deterministic threshold process and the other which considers a graph modification problem aimed at minimizing worst-case disease spread, are compared with the three main works to provide interesting perspectives. Furthermore, a new problem is proposed that could avoid some issues faced by the three main problems described, and directions for future work are suggested.
Date Created
2018-05
Agent

Approaches to Minimum d-Degree Arrangement

135251-Thumbnail Image.png
Description
Many systems in the world \u2014 such as cellular networks, the post service, or transportation pathways \u2014 can be modeled as networks or graphs. The practical applications of graph algorithms generally seek to achieve some goal while minimizing some cost

Many systems in the world \u2014 such as cellular networks, the post service, or transportation pathways \u2014 can be modeled as networks or graphs. The practical applications of graph algorithms generally seek to achieve some goal while minimizing some cost such as money or distance. While the minimum linear arrangement (MLA) problem has been widely-studied amongst graph ordering and embedding problems, there have been no developments into versions of the problem involving degree higher than 2. An application of our problem can be seen in overlay networks in telecommunications. An overlay network is a virtual network that is built on top of another network. It is a logical network where the links between nodes represent the physical paths connecting the nodes in the underlying infrastructure. The underlying physical network may be incomplete, but as long as it is connected, we can build a complete overlay network on top of it. Since some nodes may be overloaded by traffic, we can reduce the strain on the overlay network by limiting the communication between nodes. Some edges, however, may have more importance than others so we must be careful about our selection of which nodes are allowed to communicate with each other. The balance of reducing the degree of the network while maximizing communication forms the basis of our d-degree minimum arrangement problem. In this thesis we will look at several approaches to solving the generalized d-degree minimum arrangement d-MA problem where we embed a graph onto a subgraph of a given degree. We first look into the requirements and challenges of solving the d-MA problem. We will then present a polynomial-time heuristic and compare its performance with the optimal solution derived from integer linear programming. We will show that a simple (d-1)-ary tree construction provides the optimal structure for uniform graphs with large requests sets. Finally, we will present experimental data gathered from running simulations on a variety of graphs to evaluate the efficiency of our heuristic and tree construction.
Date Created
2016-05
Agent

Collaborative Computation in Self-Organizing Particle Systems

134914-Thumbnail Image.png
Description
Many forms of programmable matter have been proposed for various tasks. We use an abstract model of self-organizing particle systems for programmable matter which could be used for a variety of applications, including smart paint and coating materials for engineering

Many forms of programmable matter have been proposed for various tasks. We use an abstract model of self-organizing particle systems for programmable matter which could be used for a variety of applications, including smart paint and coating materials for engineering or programmable cells for medical uses. Previous research using this model has focused on shape formation and other spatial configuration problems, including line formation, compression, and coating. In this work we study foundational computational tasks that exceed the capabilities of the individual constant memory particles described by the model. These tasks represent new ways to use these self-organizing systems, which, in conjunction with previous shape and configuration work, make the systems useful for a wider variety of tasks. We present an implementation of a counter using a line of particles, which makes it possible for the line of particles to count to and store values much larger than their individual capacities. We then present an algorithm that takes a matrix and a vector as input and then sets up and uses a rectangular block of particles to compute the matrix-vector multiplication. This setup also utilizes the counter implementation to store the resulting vector from the matrix-vector multiplication. Operations such as counting and matrix multiplication can leverage the distributed and dynamic nature of the self-organizing system to be more efficient and adaptable than on traditional linear computing hardware. Such computational tools also give the systems more power to make complex decisions when adapting to new situations or to analyze the data they collect, reducing reliance on a central controller for setup and output processing. Finally, we demonstrate an application of similar types of computations with self-organizing systems to image processing, with an implementation of an image edge detection algorithm.
Date Created
2016-12
Agent

Mazes of Waverly Place: Interactive Algorithmic Art Generator

134701-Thumbnail Image.png
Description
This paper is a supplement to our interactive algorithmic art generator project which can be found at weiverlyroe.github.io/waverlyplace. For this thesis, we demonstrate how with certain input we can algorithmically generate art, specifically a playable random maze with exactly one

This paper is a supplement to our interactive algorithmic art generator project which can be found at weiverlyroe.github.io/waverlyplace. For this thesis, we demonstrate how with certain input we can algorithmically generate art, specifically a playable random maze with exactly one solution. We explore interactive and algorithmic art and how our mazes are a form of both. Through examining several maze generation algorithms, we show that an ideal representation of a single-solution maze, called a perfect maze, is a spanning tree of a planar graph. The final algorithm is a re-imagining of Kruskal's Minimum Spanning Tree Algorithm with these adjustments: (1) potential edges are ordered randomly rather than sorted and (2) certain edges are forced in the maze in order for the wall structure to display the player's text input. Lastly, we discuss improvements which could be made and features which we plan to add to the project in the future.
Date Created
2016-12
Agent

Design Patterns for Distributed Systems

134339-Thumbnail Image.png
Description
Implementing a distributed algorithm is more complicated than implementing a non-distributed algorithm. This is because distributed systems involve coordination of different processes each of which has a partial view of the global system state. The only way to share information

Implementing a distributed algorithm is more complicated than implementing a non-distributed algorithm. This is because distributed systems involve coordination of different processes each of which has a partial view of the global system state. The only way to share information in a distributed system is by message passing. Task that are straightforward in a non-distributed system, like deciding on the value of a global system state, can be quite complicated to achieve in a distributed system [1]. On top of the difficulties caused by the distributed nature of the computations, distributed systems typically need to be able to operate normally even if some of the nodes in the system are faulty which further adds to the uncertainty that processes have about the global state. Many factors make the implementation of a distributed algorithms difficult. Design patterns [2] are useful in simplifying the development of general algorithms. A design pattern describes a high level solution to a common, abstract problem that many systems may face. Common structural, creational, and behavioral problems are identified and elegantly solved by design patterns. By identifying features that an algorithm uses, and framing each feature as one of the common problems that a specific design pattern solves, designing a robust implementation of an algorithm becomes more manageable. In this way, design patterns can aid the implementation of algorithms. Unfortunately, design patterns are typically not discussed when developing distributed algorithms. Because correctly developing a distributed algorithm is difficult, many papers (eg. [1], [3], [4]) focus on verifying the correctness of the developed algorithm. Papers that are more practical ([5], [6]) establish the correctness of their algorithm and that their algorithm is efficient enough to be practical. However, papers on distributed algorithms usually make little mention of design patterns. The goal of this work was to gain experience implementing distributed systems including learning the application of design patterns and the application of related technical topics. This was achieved by implementing a currently unpublished algorithm that is tentatively called Bakery Consensus. Bakery Consensus is a replicated state-machine protocol that can tolerate servers with Byzantine faults, but assumes non-faulty clients. The algorithm also establishes non-skipping timestamps for each operation completed by the replicated state-machine. The design of the structure, communication, and creation of the different system parts depended heavily upon the book Design Patterns [2]. After implementing the system, the success of the in implementing its various parts was based upon their ability to satisfy the SOLID [7] principles as well as their ability to establish low coupling and high cohesion [8]. The rest of this paper is organized as follows. We begin by providing background information about distributed algorithms, including replicated state-machine protocols and the Bakery Consensus algorithm. Section 3 gives a background on several design patterns and software engineering principles that were used in the development process. Section 4 discusses the well designed parts of the system that used design patterns, and how these design patterns were chosen. Section 5 discusses well designed system parts that relied upon other technical topics. Section 6 discusses system parts that need redesign. The conclusion summarizes what was accomplished by the implementation process and the lessons learned about design patterns for distributed algorithms.
Date Created
2017-05
Agent

Extending ns-3 for Three-Dimensional Wireless Networks

137465-Thumbnail Image.png
Description
As technologies advance, so does the curiosity and exploration of humankind. There are many domains across this planet that are unexplored \u2014 the depths of Earth's ocean being one of the most predominant. While the ocean covers seventy percent of

As technologies advance, so does the curiosity and exploration of humankind. There are many domains across this planet that are unexplored \u2014 the depths of Earth's ocean being one of the most predominant. While the ocean covers seventy percent of Earth's surface, a vast ninety-five percent of this realm remains untouched and unseen by the human eye. The biggest causality of this can be identified in the limitations of current technologies and the large expense associated with delving into these dangerous and uncharted areas. Underwater communication between unmanned devices is the solution to this problem. With the oceanic deployment of wirelessly connected unmanned underwater vehicles (UUVs), researchers can limit risk to human safely and retrieve invaluable oceanographic data from unimaginable depths. However, before this system can be physically deployed, the network topology and environmental interactions must be simulated. More specific to the application, how does attenuation of optical propagation degrade between transmissions? A widely used open source network simulator is the ns series: ns-1, ns-2, and ns-3. Ns-3 is the most recent version, and is a valuable tool for modeling network interactions. However, underwater simulation proposes a limitation \u2014 a three-dimensional consideration for pressure. To properly model this interaction, it is vital that an extension to ns-3 be provided in order to account for the affects pressure has on the propagation of a signal at varying depths.
Date Created
2013-05
Agent