Protecting identity and location privacy in online environment

153986-Thumbnail Image.png
Description
The recent years have witnessed a rapid development of mobile devices and smart devices. As more and more people are getting involved in the online environment, privacy issues are becoming increasingly important. People’s privacy in the digital world is much

The recent years have witnessed a rapid development of mobile devices and smart devices. As more and more people are getting involved in the online environment, privacy issues are becoming increasingly important. People’s privacy in the digital world is much easier to leak than in the real world, because every action people take online would leave a trail of information which could be recorded, collected and used by malicious attackers. Besides, service providers might collect users’ information and analyze them, which also leads to a privacy breach. Therefore, preserving people’s privacy is very important in the online environment.

In this dissertation, I study the problems of preserving people’s identity privacy and loca- tion privacy in the online environment. Specifically, I study four topics: identity privacy in online social networks (OSNs), identity privacy in anonymous message submission, lo- cation privacy in location based social networks (LBSNs), and location privacy in location based reminders. In the first topic, I propose a system which can hide users’ identity and data from untrusted storage site where the OSN provider puts users’ data. I also design a fine grained access control mechanism which prevents unauthorized users from accessing the data. Based on the secret sharing scheme, I construct a shuffle protocol that disconnects the relationship between members’ identities and their submitted messages in the topic of identity privacy in anonymous message submission. The message is encrypted on the mem- ber side and decrypted on the message collector side. The collector eventually gets all of the messages but does not know who submitted which message. In the third topic, I pro- pose a framework that hides users’ check-in information from the LBSN. Considering the limited computation resources on smart devices, I propose a delegatable pseudo random function to outsource computations to the much more powerful server while preserving privacy. I also implement efficient revocations. In the topic of location privacy in location based reminders, I propose a system to hide users’ reminder locations from an untrusted cloud server. I propose a cross based approach and an improved bar based approach, re- spectively, to represent a reminder area. The reminder location and reminder message are encrypted before uploading to the cloud server, which then can determine whether the dis- tance between the user’s current location and the reminder location is within the reminder distance without knowing anything about the user’s location information and the content of the reminder message.
Date Created
2015
Agent

SDN-based proactive defense mechanism in a cloud system

153909-Thumbnail Image.png
Description
Cloud computing is known as a new and powerful computing paradigm. This new generation of network computing model delivers both software and hardware as on-demand resources and various services over the Internet. However, the security concerns prevent users from adopting

Cloud computing is known as a new and powerful computing paradigm. This new generation of network computing model delivers both software and hardware as on-demand resources and various services over the Internet. However, the security concerns prevent users from adopting the cloud-based solutions to fulfill the IT requirement for many business critical computing. Due to the resource-sharing and multi-tenant nature of cloud-based solutions, cloud security is especially the most concern in the Infrastructure as a Service (IaaS). It has been attracting a lot of research and development effort in the past few years.

Virtualization is the main technology of cloud computing to enable multi-tenancy.

Computing power, storage, and network are all virtualizable to be shared in an IaaS system. This important technology makes abstract infrastructure and resources available to users as isolated virtual machines (VMs) and virtual networks (VNs). However, it also increases vulnerabilities and possible attack surfaces in the system, since all users in a cloud share these resources with others or even the attackers. The promising protection mechanism is required to ensure strong isolation, mediated sharing, and secure communications between VMs. Technologies for detecting anomalous traffic and protecting normal traffic in VNs are also needed. Therefore, how to secure and protect the private traffic in VNs and how to prevent the malicious traffic from shared resources are major security research challenges in a cloud system.

This dissertation proposes four novel frameworks to address challenges mentioned above. The first work is a new multi-phase distributed vulnerability, measurement, and countermeasure selection mechanism based on the attack graph analytical model. The second work is a hybrid intrusion detection and prevention system to protect VN and VM using virtual machines introspection (VMI) and software defined networking (SDN) technologies. The third work further improves the previous works by introducing a VM profiler and VM Security Index (VSI) to keep track the security status of each VM and suggest the optimal countermeasure to mitigate potential threats. The final work is a SDN-based proactive defense mechanism for a cloud system using a reconfiguration model and moving target defense approaches to actively and dynamically change the virtual network configuration of a cloud system.
Date Created
2015
Agent

Making thin data thick: user behavior analysis with minimum information

153872-Thumbnail Image.png
Description
With the rise of social media, user-generated content has become available at an unprecedented scale. On Twitter, 1 billion tweets are posted every 5 days and on Facebook, 20 million links are shared every 20 minutes. These massive collections of

With the rise of social media, user-generated content has become available at an unprecedented scale. On Twitter, 1 billion tweets are posted every 5 days and on Facebook, 20 million links are shared every 20 minutes. These massive collections of user-generated content have introduced the human behavior's big-data.

This big data has brought about countless opportunities for analyzing human behavior at scale. However, is this data enough? Unfortunately, the data available at the individual-level is limited for most users. This limited individual-level data is often referred to as thin data. Hence, researchers face a big-data paradox, where this big-data is a large collection of mostly limited individual-level information. Researchers are often constrained to derive meaningful insights regarding online user behavior with this limited information. Simply put, they have to make thin data thick.

In this dissertation, how human behavior's thin data can be made thick is investigated. The chief objective of this dissertation is to demonstrate how traces of human behavior can be efficiently gleaned from the, often limited, individual-level information; hence, introducing an all-inclusive user behavior analysis methodology that considers social media users with different levels of information availability. To that end, the absolute minimum information in terms of both link or content data that is available for any social media user is determined. Utilizing only minimum information in different applications on social media such as prediction or recommendation tasks allows for solutions that are (1) generalizable to all social media users and that are (2) easy to implement. However, are applications that employ only minimum information as effective or comparable to applications that use more information?

In this dissertation, it is shown that common research challenges such as detecting malicious users or friend recommendation (i.e., link prediction) can be effectively performed using only minimum information. More importantly, it is demonstrated that unique user identification can be achieved using minimum information. Theoretical boundaries of unique user identification are obtained by introducing social signatures. Social signatures allow for user identification in any large-scale network on social media. The results on single-site user identification are generalized to multiple sites and it is shown how the same user can be uniquely identified across multiple sites using only minimum link or content information.

The findings in this dissertation allows finding the same user across multiple sites, which in turn has multiple implications. In particular, by identifying the same users across sites, (1) patterns that users exhibit across sites are identified, (2) how user behavior varies across sites is determined, and (3) activities that are observed only across sites are identified and studied.
Date Created
2015
Agent

Comparing a commercial and an SDN-based load balancer in a campus network

153754-Thumbnail Image.png
Description
Commercial load balancers are often in use, and the production network at Arizona State University (ASU) is no exception. However, because the load balancer uses IP addresses, the solution does not apply to all applications. One such application is Rsyslog.

Commercial load balancers are often in use, and the production network at Arizona State University (ASU) is no exception. However, because the load balancer uses IP addresses, the solution does not apply to all applications. One such application is Rsyslog. This software processes syslog packets and stores them in files. The loss rate of incoming log packets is high due to the incoming rate of the data. The Rsyslog servers are overwhelmed by the continuous data stream. To solve this problem a software defined networking (SDN) based load balancer is designed to perform a transport-level load balancing over the incoming load to Rsyslog servers. In this solution the load is forwarded to one Rsyslog server at a time, according to one of a Round-Robin, Random, or Load-Based policy. This gives time to other servers to process the data they have received and prevent them from being overwhelmed. The evaluation of the proposed solution is conducted a physical testbed with the same data feed as the commercial solution. The results suggest that the SDN-based load balancer is competitive with the commercial load balancer. Replacing the software OpenFlow switch with a hardware switch is likely to further improve the results.
Date Created
2015
Agent

Economical apects of resource allocation under discounts

153342-Thumbnail Image.png
Description
Resource allocation is one of the most challenging issues policy decision makers must address. The objective of this thesis is to explore the resource allocation from an economical perspective, i.e., how to purchase resources in order to satisfy customers' requests.

Resource allocation is one of the most challenging issues policy decision makers must address. The objective of this thesis is to explore the resource allocation from an economical perspective, i.e., how to purchase resources in order to satisfy customers' requests. In this thesis, we attend to answer the question: when and how to buy resources to fulfill customers' demands with minimum costs?

The first topic studied in this thesis is resource allocation in cloud networks. Cloud computing heralded an era where resources (such as computation and storage) can be scaled up and down elastically and on demand. This flexibility is attractive for its cost effectiveness: the cloud resource price depends on the actual utilization over time. This thesis studies two critical problems in cloud networks, focusing on the economical aspects of the resource allocation in the cloud/virtual networks, and proposes six algorithms to address the resource allocation problems for different discount models. The first problem attends a scenario where the virtual network provider offers different contracts to the service provider. Four algorithms for resource contract migration are proposed under two pricing models: Pay-as-You-Come and Pay-as-You-Go. The second problem explores a scenario where a cloud provider offers k contracts each with a duration and a rate respectively and a customer buys these contracts in order to satisfy its resource demand. This work shows that this problem can be seen as a 2-dimensional generalization of the classic online parking permit problem, and present a k-competitive online algorithm and an optimal online algorithm.

The second topic studied in this thesis is to explore how resource allocation and purchasing strategies work in our daily life. For example, is it worth buying a Yoga pass which costs USD 100 for ten entries, although it will expire at the end of this year? Decisions like these are part of our daily life, yet, not much is known today about good online strategies to buy discount vouchers with expiration dates. This work hence introduces a Discount Voucher Purchase Problem (DVPP). It aims to optimize the strategies for buying discount vouchers, i.e., coupons, vouchers, groupons which are valid only during a certain time period. The DVPP comes in three flavors: (1) Once Expire Lose Everything (OELE): Vouchers lose their entire value after expiration. (2) Once Expire Lose Discount (OELD): Vouchers lose their discount value after expiration. (3) Limited Purchasing Window (LPW): Vouchers have the property of OELE and can only be bought during a certain time window.

This work explores online algorithms with a provable competitive ratio against a clairvoyant offline algorithm, even in the worst case. In particular, this work makes the following contributions: we present a 4-competitive algorithm for OELE, an 8-competitive algorithm for OELD, and a lower bound for LPW. We also present an optimal offline algorithm for OELE and LPW, and show it is a 2-approximation solution for OELD.
Date Created
2015
Agent

Computing distrust in social media

153339-Thumbnail Image.png
Description
A myriad of social media services are emerging in recent years that allow people to communicate and express themselves conveniently and easily. The pervasive use of social media generates massive data at an unprecedented rate. It becomes increasingly difficult for

A myriad of social media services are emerging in recent years that allow people to communicate and express themselves conveniently and easily. The pervasive use of social media generates massive data at an unprecedented rate. It becomes increasingly difficult for online users to find relevant information or, in other words, exacerbates the information overload problem. Meanwhile, users in social media can be both passive content consumers and active content producers, causing the quality of user-generated content can vary dramatically from excellence to abuse or spam, which results in a problem of information credibility. Trust, providing evidence about with whom users can trust to share information and from whom users can accept information without additional verification, plays a crucial role in helping online users collect relevant and reliable information. It has been proven to be an effective way to mitigate information overload and credibility problems and has attracted increasing attention.

As the conceptual counterpart of trust, distrust could be as important as trust and its value has been widely recognized by social sciences in the physical world. However, little attention is paid on distrust in social media. Social media differs from the physical world - (1) its data is passively observed, large-scale, incomplete, noisy and embedded with rich heterogeneous sources; and (2) distrust is generally unavailable in social media. These unique properties of social media present novel challenges for computing distrust in social media: (1) passively observed social media data does not provide necessary information social scientists use to understand distrust, how can I understand distrust in social media? (2) distrust is usually invisible in social media, how can I make invisible distrust visible by leveraging unique properties of social media data? and (3) little is known about distrust and its role in social media applications, how can distrust help make difference in social media applications?

The chief objective of this dissertation is to figure out solutions to these challenges via innovative research and novel methods. In particular, computational tasks are designed to {\it understand distrust}, a innovative task, i.e., {\it predicting distrust} is proposed with novel frameworks to make invisible distrust visible, and principled approaches are develop to {\it apply distrust} in social media applications. Since distrust is a special type of negative links, I demonstrate the generalization of properties and algorithms of distrust to negative links, i.e., {\it generalizing findings of distrust}, which greatly expands the boundaries of research of distrust and largely broadens its applications in social media.
Date Created
2015
Agent

Personalized POI recommendation on location-based social networks

153140-Thumbnail Image.png
Description
The rapid urban expansion has greatly extended the physical boundary of our living area, along with a large number of POIs (points of interest) being developed. A POI is a specific location (e.g., hotel, restaurant, theater, mall) that a user

The rapid urban expansion has greatly extended the physical boundary of our living area, along with a large number of POIs (points of interest) being developed. A POI is a specific location (e.g., hotel, restaurant, theater, mall) that a user may find useful or interesting. When exploring the city and neighborhood, the increasing number of POIs could enrich people's daily life, providing them with more choices of life experience than before, while at the same time also brings the problem of "curse of choices", resulting in the difficulty for a user to make a satisfied decision on "where to go" in an efficient way. Personalized POI recommendation is a task proposed on purpose of helping users filter out uninteresting POIs and reduce time in decision making, which could also benefit virtual marketing.

Developing POI recommender systems requires observation of human mobility w.r.t. real-world POIs, which is infeasible with traditional mobile data. However, the recent development of location-based social networks (LBSNs) provides such observation. Typical location-based social networking sites allow users to "check in" at POIs with smartphones, leave tips and share that experience with their online friends. The increasing number of LBSN users has generated large amounts of LBSN data, providing an unprecedented opportunity to study human mobility for personalized POI recommendation in spatial, temporal, social, and content aspects.

Different from recommender systems in other categories, e.g., movie recommendation in NetFlix, friend recommendation in dating websites, item recommendation in online shopping sites, personalized POI recommendation on LBSNs has its unique challenges due to the stochastic property of human mobility and the mobile behavior indications provided by LBSN information layout. The strong correlations between geographical POI information and other LBSN information result in three major human mobile properties, i.e., geo-social correlations, geo-temporal patterns, and geo-content indications, which are neither observed in other recommender systems, nor exploited in current POI recommendation. In this dissertation, we investigate these properties on LBSNs, and propose personalized POI recommendation models accordingly. The performance evaluated on real-world LBSN datasets validates the power of these properties in capturing user mobility, and demonstrates the ability of our models for personalized POI recommendation.
Date Created
2014
Agent

Privacy preserving controls for Android applications

153094-Thumbnail Image.png
Description
Android is currently the most widely used mobile operating system. The permission model in Android governs the resource access privileges of applications. The permission model however is amenable to various attacks, including re-delegation attacks, background snooping attacks and disclosure of

Android is currently the most widely used mobile operating system. The permission model in Android governs the resource access privileges of applications. The permission model however is amenable to various attacks, including re-delegation attacks, background snooping attacks and disclosure of private information. This thesis is aimed at understanding, analyzing and performing forensics on application behavior. This research sheds light on several security aspects, including the use of inter-process communications (IPC) to perform permission re-delegation attacks.

Android permission system is more of app-driven rather than user controlled, which means it is the applications that specify their permission requirement and the only thing which the user can do is choose not to install a particular application based on the requirements. Given the all or nothing choice, users succumb to pressures and needs to accept permissions requested. This thesis proposes a couple of ways for providing the users finer grained control of application privileges. The same methods can be used to evade the Permission Re-delegation attack.

This thesis also proposes and implements a novel methodology in Android that can be used to control the access privileges of an Android application, taking into consideration the context of the running application. This application-context based permission usage is further used to analyze a set of sample applications. We found the evidence of applications spoofing or divulging user sensitive information such as location information, contact information, phone id and numbers, in the background. Such activities can be used to track users for a variety of privacy-intrusive purposes. We have developed implementations that minimize several forms of privacy leaks that are routinely done by stock applications.
Date Created
2014
Agent

Establishing the software-defined networking based defensive system in clouds

153029-Thumbnail Image.png
Description
Cloud computing is regarded as one of the most revolutionary technologies in the past decades. It provides scalable, flexible and secure resource provisioning services, which is also the reason why users prefer to migrate their locally processing workloads onto

Cloud computing is regarded as one of the most revolutionary technologies in the past decades. It provides scalable, flexible and secure resource provisioning services, which is also the reason why users prefer to migrate their locally processing workloads onto remote clouds. Besides commercial cloud system (i.e., Amazon EC2), ProtoGENI and PlanetLab have further improved the current Internet-based resource provisioning system by allowing end users to construct a virtual networking environment. By archiving the similar goal but with more flexible and efficient performance, I present the design and implementation of MobiCloud that is a geo-distributed mobile cloud computing platform, and G-PLaNE that focuses on how to construct the virtual networking environment upon the self-designed resource provisioning system consisting of multiple geo-distributed clusters. Furthermore, I conduct a comprehensive study to layout existing Mobile Cloud Computing (MCC) service models and corresponding representative related work. A new user-centric mobile cloud computing service model is proposed to advance the existing mobile cloud computing research.

After building the MobiCloud, G-PLaNE and studying the MCC model, I have been using Software Defined Networking (SDN) approaches to enhance the system security in the cloud virtual networking environment. I present an OpenFlow based IPS solution called SDNIPS that includes a new IPS architecture based on Open vSwitch (OVS) in the cloud software-based networking environment. It is enabled with elasticity service provisioning and Network Reconfiguration (NR) features based on POX controller. Finally, SDNIPS demonstrates the feasibility and shows more efficiency than traditional approaches through a thorough evaluation.

At last, I propose an OpenFlow-based defensive module composition framework called CloudArmour that is able to perform query, aggregation, analysis, and control function over distributed OpenFlow-enabled devices. I propose several modules and use the DDoS attack as an example to illustrate how to composite the comprehensive defensive solution based on CloudArmour framework. I introduce total 20 Python-based CloudArmour APIs. Finally, evaluation results prove the feasibility and efficiency of CloudArmour framework.
Date Created
2014
Agent

An SDN-based IPS development framework in cloud networking environment

152956-Thumbnail Image.png
Description
Security has been one of the top concerns in cloud community while cloud resource abuse and malicious insiders are considered as top threats. Traditionally, Intrusion Detection Systems (IDS) and Intrusion Prevention Systems (IPS) have been widely deployed to manipulate cloud

Security has been one of the top concerns in cloud community while cloud resource abuse and malicious insiders are considered as top threats. Traditionally, Intrusion Detection Systems (IDS) and Intrusion Prevention Systems (IPS) have been widely deployed to manipulate cloud security, with the latter one providing additional prevention capability. However, as one of the most creative networking technologies, Software-Defined Networking (SDN) is rarely used to implement IDPS in the cloud computing environment because the lack of comprehensive development framework and processing flow. Simply migration from traditional IDS/IPS systems to SDN environment are not effective enough for detecting and defending malicious attacks. Hence, in this thesis, we present an IPS development framework to help user easily design and implement their defensive systems in cloud system by SDN technology. This framework enables SDN approaches to enhance the system security and performance. A Traffic Information Platform (TIP) is proposed as the cornerstone with several upper layer security modules such as Detection, Analysis and Prevention components. Benefiting from the flexible, compatible and programmable features of SDN, Customized Detection Engine, Network Topology Finder, Source Tracer and further user-developed security appliances are plugged in our framework to construct a SDN-based defensive system. Two main categories Python-based APIs are designed to support developers for further development. This system is designed and implemented based on the POX controller and Open vSwitch in the cloud computing environment. The efficiency of this framework is demonstrated by a sample IPS implementation and the performance of our framework is also evaluated.
Date Created
2014
Agent