Perspective Scaling and Trait Detection on Social Media Data

Description
This research starts by utilizing an efficient sparse inverse covariance matrix (precision matrix) estimation technique to identify a set of highly correlated discriminative perspectives between radical and counter-radical groups. A ranking system has been developed that utilizes ranked perspectives to map Islamic organizations on a set of socio-cultural, political and behavioral scales based on their web site corpus. Simultaneously, a gold standard ranking of these organizations was created by domain experts; expert-to-expert agreements are computed, and experimental results are presented comparing the performance of the QUIC based scaling system to a baseline method for organizations. The QUIC based algorithm not only outperforms the baseline methods, but it is also the only system that consistently performs at area expert-level accuracies for all scales. A multi-scale ideological model has also been developed to investigate the correlates of Islamic extremism in Indonesia, Nigeria and the UK. This analysis demonstrates that violence does not correlate strongly with broad Muslim theological or sectarian orientations; it shows that religious diversity intolerance is the only consistent and statistically significant ideological correlate of Islamic extremism in these countries, alongside desire for political change in the UK and Indonesia, and social change in Nigeria. Next, a dynamic issue and community tracking system based on an NMF (Non-negative Matrix Factorization) co-clustering algorithm has been built to better understand the dynamics of virtual communities. The system was used for the case of Iran and Saudi Arabia to build and apply a multi-party agent-based model that can demonstrate the role of wedges and spoilers in a complex environment where coalitions are dynamic.
Lastly, a visual intelligence platform called LookingGlass has been developed to track the diffusion of online social movements, including the geographical footprint, shifting positions and flows of individuals, topics and perspectives between groups. The algorithm utilizes large amounts of text collected from a wide variety of organizations’ media outlets to discover their hotly debated topics, and the discriminative perspectives voiced by opposing camps organized into multiple scales. Discriminating perspectives are utilized to classify and map individual Twitter users’ message content to social movements based on the perspectives expressed in their tweets.
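As a rough illustration of why a sparse precision matrix is useful for this kind of analysis, the Python sketch below inverts a small, made-up covariance matrix. Every pair of variables is correlated, yet the precision matrix has a zero entry for the pair that is only indirectly linked. This is not the QUIC solver and applies no sparsity penalty; the matrix values and the helper name `invert` are assumptions chosen for illustration.

```python
from fractions import Fraction


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination using exact fractions.

    Assumes the matrix is invertible (illustrative helper, not from the thesis).
    """
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [
        [Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        # Find a row with a non-zero pivot and move it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


# Hypothetical covariance of three features forming a chain 0 - 1 - 2.
covariance = [
    [Fraction(3, 4), Fraction(1, 2), Fraction(1, 4)],
    [Fraction(1, 2), Fraction(1, 1), Fraction(1, 2)],
    [Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)],
]
precision = invert(covariance)
# The covariance has no zeros, but precision[0][2] is 0:
# features 0 and 2 are related only through feature 1.
```

A sparse estimator such as QUIC looks for a precision matrix with many such zeros directly from data, which is what makes the remaining non-zero links interpretable.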
Date Created
2018

Detecting Frames and Causal Relationships in Climate Change Related Text Databases Based on Semantic Features

Description
The subliminal impact of framing of social, political and environmental issues such as climate change has been studied for decades in political science and communications research. Media framing offers an “interpretative package” for average citizens on how to make sense of climate change and its consequences to their livelihoods, how to deal with its negative impacts, and which mitigation or adaptation policies to support. A line of related work has used bag-of-words and word-level features to detect frames automatically in text. Such works face limitations since standard keyword-based features may not generalize well to accommodate surface variations in text when different keywords are used for similar concepts.

This thesis develops a unique type of textual feature that generalizes triplets extracted from text by clustering them into high-level concepts. These concepts are utilized as features to detect frames in text. Compared to unigram and bigram based models, classification and clustering using generalized concepts yield better discriminating features, a higher classification accuracy with a 12% relative boost (i.e., from 74% to 83% F-measure), and 0.91 clustering purity for Frame/Non-Frame detection.
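The two figures quoted above can be made concrete with a short Python sketch. It shows how clustering purity is conventionally computed and checks that moving from 74% to 83% F-measure is roughly a 12% relative gain (9 percentage points absolute). The function names and the toy labels in the usage note are assumptions for illustration, not the thesis's code or data.

```python
from collections import Counter


def purity(cluster_ids, labels):
    """Fraction of items that carry the majority label of their own cluster."""
    per_cluster = {}
    for cluster, label in zip(cluster_ids, labels):
        per_cluster.setdefault(cluster, Counter())[label] += 1
    majority_total = sum(max(counts.values()) for counts in per_cluster.values())
    return majority_total / len(labels)


def relative_gain(before, after):
    """Relative improvement of `after` over `before` (0.12 means 12%)."""
    return (after - before) / before


# Reported F-measures: 0.74 for the n-gram baseline, 0.83 with generalized concepts.
gain = relative_gain(0.74, 0.83)  # about 0.12, i.e. a 12% relative boost
```

For example, `purity([0, 0, 0, 1, 1, 1], ["frame", "frame", "non", "non", "non", "frame"])` counts two majority items in each cluster and returns 4/6.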

The automatic discovery of complex causal chains among interlinked events and their participating actors has not yet been thoroughly studied. Previous studies related to extracting causal relationships from text were based on laborious and incomplete hand-developed lists of explicit causal verbs, such as “causes” and “results in.” Such approaches result in limited recall because standard causal verbs may not generalize well to accommodate surface variations in texts when different keywords and phrases are used to express similar causal effects. Therefore, I present a system that utilizes generalized concepts to extract causal relationships. The proposed algorithms overcome surface variations in written expressions of causal relationships and discover the domino effects between climate events and human security. This semi-supervised approach alleviates the need for labor-intensive keyword list development and annotated datasets. Experimental evaluations by domain experts achieve an average precision of 82%. Qualitative assessments of causal chains show that results are consistent with the 2014 IPCC report illuminating causal mechanisms underlying the linkages between climatic stresses and social instability.
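To illustrate the limitation described above, here is a minimal Python sketch of the keyword-list baseline that the thesis argues against, not the proposed concept-based system. It matches only the two explicit causal verbs named in the abstract; the pattern, the function name, and the example sentences are hypothetical.

```python
import re

# Hand-developed list of explicit causal verbs, as in the prior work described above.
CAUSAL_PATTERN = re.compile(
    r"^(?P<cause>.+?)\s+(?:causes|results in)\s+(?P<effect>.+?)\.?$",
    re.IGNORECASE,
)


def extract_causal_pairs(sentences):
    """Return (cause, effect) pairs for sentences that use a listed causal verb."""
    pairs = []
    for sentence in sentences:
        match = CAUSAL_PATTERN.match(sentence.strip())
        if match:
            pairs.append((match.group("cause"), match.group("effect")))
    return pairs


# Made-up sentences; the third expresses causation with an unlisted phrase.
examples = [
    "Drought causes crop failure.",
    "Crop failure results in food shortages.",
    "Drought leads to migration.",
]
found = extract_causal_pairs(examples)
recall = len(found) / len(examples)  # the "leads to" sentence is missed
```

The third sentence is causal but is not extracted, which is the recall problem that motivates generalizing over surface variations instead of extending the verb list by hand.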
Date Created
2018

Improving Usability and Adoption of Tablet-based Electronic Health Record (EHR) Applications

Description
The technological revolution has caused the entire world to migrate to a digital environment, and health care is no exception. Electronic Health Records (EHR) or Electronic Medical Records (EMR) are the digital repository for health data of patients. Nationwide efforts have been made by the federal government to promote the usage of EHRs, as they have been found to improve the quality of health service. Although EHR systems have been implemented almost everywhere, active use of EHR applications has not replaced paper documentation. Rather, they are often used to store data transcribed from paper documentation after each clinical procedure. This process is prone to errors such as data omission and incomplete documentation, and is also time-consuming. This research aims to improve adoption of real-time EHR usage during documentation by improving the usability of an iPad-based EHR application that is used during the resuscitation process in the intensive care unit. Using cognitive theories and HCI frameworks, this research identified areas of improvement and customizations in the application that were required to match the workflow of the resuscitation team at the Mayo Clinic. In addition, a Handwriting Recognition Engine (HRE) was integrated into the application to support stylus-based information input into the EHR, which resembles the target users’ traditional pen-and-paper documentation process. The EHR application was updated and then evaluated with end users at the Mayo Clinic. The users found the application to be efficient and usable, and they preferred it over paper-based documentation.
Date Created
2018

STUDY GENIE: An Analysis of a Web-based Note-Sharing and Cheat Sheet Tool

Description
Research has shown that the cheat sheet preparation process helps students with performance in exams. However, results have been inconclusive in determining the most effective guiding principles for creating and using cheat sheets. The traditional method of collecting and annotating cheat sheets is time-consuming and exhausting, and fails to capture students' preparation process. This thesis examines the development and usage of a new web-based cheat sheet creation tool, Study Genie, and its effects on student performance in an introductory computer science and programming course. Results suggest that actions associated with editing and organizing cheat sheets are positively correlated with exam performance, and that there is a significant difference between the activity of high-performing and low-performing students. Through these results, Study Genie presents an opportunity for mass data collection and for insight into the assembly process, rather than just the finished product, in cheat sheet creation.
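The abstract does not say which correlation statistic was used. As one common choice, the Python sketch below computes a Pearson correlation coefficient between per-student edit-action counts and exam scores. The function name and the numbers are invented for illustration and are not data from the thesis.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Hypothetical data: edit/organize actions logged per student, and exam scores.
edit_actions = [2, 5, 9, 14, 20]
exam_scores = [55, 62, 70, 81, 90]
r = pearson(edit_actions, exam_scores)  # positive: more editing, higher scores
```

A value of `r` near 1 would indicate a strong positive linear relationship; a significance test would still be needed before drawing conclusions from real logs.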
Date Created
2016-12

Web-Based Classroom Tool for Beginner Java Classes

Description
Learning to program is no easy task, and many students first encounter programming during their university education. Unfortunately, programming classes have a large number of students enrolled, so it is nearly impossible for professors to engage with students at an individual level and provide the personal attention each student needs. This project aims to provide professors with a tool to quickly respond to the students' current understanding. This web-based application gives professors the control to quickly ask Java programming questions, and the ability to see aggregate data on how many of the students have successfully completed the assigned questions. With this system, students are provided with extra programming practice in a controlled environment, and if there is an error in their program, the system provides feedback describing what the error means and what steps the student can take to fix it.
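The aggregate view described above could be computed along the following lines. This Python sketch is only an assumption about the data shape (one record per submission with a student id, question id, and pass flag); it is not taken from the project, and the name `completion_summary` is hypothetical.

```python
def completion_summary(submissions, class_size):
    """Map each question to the fraction of the class that has solved it.

    `submissions` is an iterable of (student_id, question_id, passed) tuples.
    A student counts once per question, however many attempts they made.
    """
    solved_by = {}
    for student, question, passed in submissions:
        students = solved_by.setdefault(question, set())
        if passed:
            students.add(student)
    return {q: len(students) / class_size for q, students in solved_by.items()}


# Hypothetical submission log for a class of four students.
log = [
    ("ana", "q1", False),
    ("ana", "q1", True),
    ("ben", "q1", True),
    ("cy", "q1", False),
    ("ana", "q2", False),
]
summary = completion_summary(log, class_size=4)
```

Counting distinct students rather than raw submissions keeps repeated attempts from inflating the completion rate shown to the professor.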
Date Created
2017-05

A timeline extraction approach to derive drug usage patterns in pregnant women using social media

Description
Proliferation of social media websites and discussion forums in the last decade has resulted in social media mining emerging as an effective mechanism to extract consumer patterns. Most research on social media and pharmacovigilance has concentrated on Adverse Drug Reaction (ADR) identification. Such methods employ a drug search step followed by classification of the associated text as containing an ADR or not. Although this method works efficiently for ADR classification, when evidence of an ADR is spread across a user's posts over time, searching for drug mentions fails to capture it. It also fails to record additional user information which may provide an opportunity to perform an in-depth analysis of lifestyle habits and possible reasons for any medical problems.

Pre-market clinical trials for drugs generally do not include pregnant women, and so the drugs' effects on pregnancy outcomes are not discovered early. This thesis presents a thorough, alternative strategy for assessing the safety profiles of drugs during pregnancy by utilizing user timelines from social media. I explore the use of a variety of state-of-the-art social media mining techniques, including rule-based and machine learning techniques, to identify pregnant women, monitor their drug usage patterns, categorize their birth outcomes, and attempt to discover associations between drugs and adverse birth outcomes.

The technique models user timelines as longitudinal patient networks, which provide a variety of key information about pregnancy, drug usage, and post-birth reactions. I evaluate the distinct parts of the pipeline separately, validating the usefulness of each step. The approach of using user timelines in this fashion has produced very encouraging results, and can be employed for a range of other important tasks where users/patients are required to be followed over time to derive population-based measures.
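As a very simplified picture of timeline-based extraction, the Python sketch below orders one user's posts by date and keeps drug mentions that fall inside an assumed pregnancy window. The window dates, the drug lexicon, the post texts, and the function name are all hypothetical; the thesis's actual pipeline, which combines rule-based and machine learning steps, is not reproduced here.

```python
from datetime import date


def drug_mentions_in_window(posts, drug_names, start, end):
    """Return (date, drug) pairs for mentions posted between start and end.

    `posts` is a list of (posted_on, text) tuples for a single user;
    `drug_names` is a list of lowercase drug names (simple substring match).
    """
    hits = []
    # Sort chronologically so the result reads as a timeline.
    for posted_on, text in sorted(posts, key=lambda post: post[0]):
        if start <= posted_on <= end:
            lowered = text.lower()
            for drug in drug_names:
                if drug in lowered:
                    hits.append((posted_on, drug))
    return hits


# Hypothetical timeline with a fictional drug name.
timeline = [
    (date(2015, 6, 1), "Started taking Examplol today"),
    (date(2015, 1, 5), "I took examplol last winter"),
    (date(2015, 3, 2), "Feeling tired, no meds"),
]
mentions = drug_mentions_in_window(
    timeline, ["examplol"], start=date(2015, 2, 1), end=date(2015, 11, 1)
)
```

Keeping the date alongside each mention is what allows drug usage to be related to later birth-outcome posts from the same user.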
Date Created
2016