Reports and Papers Archive
Modern information systems are distributed and highly dynamic. They comprise a number of hosts from heterogeneous domains that collaborate, interact, and share data to handle client requests. Examples include cloud-hosted solutions, service-oriented architectures, electronic healthcare systems, and product lifecycle management systems. A client request translates into multiple internal interactions involving different parties, and each party can access and further share the client’s data. However, such interactions may share data with unauthorized parties and violate the client’s disclosure policies. The client has no knowledge of or control over interactions beyond its trust domain and therefore no means of detecting violations. Opaque data sharing in such distributed systems introduces security challenges not present in traditional systems. Existing solutions provide point-to-point secure data transmission and ensure security within a single domain, but they are insufficient for distributed data dissemination because multiple cross-domain parties are involved. This dissertation addresses the problem of policy-based distributed data dissemination (PD3) and proposes a data-centric solution for end-to-end secure data disclosure in distributed interactions. The solution ensures that the data are distributed along with the policies that dictate data access and with an execution monitor (a policy evaluation and enforcement mechanism) that controls data disclosure and protects the data throughout the interaction lifecycle. It empowers data owners with control of disclosure decisions outside their trust domains and reduces the risk of unauthorized access. This dissertation makes the following contributions. First, it presents a formal description of the PD3 problem and identifies the main requirements for a new solution.
Second, it introduces EPICS, an extensible framework for enforcing policies in composite web services, and describes its design, implementation, and evaluation. Third, it demonstrates a novel application of the proposed solution to address privacy and identity management in cloud computing.
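The data-centric idea above, data that travels with its own disclosure policy plus an execution monitor that evaluates the policy before any party can read it, can be caricatured in a few lines. This is only an illustrative sketch: the class names, the toy set-of-domains policy, and the example record are invented here, and EPICS itself is a full web-service framework with far richer policy evaluation.

```python
# Toy sketch of policy-carrying data with an execution monitor.
# All names and the policy shape are invented for illustration.

from dataclasses import dataclass, field

@dataclass
class ProtectedData:
    payload: str
    policy: set = field(default_factory=set)   # domains the owner authorizes

def monitor_disclose(obj: ProtectedData, requester_domain: str) -> str:
    """Execution monitor: enforce the owner's policy outside their trust domain."""
    if requester_domain not in obj.policy:
        raise PermissionError(f"{requester_domain} not authorized by data owner")
    return obj.payload

# The owner authorizes two partner domains; any other requester is refused,
# even though the data object has left the owner's trust domain.
record = ProtectedData("example health record", policy={"clinic.example", "lab.example"})
print(monitor_disclose(record, "lab.example"))   # -> example health record
# monitor_disclose(record, "ads.example") would raise PermissionError
```

The point of the sketch is the inversion of control: the disclosure decision is attached to the data rather than to any single host's perimeter.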
DDoS attacks are an increasingly serious problem. According to Arbor Networks, the largest DDoS attack reported by a respondent in 2015 reached 500 Gbps. The Hacker News reported that the largest DDoS attack as of March 2016 exceeded 600 Gbps and targeted the entire BBC website.
Given this rising frequency and threat, and an average DDoS attack duration of about 16 hours, DDoS attacks will not be going away anytime soon. Commercial mitigation offerings are not keeping pace, as even major corporations face the same challenges. Current security appliances cannot handle the overwhelming traffic that accompanies modern DDoS attacks, and there is limited research on solutions to mitigate them. There is therefore a need for a means of mitigating DDoS attacks in order to minimize downtime. One possible solution is for organizations to implement their own architectures designed to mitigate DDoS attacks.
In this dissertation, we present and implement an architecture that uses an activity monitor to change the states of firewalls based on their performance in a hybrid network. Both firewalls are connected inline, and the monitor observes their states through a mirrored connection. When one of the firewalls becomes overwhelmed by an HTTP DDoS flooding attack, the monitor reroutes traffic to the other. The monitor connects to the API of both firewalls, and communication between the firewalls and the monitor is encrypted with AES, using the PyCrypto Python implementation.
This dissertation is structured in three parts. The first part found the weaknesses of the hardware firewall and determined its threshold through spike and endurance tests, achieved by flooding the hardware firewall with HTTP packets until it became overwhelmed and unresponsive. The second part applied the same test to the virtual firewall, using the same parameters, test factors, and determinants but a different load tester. The final part was the design and implementation of the firewall performance monitor.
The main goal of the dissertation is to minimize downtime when network firewalls are overwhelmed as a result of a DDoS attack.
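The monitor's core decision, rerouting traffic when the active firewall is overwhelmed, can be sketched as a small health-check policy. This is an assumption-laden illustration: the threshold value, state fields, and function names below are invented here, not taken from the dissertation's implementation, which polls real firewall APIs over an AES-encrypted channel.

```python
# Sketch of the activity monitor's failover decision (illustrative only;
# the threshold, state names, and polling interface are assumptions).

from dataclasses import dataclass

OVERWHELMED_CPU = 90.0   # percent; assumed cutoff derived from load testing

@dataclass
class FirewallState:
    name: str
    cpu: float           # latest CPU utilization sampled via the firewall API
    responsive: bool     # did the last API poll succeed?

def choose_active(primary: FirewallState, backup: FirewallState) -> str:
    """Return the name of the firewall that should carry traffic."""
    def healthy(fw: FirewallState) -> bool:
        return fw.responsive and fw.cpu < OVERWHELMED_CPU
    if healthy(primary):
        return primary.name
    if healthy(backup):
        return backup.name           # reroute: primary is overwhelmed
    return primary.name              # both degraded; keep the current path

# Example: hardware firewall saturated by an HTTP flood, virtual one healthy.
hw = FirewallState("hardware-fw", cpu=98.5, responsive=False)
vw = FirewallState("virtual-fw", cpu=35.0, responsive=True)
print(choose_active(hw, vw))  # -> virtual-fw
```

In the real architecture this decision would run continuously against live API samples; the sketch only captures the threshold-and-failover logic.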
Hypergames are a branch of game theory used to model and analyze conflicts between multiple players who may have misconceptions of the other players’ actions or preferences. They have been used to model military conflicts such as the Allied invasion of Normandy in 1944 [1], the fall of France in WWII [2], and the Cuban missile crisis [3]. Unlike traditional game theory models, hypergames give us the ability to model misperceptions that result from the use of deception, mimicry, and misinformation. In the security world, there is little work that shows how to use deception in a principled manner as a strategic defensive mechanism in computing systems. In this paper, we present how hypergames model deception in computer security conflicts and discuss how they can be used to model the interaction between adversaries and system defenders. We examine a specific example of modeling a system in which an insider adversary wishes to steal confidential data from an enterprise while a security administrator protects the system, and we show the advantages of incorporating deception as a defense mechanism.
[1] M. A. Takahashi, N. M. Fraser, and K. W. Hipel. A Procedure for Analyzing Hypergames. European Journal of Operational Research, 18:111–122, 1984.
[2] P. G. Bennett and M. R. Dando. Complex Strategic Analysis: A Hypergame Study of the Fall of France. Journal of the Operational Research Society, 30(1):23–32, 1979.
[3] N. Fraser and K. Hipel. Conflict Analysis: Models and Resolutions. North-Holland Series in System Science and Engineering. North-Holland, 1984.
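The insider-threat example above hinges on each player reasoning within their own perceived game. A minimal sketch of that idea follows; the payoff numbers and the decoy scenario are invented for illustration, and real hypergame analysis handles richer preference structures and multiple perception levels.

```python
# Two-player, two-strategy hypergame sketch (payoffs invented).
# The insider chooses from the game they *perceive*; the defender's
# deception (a planted decoy) makes the true game differ.

INSIDER_MOVES = ("steal", "wait")
DEFENDER_MOVES = ("decoy", "real")

# Insider's perceived payoffs: unaware of the decoy, they believe every
# reachable file is real and worth stealing.
perceived = {
    ("steal", "decoy"): 10, ("steal", "real"): 10,
    ("wait",  "decoy"):  0, ("wait",  "real"):  0,
}

# True payoffs to the insider: touching the decoy triggers detection.
true_game = {
    ("steal", "decoy"): -20, ("steal", "real"): 10,
    ("wait",  "decoy"):   0, ("wait",  "real"):  0,
}

def maximin(payoffs, moves, opp_moves):
    """Pick the move with the best worst-case payoff in a (perceived) game."""
    return max(moves, key=lambda m: min(payoffs[(m, o)] for o in opp_moves))

insider_choice = maximin(perceived, INSIDER_MOVES, DEFENDER_MOVES)
print(insider_choice)                        # -> steal
print(true_game[(insider_choice, "decoy")])  # -> -20
```

The gap between the two matrices is exactly what a hypergame captures: in the true game a rational insider would wait, but the misperception induced by the defender's deception leads them into the trap.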
This dissertation investigates whether malicious phishing emails are detected better when a meaningful representation of the email bodies is available. The natural language processing theory of Ontological Semantics Technology is used for its ability to model the knowledge representation present in the email messages. Known-good and phishing emails were analyzed and their meaning representations fed into machine learning binary classifiers. Unigram language models of the same emails were used as a baseline for comparing the performance of the meaningful data. The results show that, for at least some of the selected machine learning algorithms, a binary classifier trained on the meaningful data detects phishing emails better than one trained on a unigram language model.
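For readers unfamiliar with the baseline, a unigram language model reduces each email to word counts and a classifier learns from those alone. The sketch below illustrates that baseline with a tiny hand-rolled multinomial naive Bayes over an invented toy corpus; the dissertation's actual corpora, features, and learning algorithms differ.

```python
# Illustrative unigram bag-of-words baseline: multinomial naive Bayes
# with add-one smoothing. Toy corpus and labels are invented.

import math
from collections import Counter

def train(docs_by_label):
    """docs_by_label: {label: [token lists]} -> trained model."""
    vocab = {w for docs in docs_by_label.values() for d in docs for w in d}
    model = {}
    for label, docs in docs_by_label.items():
        counts = Counter(w for d in docs for w in d)
        total = sum(counts.values())
        log_prior = math.log(len(docs))  # unnormalized; same denominator cancels
        log_probs = {w: math.log((counts[w] + 1) / (total + len(vocab)))
                     for w in vocab}
        model[label] = (log_prior, log_probs)
    return model

def classify(model, tokens):
    def score(label):
        log_prior, log_probs = model[label]
        # Out-of-vocabulary words contribute equally to every label.
        return log_prior + sum(log_probs.get(w, 0.0) for w in tokens)
    return max(model, key=score)

# Toy data: phishing mail urges account verification; ham does not.
model = train({
    "phish": [["verify", "your", "account", "now"],
              ["account", "suspended", "verify", "password"]],
    "ham":   [["meeting", "notes", "attached"],
              ["lunch", "on", "friday"]],
})
print(classify(model, ["please", "verify", "account"]))  # -> phish
```

The dissertation's contribution is precisely what this baseline discards: a meaning representation of the message, rather than surface word counts.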
Gap Analysis Identifying the Current State of Information Security within Organizations Working with Victims of Violence
Around the world, domestic violence, human trafficking, and stalking affect millions of lives every day. According to a report published by the Centers for Disease Control and Prevention in January 2015, 20 people fall victim to physical violence perpetrated by an intimate partner every minute in the United States (US). As offenders use advancements in technology to perpetuate abuse and isolate victims, the scale of services provided by crisis organizations must rise to meet the demand while keeping a close eye on potential digital security vulnerabilities. General media and research report that phishing emails, social engineering attacks, denial-of-service attacks, and other data breaches are gaining popularity and affecting business environments of all sizes and in every sector, including organizations dedicated to working with victims of violence.
To address this, an exploratory research study was conducted to identify the current state of information security within US-based non-profit crisis organizations. Inspired by a recognized and respected framework, the National Institute of Standards and Technology (NIST) Cybersecurity Framework, the study identified the gaps between a theoretical maximum level of information security and the observed level in organizations working with victims of violence. This research establishes a foundation for researchers, security professionals, technology companies, and crisis organizations to develop assessment tools, technology solutions, training curricula, awareness programs, and other strategic initiatives specific to crisis organizations and other non-profit organizations, to aid them in improving information security for themselves and the victims they serve.
Cyber breaches are increasing in frequency and scope. The targeted systems include both commercial and governmental networks. As the threat of these breaches rises, the public sector and private industry seek solutions that stop those responsible for the attacks. While all would agree that organizations have the right to protect their networks from cyber-attacks, the options for defending networks are not as clear. Few would question that a passive defense (e.g., filtering traffic or rejecting packets based on their source) is well within the realm of options open to a defender. What active defensive measures are ethically available to defenders when passive options fail to stop a persistent threat is less clear. This paper outlines the two ethical frameworks, law enforcement and military, commonly applied by cyber security professionals when considering the option of a cyber counter-offensive, or “hacking back.” The examination draws on current literature in information security, international law, and information assurance ethics.
This work investigates whether the semantic representation of an email’s content is more useful than the surface features of its text in classifying the email as a phishing attack. A series of experiments was conducted using machine learning binary classifiers to measure the performance of the competing approaches. The conclusion is that semantic information is in every case at least as good as, and often better than, text surface features.
Cyber crime is a growing problem, with the impact to both businesses and individuals increasing exponentially, but the ability of law enforcement agencies to investigate and successfully prosecute criminals for these crimes is unclear. Many national needs assessments were conducted in the late 1990s and early 2000s by the Department of Justice (DOJ) and the National Institute of Justice (NIJ), all indicating that state and local law enforcement did not have the training, tools, or staff to effectively conduct digital investigations (Institute for Security and Technology Studies [ISTS], 2002; NIJ, 2004). Some studies have been conducted at the state level; however, to date, none have been conducted in Indiana (Gogolin & Jones, 2010). A quick search of the Internet locates multiple training opportunities and publications available at no cost to state and local law enforcement, but it is not clear how many agencies use these resources (“State, Local, & Tribal” for FLETC, n.d.; https://www.ncfi.usss.gov). This study provided a current and localized assessment of the ability of Indiana law enforcement agencies to investigate effectively when a crime involving digital evidence is alleged to have occurred, the availability of training for both law enforcement officers and prosecuting attorneys, and the ability of prosecuting attorneys to pursue and obtain convictions in cases involving digital evidence. An analysis of the survey responses from Indiana law enforcement agencies and prosecutors’ offices shows that Indiana agencies have improved their ability to investigate crimes with digital evidence: more than half have employees on staff who have attended a digital forensic training course within the past five years. However, a large majority of the agencies still rate their abilities to investigate crimes with digital evidence in the mid-range or lower.
The results support the recommendation that a comprehensive resource guide needs to be made available that the agencies can use to locate experts, obtain assistance with standard operating procedures, learn about free training courses, and find funding opportunities to increase their capabilities in investigating crimes involving digital evidence.
The Impact of Mobile Network Forensics Evidence on the Criminal Case Processing Performance in Macedonia: An Institutional Analysis Study
The purpose of this study was to explore the contribution of localization data, network-management data, and content-of-communication data to case processing performance in Macedonia. The mobile network forensics evidence was analyzed with respect to the impact of mobile network data variety, mobile network data volume, and forensic processing on the case disposition time. The results indicate that the case disposition time is negatively correlated with the network-management data volume and positively correlated with the content-of-communication data volume. The relevance of the network-management data lay in the highly granular service behavior profile developed from a larger number of records, while the relevance of the content-of-communication data lay in the substantial number of excerpts of intercepted communication. The results also reveal a difference in case processing time between cases with only localization or network-management data and cases where these are combined with content-of-communication data.
Issues of privacy in communication are becoming increasingly important. For many people and businesses, the use of strong cryptographic protocols is sufficient to protect their communications. However, the overt use of strong cryptography may be prohibited, or individual entities may be prohibited from communicating directly. In these cases, a secure alternative to the overt use of strong cryptography is required. One promising alternative is to hide the use of cryptography by transforming ciphertext into innocuous-seeming messages to be transmitted in the clear. In this dissertation, we consider the problem of synthetic steganography: generating and detecting covert channels in generated media. We start by demonstrating how to generate synthetic time series data that not only mimic an authentic source of the data, but also hide data at any of several different locations in the reversible generation process. We then design a steganographic context-sensitive tiling system capable of hiding secret data in a variety of procedurally-generated multimedia objects. Next, we show how to securely hide data in the structure of a Huffman tree without affecting the length of the codes. Next, we present a method for hiding data in Sudoku puzzles, both in the solved board and the clue configuration. Finally, we present a general framework for exploiting steganographic capacity in structured interactions like online multiplayer games, network protocols, auctions, and negotiations. Recognizing that structured interactions represent a vast field of novel media for steganography, we also design and implement an open-source extensible software testbed for analyzing steganographic interactions and use it to measure the steganographic capacity of several classic games. We analyze the steganographic capacity and security of each method that we present and show that existing steganalysis techniques cannot accurately detect the usage of the covert channels. We develop targeted steganalysis techniques which improve detection accuracy and then use the insights gained from those methods to improve the security of the steganographic systems. We find that secure synthetic steganography, and accurate steganalysis thereof, depends on having access to an accurate model of the cover media.
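One of the channels mentioned above, hiding bits in a Huffman tree without affecting code lengths, rests on a simple observation: swapping the two children of any internal node permutes the codewords but preserves every symbol's depth, so compression is unchanged. The sketch below demonstrates only that observation; it is not the dissertation's construction, and a real system would need a canonical reference tree for extraction plus a security analysis.

```python
# Illustrative sketch: embed bits by flipping child order in a Huffman
# tree. Code *lengths* are provably unchanged; only codewords permute.
# (A receiver would compare against the canonical tree to read the bits.)

import heapq
from itertools import count

def huffman(freqs):
    """Build a Huffman tree; leaves are symbols, internal nodes are 2-lists."""
    tick = count()  # unique tiebreaker so the heap never compares subtrees
    heap = [(f, next(tick), sym) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        fa, _, a = heapq.heappop(heap)
        fb, _, b = heapq.heappop(heap)
        heapq.heappush(heap, (fa + fb, next(tick), [a, b]))
    return heap[0][2]

def embed(node, bits, i=0):
    """Pre-order walk: consume one secret bit per internal node (1 = swap)."""
    if not isinstance(node, list):
        return i
    if i < len(bits):
        if bits[i] == 1:
            node[0], node[1] = node[1], node[0]
        i += 1
    i = embed(node[0], bits, i)
    return embed(node[1], bits, i)

def codes(node, prefix=""):
    if not isinstance(node, list):
        return {node: prefix or "0"}
    out = codes(node[0], prefix + "0")
    out.update(codes(node[1], prefix + "1"))
    return out

freqs = {"a": 5, "b": 2, "c": 1, "d": 1}
tree = huffman(freqs)
before = {s: len(c) for s, c in codes(tree).items()}
embed(tree, [1, 0, 1])
after = {s: len(c) for s, c in codes(tree).items()}
print(before == after)  # -> True: lengths identical, codewords permuted
```

Each internal node carries one bit, so a tree over n symbols holds n-1 bits per transmitted codebook while the compressed stream's size is unaffected.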
Data privacy in social networks is a growing concern that threatens to limit access to important information contained in these data structures. Analysis of the graph structure of social networks can provide valuable information for revenue generation and social science research, but unfortunately, ensuring this analysis does not violate individual privacy is difficult. Simply removing obvious identifiers from graphs or even releasing only aggregate results of analysis may not provide sufficient protection. Differential privacy is an alternative privacy model, popular in data-mining over tabular data, that uses noise to obscure individuals’ contributions to aggregate results and offers a strong mathematical guarantee that individuals’ presence in the data-set is hidden. Analyses that were previously vulnerable to identification of individuals and extraction of private data may be safely released under differential-privacy guarantees. However, existing adaptations of differential privacy to social network analysis are often complex and have considerable impact on the utility of the results, making it less likely that they will see widespread adoption in the social network analysis world. In fact, social scientists still often use the weakest form of privacy protection, simple anonymization, in their social network analysis publications [1–6]. We review the existing work in graph-privatization, including the two existing standards for adapting differential privacy to network data. We then propose contributor-privacy and partition-privacy, novel standards for differential privacy over network data, and introduce simple, powerful private algorithms using these standards for common network analysis techniques that were infeasible to privatize under previous differential privacy standards. We also ensure that privatized social network analysis does not violate the level of rigor required in social science research, by proposing a method of determining statistical significance for paired samples under differential privacy using the Wilcoxon Signed-Rank Test, which is appropriate for non-normally distributed data. Finally, we return to formally consider the case where differential privacy is not applied to data. Naive, deterministic approaches to privacy protection, including anonymization and aggregation of data, are often used in real-world practice. De-anonymization research demonstrates that some naive approaches to privacy are highly vulnerable to reidentification attacks, and none of these approaches offer the robust guarantee of differential privacy. However, we propose that these methods fall across a range of protection: some are better than others. In cases where adding noise to data is especially problematic, or acceptance and adoption of differential privacy is especially slow, it is critical to have a formal understanding of the alternatives. We define De Facto Privacy, a metric for comparing the relative privacy protection provided by deterministic approaches.
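The noise-based guarantee mentioned above is most easily seen in the scalar case: the Laplace mechanism releases a query answer plus noise scaled to the query's sensitivity divided by the privacy budget epsilon. The sketch below shows only that standard mechanism; the contributor-privacy and partition-privacy algorithms in the text are defined over graphs and are more subtle than this, and the sensitivity-1 example is an assumption made for illustration.

```python
# Standard Laplace mechanism (scalar case). The difference of two unit
# exponentials is a Laplace(0, 1) draw, scaled by sensitivity / epsilon.

import random

def laplace_mechanism(true_value, sensitivity, epsilon, rng=random):
    """Release true_value + Laplace(scale = sensitivity / epsilon) noise."""
    scale = sensitivity / epsilon
    noise = scale * (rng.expovariate(1.0) - rng.expovariate(1.0))
    return true_value + noise

# Example: a simple count query where one person's presence changes the
# answer by at most 1, so sensitivity = 1. Smaller epsilon means stronger
# privacy and noisier answers.
true_count = 42
print(laplace_mechanism(true_count, sensitivity=1.0, epsilon=0.5))
```

The intuition behind the proposed standards is choosing *what* one individual's contribution to a graph statistic is (a contributor's edges, a partition's subgraph), which in turn determines the sensitivity used in mechanisms like this one.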
Personal identification is needed in many civil activities, and common identification cards, such as a driver’s license, have become the de facto standard documents. Radio frequency identification has complicated this matter. Unlike their printed predecessors, contemporary RFID cards lack a practical way for users to control access to their individual fields of data. This leaves them more available to unauthorized parties and more prone to abuse. This work therefore developed and tested a novel RFID card technology that allows overlays to be used for reliable, reversible data access settings. Similar to other proposed switching mechanisms, it offers advantages that may greatly improve outcomes. RFID use is increasing in identity documents such as drivers’ licenses and passports, and with it concern over the theft of personal information, which can enable unauthorized tracking or fraud. Effort put into designing a strong foundation technology now may allow for widespread development on it later. In this dissertation, such a technology was designed and constructed to support the central thesis that selective detuning can serve as a feasible, reliable mechanism. The concept had previously been shown effective in limiting access to all fields simultaneously, and was here effective in limiting access to specific fields selectively. A novel card was produced in familiar dimensions, with an intuitive interface by which users may cover the visible print of the card to conceal the wireless emissions it allows. A discussion was included of similar technologies, involving capacitive switching, that could further improve the outcomes if such a product were put to large-scale commercial fabrication.
The card prototype was put through a battery of laboratory tests to measure the degree of independence between data fields and the reliability of the switching mechanism under realistically variable coverage, demonstrating statistically consistent performance in both. The success rate of RFID card read operations, already greater than 99.9%, was exceeded by the success rate of selection using the featured technology. With controls in place for the most influential factors related to card readability (namely the distance from the reader antennas and the orientation of the card antenna with respect to them), the card was shown to completely resist data acquisition from unauthorized fields while allowing unimpeded access to authorized fields, even after thousands of varied attempts. The effect was proven to be temporary and reversible. User intervention allowed the switching to occur in a matter of seconds by sliding a conductive sleeve or applying tape to regions of the card. Strategies for widespread implementation were discussed, emphasizing factors that included cost, durability, size, simplicity, and familiarity, all of which arise in card management decisions for common state and national identification such as a driver’s license. The relationship between the card and external database systems was detailed, as no such identification document can function in isolation. A practical solution will include details of how multiple fields are written to the card and separated sufficiently in external databases to allow for user-directed selection of data field disclosure. Opportunities for implementation in corporate and academic environments were discussed, along with ways in which this technology could invite further investigation.
In this work we present a simple, yet effective and practical, scheme to improve the security of stored password hashes, rendering their cracking detectable and insuperable at the same time. We utilize a machine-dependent function, such as a physically unclonable function (PUF) or a hardware security module (HSM), at the authentication server to prevent off-site password discovery, and a deception mechanism to alert us if such an action is attempted. Our scheme can be easily integrated with legacy systems without the need for additional servers, changes to the structure of the hashed password file, or any client modifications. When the scheme is in use, the structure of the hashed passwords file, e.g., /etc/shadow or /etc/master.passwd, will appear no different than in the traditional scheme. However, when attackers exfiltrate the hashed passwords file and try to crack it, the only passwords they will get are the ersatzpasswords: the “fake passwords.” When an attempt to login using these ersatzpasswords is detected, an alarm will be triggered in the system. Even an adversary who knows about the scheme cannot launch cracking without physical access to the authentication server. The scheme also includes a secure backup mechanism in the event of a failure of the hardware-dependent function. We discuss our implementation and provide some comparison with the traditional authentication scheme.
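The mechanics can be sketched in a much-simplified form. Below, an HMAC with a server-bound key stands in for the PUF/HSM machine-dependent function, the ersatz password is supplied explicitly at enrollment, and the stored format is a bare SHA-256 digest rather than a byte-identical /etc/shadow entry; all names and the 32-byte padding are assumptions of this sketch, not the paper's construction.

```python
# Toy ersatzpassword sketch: cracking the stored hash offline yields only
# a decoy password, and presenting that decoy at login raises an alarm.
# HMAC(DEVICE_KEY, .) stands in for the PUF/HSM machine-dependent function.

import hashlib, hmac

DEVICE_KEY = b"key-sealed-in-HSM"      # never leaves the auth server

def mdf(pw: bytes) -> bytes:
    """Machine-dependent function: computable only on this server."""
    return hmac.new(DEVICE_KEY, pw, hashlib.sha256).digest()

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def enroll(real_pw: str, ersatz_pw: str):
    """Store H(ersatz); keep a per-user adjustment that maps the real
    password onto the ersatz string, but only via the MDF on this machine.
    (Toy: passwords assumed at most 32 bytes.)"""
    ersatz = ersatz_pw.encode().ljust(32, b"\0")
    adjustment = xor(mdf(real_pw.encode()), ersatz)
    stored_hash = hashlib.sha256(ersatz).hexdigest()
    return stored_hash, adjustment

def login(pw: str, stored_hash: str, adjustment: bytes) -> str:
    candidate = xor(mdf(pw.encode()), adjustment)
    if hashlib.sha256(candidate).hexdigest() == stored_hash:
        return "ok"
    # An offline cracker inverting stored_hash recovers the *ersatz*
    # password; presenting it verbatim is the tell-tale of a breach.
    if hashlib.sha256(pw.encode().ljust(32, b"\0")).hexdigest() == stored_hash:
        return "ALARM: ersatzpassword used"
    return "denied"

stored, adjustment = enroll("hunter2", "summer15")
print(login("hunter2", stored, adjustment))      # -> ok
print(login("summer15", stored, adjustment))     # -> ALARM: ersatzpassword used
print(login("wrong-guess", stored, adjustment))  # -> denied
```

Note the key property: without DEVICE_KEY (physical access to the server), the stored hash can only be inverted to "summer15", never to "hunter2", so offline cracking is both fruitless and self-announcing.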
This paper describes a new version of the Network File System (NFS) that supports access to files larger than 4 GB and increases sequential write throughput sevenfold compared to unaccelerated NFS Version 2. NFS Version 3 maintains the stateless server design and simple crash recovery of NFS Version 2, as well as the philosophy of building a distributed file service from cooperating protocols. We describe the protocol and its implementation, and provide initial performance measurements. We then describe the implementation effort. Finally, we contrast this work with other distributed file systems and discuss future revisions of NFS.