Alexander N. Mikoyan,
Using CBR Techniques to Detect Plagiarism in Computing
Abstract: The problems of case retrieval in CBR and plagiarism detection have in common a need to detect close but not exact matches between exemplars. In this paper we describe a plagiarism detection system that has been inspired by ideas from CBR research. In particular, this system can detect similarities between programs without performing exhaustive comparisons on all exemplars. Our analysis of similarity in this well-controlled domain offers some insights into the kinds of profiles that can be used in similarity assessment in general. We argue that the choice of a perspicuous profile is crucial to any classification task, and that determining the best predictive features may require significant analysis of the problem domain.
Authorship Analysis: Identifying The Author Of A
Abstract: In this paper we show that it is possible to identify the author of a piece of software by looking at stylistic characteristics of C source code. We also show that there exists a set of characteristics within a program that are helpful in the identification of a programmer, and whose computation can be automated at a reasonable cost. There are four areas that benefit directly from the findings we present herein: the legal community can count on empirical evidence to support authorship claims, the academic community can count on evidence that supports authorship claims of students, industry can count on identifying the author of previously unidentifiable software modules, and real-time intrusion detection systems can be enhanced to include information regarding the authorship of all locally compiled programs. We show that it is possible to identify the author of a piece of software by collecting eighty-eight programs from twenty-nine students, staff and faculty members at Purdue University and identifying their authors.
Ivan Krsul Eugene H.
Authorship Analysis: Identifying The Author of a
Keywords: authorship analysis, authentication
Abstract: Authorship analysis on computer software is a difficult problem. In this paper we explore the classification of programmers' style, and try to find a set of characteristics that remain constant for a significant portion of the programs that a given programmer might produce. Our goal is to show that it is possible to identify the author of a program by examining programming style characteristics. Within a closed environment, the results of this paper support the conclusion that, for a specific set of programmers, it is possible to identify the author of any individual program. Also, based on previous work and our observations during the experiments described herein, we believe that the probability of finding two programmers who share exactly those same characteristics should be very small.
H. Spafford, Stephen
Software Forensics: Can We Track Code To Its
Abstract: Viruses, worms, trojan horses, and crackers all exist and threaten the security of our computer systems. Often, we are aware of an intrusion only after it has occurred. On some occasions, we may have a fragment of code left behind that was used by an adversary to gain access or damage the system. A natural question to ask is "Can we use this remnant of code to positively identify the culprit?" In this paper, we detail some of the features of code remnants that might be analyzed and then used to identify their authors. We further outline some of the difficulties involved in tracing an intruder by analyzing code.
Built by Mark Crosbie and Ivan Krsul.