Classification Theory
Introduction
Classification of information is as much an art as it is a science [BloWe76], especially when classifying computer security software and documentation. Successful searches are to a large extent a function of the mental set that the researcher brings to the task [BloWe76]. All librarians must make a conscious choice about how to arrange and present physical resources. Hence, library classification systems such as the Library of Congress Classification (LCC) or the Dewey Decimal Classification (DDC) were developed to arrange printed matter in topical or disciplinary categories (i.e., to position books related to the same or similar subjects next to each other) [BloWe76].
Subject classification systems generally require a taxonomy that will classify the information at hand. In the case of software, for example, the classification system must classify the problems that computer software can solve. A tree structure is the most natural form for a classification system because it allows arbitrary levels of refinement. An example of a taxonomy specifically applied to computer software can be seen in the GAMS project [GAMS].
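As a minimal sketch of why a tree allows arbitrary levels of refinement, the following Python fragment builds a small classification tree and prints it as an indented outline. The category names and the Category class are hypothetical and chosen only for illustration; they are not the Archive's actual taxonomy.

```python
from dataclasses import dataclass, field


@dataclass
class Category:
    """One node in a hypothetical classification tree."""
    name: str
    children: dict = field(default_factory=dict)

    def add_path(self, path):
        """Insert a path such as ['Tools', 'Auditing'], creating nodes as needed."""
        node = self
        for part in path:
            node = node.children.setdefault(part, Category(part))
        return node

    def walk(self, depth=0):
        """Yield (depth, name) pairs depth-first, with children sorted by name."""
        yield depth, self.name
        for name in sorted(self.children):
            yield from self.children[name].walk(depth + 1)


# Illustrative categories only (assumed, not taken from the Archive).
root = Category("Computer Security")
root.add_path(["Tools", "Authentication"])
root.add_path(["Tools", "Auditing"])
root.add_path(["Documents", "Policies"])

outline = ["  " * depth + name for depth, name in root.walk()]
print("\n".join(outline))
```

Refining a category further only requires a longer path passed to add_path; no existing node has to change.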
The Library of Congress Subject Headings [LCSH] would seem to be the most natural source for an appropriate computer security taxonomy for documents and software related to security. Unfortunately, the format and organization of its breakdown do not adapt easily to our purposes.
Hence, we have developed our own Computer Security Classification for organizing security-related documents and tools. The classification dictates the physical organization of the archive, and from this taxonomy we generate the subject indices.
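The step from a classification to a subject index can be sketched as follows. This is an illustration under stated assumptions, not the Archive's actual tooling: the entry titles, the slash-separated classification paths, and the build_subject_index function are all hypothetical.

```python
from collections import defaultdict

# Hypothetical archive entries: (title, classification path).
entries = [
    ("Password checker", "tools/authentication"),
    ("Site security policy", "documents/policies"),
    ("Log analyzer", "tools/auditing"),
    ("Token generator", "tools/authentication"),
]


def build_subject_index(items):
    """Group titles under their classification path, sorted for stable output."""
    index = defaultdict(list)
    for title, path in items:
        index[path].append(title)
    return {path: sorted(titles) for path, titles in sorted(index.items())}


subject_index = build_subject_index(entries)
for path, titles in subject_index.items():
    print(path)
    for title in titles:
        print("  " + title)
```

Because each entry carries its classification path, the same path can serve both as a location in the archive and as a heading in the generated index.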
Another function of the classification or cataloging system is to establish a standardized form for the name of every author for whom we have information, so that works by one author can be grouped in one place, if not physically then virtually. Pseudonyms and variant spellings are a problem that must be addressed; typically the librarian chooses one of the forms and groups the variations under that one spelling [BloWe76]. As an example, consider the following equivalent forms for the Director of the COAST group at Purdue: spaf, Spaf, Eugene H. Spafford, Gene Spaf, Gene Spafford, Gene H. Spafford, Eugene Spafford.
The maintainers of the Security Archive addressed this issue by creating an Authors Database that maintains centralized information about known authors in the Archive. This database allows us to group documents and tools by author name in the author index.
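The idea of grouping variant spellings under one chosen form can be sketched as a simple lookup table. The structure below is hypothetical and does not describe the actual Authors Database; in particular, picking "Eugene H. Spafford" as the canonical form is an assumption made for the example.

```python
# Hypothetical authors table: one canonical form mapped to its known variants.
# The choice of canonical form is an assumption for illustration.
AUTHORS = {
    "Eugene H. Spafford": [
        "spaf", "Spaf", "Gene Spaf", "Gene Spafford",
        "Gene H. Spafford", "Eugene Spafford",
    ],
}


def normalize(name):
    """Collapse runs of whitespace and ignore case when comparing names."""
    return " ".join(name.split()).lower()


def build_lookup(authors):
    """Map every normalized variant, and the canonical form itself, to the canonical form."""
    lookup = {}
    for canonical, variants in authors.items():
        for form in [canonical, *variants]:
            lookup[normalize(form)] = canonical
    return lookup


LOOKUP = build_lookup(AUTHORS)


def canonical_author(name):
    """Return the canonical form for a known variant, else the name unchanged."""
    return LOOKUP.get(normalize(name), name)


print(canonical_author("Gene Spaf"))
```

Unknown names pass through unchanged, so a new author appears in the index under the spelling supplied until a canonical form is recorded.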
References
- [LCSH] Library of Congress; Library of Congress Subject Headings; 18th Edition; Washington, D.C.; Cataloging Distribution Service; Library of Congress; 1995
- [GAMS] Applied and Computational Mathematics Division and the Scientific Computing Environments Division within the Computing and Applied Mathematics Laboratory of the National Institute of Standards and Technology; Guide to Available Math Software (GAMS)
- The GAMS project of the National Institute of Standards and Technology (NIST) studies techniques to provide scientists and engineers with improved access to reusable computer software which is available to them for use in mathematical modeling and statistical analysis.
- [BloWe76] Marty Bloomberg and Hans Weber; An Introduction to Classification and Number Building in Dewey; Libraries Unlimited, Inc., Littleton, Colorado; 19th Edition; 1976
- The book provides a concise introduction to the Dewey Decimal Classification System, with a brief history of the classification mechanism, Melvil Dewey, the format of the DDC system, and a detailed description of the characteristics of the 18th edition of the DDC system, including major headings and subject areas. The book is out of date, however, because the edition of the DDC system currently in use is the 19th edition.
Last Modified: 4 March, 1995.
security-archive@cerias.purdue.edu (COAST Security Archive)