The Center for Education and Research in Information Assurance and Security (CERIAS)

Scalable Learning Through Error-correcting Codes based Clustering in Autonomous Systems

Principal Investigator: Bharat Bhargava

Intelligent autonomous systems (IAS) continuously receive large streams of diverse data from numerous entities operating and interacting in their environment. It is vital that the learning models in IAS scale up to new and unknown data items that were not present in the training or testing datasets. Scalable learning is a method for achieving maximum classification without rejecting any unknown data item as an anomaly. In this paper, we present the Perfect Error-correcting Codes (PEC) clustering technique, which approximates the classes of multi-feature data items by reversing standard forward error correction coding. Class approximation problems generally arise in information systems that process fuzzily cataloged data items. These data can be classified by assigning binary vectors to their corresponding features (1: feature is present, 0: feature is absent) to obtain message words. These code words can be used as cluster centers. In PEC clustering, binary vectors of 23 bits are mapped into code words (labels or indices) of 12 bits. Two binary vectors at a Hamming distance of 2 will have a few common labels and are thus classified accordingly. PEC clustering has a 2^23 code word space, which makes it well suited to scalable clustering of thousands of categories. With reasonable redundancy, the clustering can be accomplished in O(N) time. In addition, we present an information processing model for on-the-fly processing of data streams with a multi-processor pipeline: the Read, Analyze, and Toggle (RAT) model.
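
The 23-bit to 12-bit mapping described above matches the parameters of the binary (23, 12) Golay code, a perfect code in which every 23-bit vector lies within Hamming distance 3 of exactly one code word. The following Python sketch illustrates that idea only and is not the project's implementation. The generator polynomial 0xC75, the systematic encoding, and the function names (features_to_vector, poly_mod, encode, pec_label) are all assumptions made for illustration. The sketch also assigns a single label per vector by nearest-code-word decoding, whereas the abstract describes vectors that carry several labels.

```python
from itertools import combinations

# Assumed generator polynomial of the cyclic (23, 12) binary Golay code:
# x^11 + x^10 + x^6 + x^5 + x^4 + x^2 + 1
GOLAY_POLY = 0xC75
N_BITS, K_BITS = 23, 12
R_BITS = N_BITS - K_BITS  # 11 parity bits


def features_to_vector(features):
    """Pack 23 feature flags (truthy = present) into an integer, first flag as the top bit."""
    if len(features) != N_BITS:
        raise ValueError("expected exactly 23 feature flags")
    vector = 0
    for present in features:
        vector = (vector << 1) | (1 if present else 0)
    return vector


def poly_mod(value):
    """Remainder of a 23-bit polynomial over GF(2) after division by GOLAY_POLY."""
    for shift in range(N_BITS - 1, R_BITS - 1, -1):
        if value & (1 << shift):
            value ^= GOLAY_POLY << (shift - R_BITS)
    return value


def encode(message):
    """Systematic encoding: 12 message bits followed by 11 parity bits."""
    if not 0 <= message < (1 << K_BITS):
        raise ValueError("message must fit in 12 bits")
    shifted = message << R_BITS
    return shifted | poly_mod(shifted)


# Map each syndrome to the unique error pattern of weight <= 3 that produces it.
# Because the code is perfect, these patterns cover all 2^11 = 2048 syndromes.
SYNDROME_TO_ERROR = {}
for weight in range(4):
    for positions in combinations(range(N_BITS), weight):
        error = sum(1 << p for p in positions)
        SYNDROME_TO_ERROR[poly_mod(error)] = error


def pec_label(vector):
    """Return the 12-bit label of the code word nearest to a 23-bit vector."""
    if not 0 <= vector < (1 << N_BITS):
        raise ValueError("vector must fit in 23 bits")
    error = SYNDROME_TO_ERROR[poly_mod(vector)]
    return (vector ^ error) >> R_BITS


if __name__ == "__main__":
    center = encode(0b101100111000)
    a = center ^ (1 << 3)   # one feature flipped
    b = center ^ (1 << 17)  # a different feature flipped; a and b differ in 2 bits
    print(pec_label(a) == pec_label(b))  # both decode to the same label
```

In this sketch, labelling one vector costs a fixed number of operations (one polynomial division and one table lookup), so labelling N vectors takes time linear in N, in line with the O(N) bound stated above.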

Personnel

Students: Ganapathy Mani

Representative Publications

Keywords: clustering, cognitive autonomy, knowledge discovery, scalable learning