Incremental fuzzy clustering for document categorization

Authors: J.-P. Mei, Y. Wang, L. Chen, and C. Miao
Title: Incremental fuzzy clustering for document categorization
Abstract: Incremental clustering has been proposed to handle large datasets that cannot fit into memory entirely. Single pass fuzzy c-means (SpFCM) and Online fuzzy c-means (OFCM) are two representative incremental fuzzy clustering methods. Both of them extend the scalability of fuzzy c-means (FCM) by processing the dataset chunk by chunk. However, due to data sparsity and high dimensionality, SpFCM and OFCM fail to produce reasonable results for document data. In this study, we work on clustering approaches that address both the large-scale and high-dimensionality issues. Specifically, we propose two methods for incremental clustering of document data. The first method is a modification of the existing FCM-based incremental clustering with a step to normalize the centroids in each iteration, while the other method is incremental clustering, i.e., Single-Pass or Online, with weighted fuzzy co-clustering. We use several benchmark document datasets for the experimental study. The experimental results show that the proposed approaches achieve significant improvements over existing SpFCM and OFCM in document clustering.
Keywords: Document handling; Fuzzy set theory; Pattern clustering
Conference Name: IEEE International Conference on Fuzzy Systems (FUZZ-IEEE’14)
Location: Beijing, China
Publisher: IEEE
Year: 2014
Accepted PDF File: Incremental_fuzzy_clustering_for_document_categorization_accepted.pdf
Permanent Link:
Reference: J.-P. Mei, Y. Wang, L. Chen, and C. Miao, “Incremental fuzzy clustering for document categorization,” in Proceedings of the IEEE International Conference on Fuzzy Systems (FUZZ-IEEE’14). IEEE, July 2014, pp. 1518–1525.
   author	= {Mei, Jian-Ping and Wang, Yangtao and Chen, Lihui and Miao, Chunyan},
   title	= {Incremental fuzzy clustering for document categorization},
   booktitle	= {Proceedings of the IEEE International Conference on Fuzzy Systems (FUZZ-IEEE'14)},
   year		= {2014},
   month	= {July},
   pages	= {1518--1525},
   location	= {Beijing, China},
   publisher	= {IEEE},