Adaptive scaling of cluster boundaries for large-scale social media data clustering


Authors: L. Meng, A.-H. Tan, and D. C. Wunsch II
Title: Adaptive scaling of cluster boundaries for large-scale social media data clustering
Abstract: The large-scale and complex nature of social media data raises the need to scale clustering techniques to big data and make them capable of automatically identifying data clusters with few empirical settings. In this paper, we present our investigation and three algorithms based on the Fuzzy Adaptive Resonance Theory (Fuzzy ART) that have linear computational complexity, use a single parameter, i.e., the vigilance parameter, to identify data clusters, and are robust to modest parameter settings. The contribution of this paper lies in two aspects. First, we theoretically demonstrate how complement coding, commonly known as a normalization method, changes the clustering mechanism of Fuzzy ART, and discover the vigilance region (VR) that essentially determines how a cluster in the Fuzzy ART system recognizes similar patterns in the feature space. The VR gives an intrinsic interpretation of the clustering mechanism and limitations of Fuzzy ART. Second, we introduce the idea of allowing different clusters in the Fuzzy ART system to have different vigilance levels in order to meet the diverse nature of the pattern distribution of social media data. To this end, we propose three vigilance adaptation methods, namely, the activation maximization rule (AMR), the confliction minimization rule (CMR), and the hybrid integration rule (HIR). With an initial vigilance value, the resulting clustering algorithms, namely, the AM-ART, CM-ART, and HI-ART, can automatically adapt the vigilance values of all clusters during the learning epochs in order to produce better cluster boundaries. Experiments on four social media data sets show that AM-ART, CM-ART, and HI-ART are more robust than Fuzzy ART to the initial vigilance value, and they usually achieve better or comparable performance and much faster speed than state-of-the-art clustering algorithms that also do not require a predefined number of clusters.
Keywords: Vigilance region (VR); Adaptive parameter tuning; Adaptive resonance theory (ART); Clustering; Big social media data.
Journal Name: IEEE Transactions on Neural Networks and Learning Systems, vol. 27, no. 12
Publisher: IEEE
Year: 2016
Accepted PDF File: Adaptive_Scaling_of_Cluster_Boundaries_for_Large-scale_Social_Media_Data_Clustering_accepted.pdf
Permanent Link: http://dx.doi.org/10.1109/TNNLS.2015.2498625
Reference: L. Meng, A.-H. Tan, and D. C. Wunsch II, “Adaptive scaling of cluster boundaries for large-scale social media data clustering,” IEEE Transactions on Neural Networks and Learning Systems, vol. 27, no. 12, pp. 2656–2669, December 2016.
BibTeX:
@article{LILY-j30,
   author    = {Meng, Lei and Tan, Ah-Hwee and Wunsch II, Donald C.},
   title     = {Adaptive Scaling of Cluster Boundaries for Large-scale Social Media Data Clustering},
   journal   = {IEEE Transactions on Neural Networks and Learning Systems},
   year      = {2016},
   month     = {December},
   volume    = {27},
   number    = {12},
   pages     = {2656--2669},
   publisher = {IEEE},
}