Unsupervised Embedding for Latent Similarity by Modeling Heterogeneous MOOC Data


Authors: Z. Jiang, S. Feng, W. Chen, G. Wang, and X. Li
Title: Unsupervised Embedding for Latent Similarity by Modeling Heterogeneous MOOC Data
Abstract: Recent years have witnessed the prosperity of Massive Open Online Courses (MOOCs). One important characteristic of MOOCs is that video clips and discussion forums are integrated into a one-stop learning setting. However, discussion forums have become disordered and chaotic because of their massive scale and a lack of efficient management. One technical solution is to associate MOOC forum threads with their corresponding video clips, which can be regarded as a representation learning problem. Traditional textual representations, e.g. Bag-of-Words (BOW), do not consider latent semantics, while recent semantic word embeddings, e.g. Word2vec, do not capture the similarity between documents, i.e. latent similarity. Learning distinguishable textual representations is therefore the key to solving the problem. In this paper, we propose an effective approach called No-label Sequence Embedding (NOSE), which captures not only the latent semantics within words and documents but also the latent similarity. We model multiform MOOC data in a heterogeneous textual network and learn low-dimensional embeddings without labels. NOSE has several advantages, e.g. it is course-agnostic and has few parameters to tune. Experimental results suggest that the learned textual representation can outperform state-of-the-art unsupervised counterparts in the task of associating forum threads with video clips.
Keywords: Unsupervised embedding; Latent similarity; Heterogeneous; MOOC
Conference Name: Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'17)
Location: Jeju, South Korea
Publisher: Springer International Publishing
Year: 2017
Accepted PDF File: Unsupervised_Embedding_for_Latent_Similarity_by_Modeling_Heterogeneous_MOOC_Data_accepted.pdf
Permanent Link: https://dx.doi.org/10.1007/978-3-319-57529-2_53
Reference: Z. Jiang, S. Feng, W. Chen, G. Wang, and X. Li, “Unsupervised embedding for latent similarity by modeling heterogeneous MOOC data,” in Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD’17). Springer International Publishing, May 2017, pp. 683–695.
BibTeX:
@inproceedings{LILY-c117, 
    author = {Jiang, Zhuoxuan and Feng, Shanshan and Chen, Weizheng and Wang, Guangtao and Li, Xiaoming},
    title  = {Unsupervised Embedding for Latent Similarity by Modeling Heterogeneous {MOOC} Data},  
    booktitle = {Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'17)}, 
    year  = {2017}, 
    month = {May}, 
    pages = {683--695},
    location = {Jeju, South Korea},
    publisher = {Springer International Publishing},
}