[1910.10953] Deep topic modeling by multilayer bootstrap network and lasso
MBN, as an unsupervised deep model, avoids the weaknesses of model assumptions, the anchor-word assumption, and shallow learning, which accounts for the advantage of the deep topic model (DTM) over the five representative comparison methods.
Abstract: Topic modeling is widely studied for the dimension reduction and analysis of
documents. However, it is formulated as a difficult optimization problem.
Current approximate solutions also suffer from inaccurate model or data
assumptions. To deal with the above problems, we propose a polynomial-time
deep topic model with no model or data assumptions. Specifically, we first
apply a multilayer bootstrap network (MBN), which is an unsupervised deep model,
to reduce the dimension of documents, and then use the low-dimensional data
representations or their clustering results as the target of supervised Lasso
for topic word discovery. To our knowledge, this is the first time that MBN and
Lasso have been applied to unsupervised topic modeling. Experimental
comparisons with five representative topic models on the 20-newsgroups and TDT2
corpora illustrate the effectiveness of the proposed algorithm.
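The second stage described above (clustering results used as the target of a supervised Lasso) can be illustrated with a minimal sketch. This is one reading of the abstract, not the authors' implementation: the cluster labels in `LABELS` are hard-coded as a stand-in for the output of MBN, the toy corpus in `DOCS` and `VOCAB` is invented for illustration, and the helper names (`soft_threshold`, `lasso_coordinate_descent`, `topic_words`) and the values `alpha=0.05` and `top_k=3` are assumptions. For each cluster, a one-vs-rest indicator vector is regressed on the document-term counts, and words with positive Lasso weights are taken as that topic's words.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the Lasso proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, sweeps=200):
    """Minimise (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1 by cyclic coordinate descent."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    residual = list(y)  # equals y - Xw, since w starts at zero
    for _ in range(sweeps):
        for j in range(d):
            col = [X[i][j] for i in range(n)]
            z = sum(x * x for x in col) / n
            if z == 0.0:
                continue  # word never occurs in the corpus
            # Correlation of word j with the residual that ignores word j's own contribution.
            rho = sum(col[i] * (residual[i] + col[i] * w[j]) for i in range(n)) / n
            new_w = soft_threshold(rho, alpha) / z
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= col[i] * delta
                w[j] = new_w
    return w


def topic_words(X, labels, vocab, alpha=0.05, top_k=3):
    """For each cluster, fit a one-vs-rest Lasso and keep the words with positive weight."""
    topics = {}
    for k in sorted(set(labels)):
        y = [1.0 if lab == k else 0.0 for lab in labels]
        w = lasso_coordinate_descent(X, y, alpha)
        ranked = sorted((j for j in range(len(vocab)) if w[j] > 0.0), key=lambda j: -w[j])
        topics[k] = [vocab[j] for j in ranked[:top_k]]
    return topics


# Toy document-term counts (rows: documents, columns: words). Invented for illustration.
VOCAB = ["ball", "team", "game", "vote", "law", "court"]
DOCS = [
    [2, 1, 1, 0, 0, 0],
    [1, 2, 0, 0, 0, 0],
    [1, 1, 2, 0, 0, 0],
    [0, 0, 0, 2, 1, 1],
    [0, 0, 0, 1, 2, 0],
    [0, 0, 0, 1, 1, 2],
]
# Hypothetical cluster labels, standing in for the clustering produced by MBN.
LABELS = [0, 0, 0, 1, 1, 1]

TOPICS = topic_words(DOCS, LABELS, VOCAB)
print(TOPICS)
```

In this sketch the L1 penalty is what makes the selection sparse: words that do not help predict cluster membership receive a weight of exactly zero and are dropped from the topic. A larger `alpha` keeps fewer words per topic.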
‹Fig. 1: Deep topic model. (Deep topic modeling)›