Interpretable Topic Extraction and Word Embedding Learning using row-stochastic DEDICOM

arXiv:2507.16695v1
Abstract: The DEDICOM algorithm provides a uniquely interpretable matrix factorization method for symmetric and asymmetric square matrices. We apply a new row-stochastic variation of DEDICOM to the pointwise mutual information matrices of text corpora to identify latent topic clusters within the vocabulary and simultaneously learn interpretable word embeddings. We introduce a method to efficiently train the constrained DEDICOM algorithm and present a qualitative evaluation of its topic modeling and word embedding performance.
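To make the factorization concrete, here is a minimal sketch, not the authors' implementation. DEDICOM approximates a square matrix S by A R A^T, where A is n x k and R is k x k. The rest is assumed for illustration: the row-stochastic constraint is enforced by a row-wise softmax over an unconstrained matrix M, the loss is the squared Frobenius error, training is plain gradient descent with step halving, and the PMI matrix is clipped at zero. The function names (`pmi_matrix`, `softmax_rows`, `fit_row_stochastic_dedicom`) and the toy co-occurrence counts are hypothetical.

```python
import numpy as np


def pmi_matrix(cooc, eps=1e-12):
    """Pointwise mutual information from a square co-occurrence count matrix.

    Assumption: negative values are clipped to zero (positive PMI).
    """
    total = cooc.sum()
    row = cooc.sum(axis=1, keepdims=True)
    col = cooc.sum(axis=0, keepdims=True)
    pmi = np.log((cooc * total + eps) / (row * col + eps))
    return np.maximum(pmi, 0.0)


def softmax_rows(M):
    """Row-wise softmax: every row is non-negative and sums to 1."""
    Z = np.exp(M - M.max(axis=1, keepdims=True))
    return Z / Z.sum(axis=1, keepdims=True)


def reconstruction(S, M, R):
    """Return the squared Frobenius loss, A = softmax(M), and the residual E."""
    A = softmax_rows(M)
    E = A @ R @ A.T - S
    return float((E ** 2).sum()), A, E


def fit_row_stochastic_dedicom(S, k, steps=500, lr=0.05, seed=0):
    """Fit S ~ A R A^T with row-stochastic A (assumed training scheme)."""
    rng = np.random.default_rng(seed)
    n = S.shape[0]
    M = rng.normal(scale=0.1, size=(n, k))   # unconstrained parameters of A
    R = rng.normal(scale=0.1, size=(k, k))   # topic-to-topic affinity matrix
    loss, A, E = reconstruction(S, M, R)
    losses = [loss]
    for _ in range(steps):
        # Gradients of ||A R A^T - S||_F^2 with respect to R and A.
        grad_R = 2.0 * A.T @ E @ A
        grad_A = 2.0 * (E @ A @ R.T + E.T @ A @ R)
        # Chain rule through the row-wise softmax.
        grad_M = A * (grad_A - (grad_A * A).sum(axis=1, keepdims=True))
        M_new, R_new = M - lr * grad_M, R - lr * grad_R
        new_loss, A_new, E_new = reconstruction(S, M_new, R_new)
        if new_loss <= loss:
            M, R, loss, A, E = M_new, R_new, new_loss, A_new, E_new
        else:
            lr *= 0.5  # reject the step and shrink the step size
        losses.append(loss)
    return A, R, losses


# Toy symmetric co-occurrence counts for six words in two loose groups.
cooc = np.array([
    [0, 8, 7, 1, 0, 0],
    [8, 0, 9, 0, 1, 0],
    [7, 9, 0, 0, 0, 1],
    [1, 0, 0, 0, 8, 9],
    [0, 1, 0, 8, 0, 7],
    [0, 0, 1, 9, 7, 0],
], dtype=float)

S = pmi_matrix(cooc)
A, R, losses = fit_row_stochastic_dedicom(S, k=2)
print(np.round(A, 3))  # each row: one word's distribution over k topics
print(np.round(R, 3))  # how strongly the topics relate to one another
```

Under these assumptions, each row of A can be read both as a k-dimensional word embedding and as a probability distribution over topics, which is the sense in which the abstract calls the embeddings interpretable.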
