- Qiaozhu Mei, Xu Ling, Matthew Wondra, Hang Su, and ChengXiang Zhai
| | |
|---|---|
| Sentiment classes | Positive / negative |
| Aspect detection and sentiment analysis | Topic-Sentiment Mixture Model |
| Positive sentiment model on unseen data | KL-divergence: ~21 |
| Negative sentiment model on unseen data | KL-divergence: ~19 |
This research is among the first to use a probabilistic graphical model to extract both the aspects and the sentiment. As is common with topic models, the proposed method is largely unsupervised; the one exception is that the authors use a labeled training set to estimate some prior parameters of the model. While these models are extremely powerful, they are also quite hard to fully understand. In this post I'll discuss the main idea of the method and its strong and weak points, but I will leave out most of the technical details. I think it's more useful to write a tutorial-like post about topic models at some point than to explain them in every discussion of a paper that uses this technique.
An overview of the model is given below. You have to read it from right to left when taking the generative approach. When a document is generated, the first decision for each word is whether it is a background word (a common English word, denoted by the encircled B) or a topical word. In the latter case, one of the k subtopics has to be chosen to generate the word; these are encircled in the Themes block. Finally, given the chosen topic, the word is drawn from either the neutral topic word distribution or the positive or negative sentiment word distribution.
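To make the generative story concrete, here is a minimal sketch of the three sampling decisions described above. All distributions, vocabularies, and mixing weights are invented toy values, not the paper's estimated parameters.

```python
import random

# Hypothetical toy distributions; the real model estimates these from data.
background = {"the": 0.5, "a": 0.3, "is": 0.2}   # common English words
topics = {                                        # k = 2 subtopics
    0: {"battery": 0.6, "screen": 0.4},
    1: {"price": 0.7, "shipping": 0.3},
}
positive = {"great": 0.6, "love": 0.4}
negative = {"awful": 0.5, "hate": 0.5}

lambda_b = 0.3                 # probability of a background word
topic_weights = [0.5, 0.5]     # probability of each subtopic
sentiment_weights = {"neutral": 0.6, "positive": 0.2, "negative": 0.2}

def draw(dist):
    """Sample one key from a {word: probability} dictionary."""
    r, acc = random.random(), 0.0
    for word, p in dist.items():
        acc += p
        if r < acc + 1e-12:
            return word
    return word  # guard against floating-point rounding

def generate_word():
    # Decision 1: background word or topical word?
    if random.random() < lambda_b:
        return draw(background)
    # Decision 2: pick one of the k subtopics.
    topic = random.choices(list(topics), weights=topic_weights)[0]
    # Decision 3: neutral topic words, or a sentiment word distribution?
    kind = draw(sentiment_weights)
    if kind == "positive":
        return draw(positive)
    if kind == "negative":
        return draw(negative)
    return draw(topics[topic])

print([generate_word() for _ in range(5)])
```

Repeating `generate_word` many times yields a document mixing background, topical, and sentiment-bearing words, which is exactly what the model then tries to un-mix.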
While a model like this can generally be trained in an unsupervised manner, its accuracy increases if proper priors are available for the positive and negative sentiment models. To that end, an existing sentiment service called Opinmind is used to obtain polarity information for a set of sentences. By polling Opinmind with a wide variety of sentences, the sentiment models can be made generic enough to be used on unseen data and topics. Given these priors, the parameters of the model are estimated using Expectation-Maximization.
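The quantity that Expectation-Maximization climbs is the log-likelihood of the observed words under the mixture. A sketch of that likelihood, with my own (hypothetical) parameter names for the mixing weights, looks like this:

```python
import math

def word_log_likelihood(counts, p_background, p_topics, p_pos, p_neg,
                        lambda_b, pi, delta):
    """Log-likelihood of word counts under the mixture: each word comes
    from the background model with weight lambda_b, otherwise from topic j
    (weight pi[j]) via its neutral, positive, or negative word
    distribution (weights delta = (neutral, positive, negative)).
    All distributions are {word: probability} dicts; names are my own."""
    ll = 0.0
    for w, c in counts.items():
        p = lambda_b * p_background.get(w, 0.0)
        for j, p_topic in enumerate(p_topics):
            p += (1 - lambda_b) * pi[j] * (
                delta[0] * p_topic.get(w, 0.0)
                + delta[1] * p_pos.get(w, 0.0)
                + delta[2] * p_neg.get(w, 0.0))
        ll += c * math.log(p)
    return ll

# Toy example with made-up distributions:
ll = word_log_likelihood(
    {"battery": 2, "great": 1},
    p_background={"the": 1.0},
    p_topics=[{"battery": 1.0}],
    p_pos={"great": 1.0},
    p_neg={"awful": 1.0},
    lambda_b=0.2, pi=[1.0], delta=(0.6, 0.2, 0.2))
print(ll)
```

The E-step would compute, for each word occurrence, the posterior probability of each branch of this sum; the M-step would re-estimate the distributions and weights from those posteriors. The priors from Opinmind simply pull `p_pos` and `p_neg` toward known polarity words.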
Once the model is trained, the following tasks can be performed:
- Rank sentences with respect to topics
- Categorize sentences by sentiments
- Reveal the overall sentiment for complete documents or for topics
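As a rough illustration of the second task, a sentence can be labeled by comparing its likelihood under the two sentiment word distributions. This is a simplified stand-in for the paper's posterior-based categorization; the distributions below are toy data.

```python
import math

def classify_sentence(words, p_pos, p_neg, epsilon=1e-10):
    """Label a sentence by whichever sentiment word distribution assigns
    it higher log-likelihood (epsilon smooths unseen words). A simplified
    stand-in for the model's posterior-based sentence categorization."""
    score_pos = sum(math.log(p_pos.get(w, epsilon)) for w in words)
    score_neg = sum(math.log(p_neg.get(w, epsilon)) for w in words)
    return "positive" if score_pos > score_neg else "negative"

# Toy sentiment models:
p_pos = {"great": 0.6, "love": 0.4}
p_neg = {"awful": 0.5, "hate": 0.5}
print(classify_sentence(["great", "love"], p_pos, p_neg))
```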
Sentiment Dynamic Analysis
The authors not only aim for a good model that provides sentiment information for each topic in the text; they also present a method to track sentiment over time. To do this, an HMM is built using the parameters computed by the mixture model. First, all documents are sorted by timestamp and concatenated into one long sequence of words. The HMM is then constructed with one controlling node E, connected to a node for each topic; each topic node is in turn connected to three nodes representing positive, negative, and neutral sentiment, respectively. The Baum-Welch algorithm learns the transition probabilities between the states and the output probabilities for E, and the Viterbi algorithm is then used to decode the collection sequence. Computing topic and sentiment dynamics is then just a matter of counting how often the corresponding state is tagged.
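The decoding step is standard Viterbi: given transition and emission probabilities, find the most likely state sequence for the word sequence. A minimal sketch follows; the state names and probabilities are illustrative stand-ins for the paper's topic/sentiment states, not its learned parameters.

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most likely state sequence for an observation sequence.
    Tracks (probability, path) per state and extends step by step."""
    V = [{s: (start_p[s] * emit_p[s].get(obs[0], 1e-12), [s])
          for s in states}]
    for w in obs[1:]:
        row = {}
        for s in states:
            prob, path = max(
                (V[-1][prev][0] * trans_p[prev][s] * emit_p[s].get(w, 1e-12),
                 V[-1][prev][1] + [s])
                for prev in states)
            row[s] = (prob, path)
        V.append(row)
    return max(V[-1].values())[1]

# Toy states standing in for one topic node and its sentiment nodes:
states = ["topic", "pos", "neg"]
start_p = {s: 1 / 3 for s in states}
trans_p = {s: {t: 1 / 3 for t in states} for s in states}
emit_p = {"topic": {"battery": 0.9},
          "pos": {"great": 0.9},
          "neg": {"awful": 0.9}}
print(viterbi(["battery", "great", "awful"], states, start_p, trans_p, emit_p))
```

Once every word in the collection sequence is tagged with a state like these, counting state occurrences per time window directly yields the topic and sentiment dynamics.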
For the evaluation of the sentiment models, a series of training data sets is constructed, again using Opinmind, covering 10 topics. Using 10-fold cross-validation, each training set covers 9 topics, and after training the model is confronted with the held-out topic. To measure how well the sentiment model matches the unseen topic, the Kullback-Leibler divergence is computed between the distribution of the trained sentiment model and the one estimated on the unseen topic. The results, as reported in a graph, are roughly 19 for the positive and 21 for the negative model. The rest of the model is unfortunately only illustrated with sample output instead of being evaluated quantitatively.
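To make the evaluation metric concrete, here is a minimal KL-divergence sketch over word distributions. The two distributions are invented toy data, not the paper's models; a lower value means the trained model matches the unseen topic's distribution more closely.

```python
import math

def kl_divergence(p, q, epsilon=1e-10):
    """KL(p || q) between two {word: probability} distributions over
    their shared vocabulary; epsilon-smoothing avoids log(0) for words
    one distribution has never seen."""
    vocab = set(p) | set(q)
    return sum(
        (p.get(w, 0.0) + epsilon)
        * math.log((p.get(w, 0.0) + epsilon) / (q.get(w, 0.0) + epsilon))
        for w in vocab)

# Toy stand-ins for a trained sentiment model and the unseen topic's model:
trained = {"great": 0.7, "love": 0.3}
unseen = {"great": 0.4, "love": 0.6}
print(kl_divergence(trained, unseen))
```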
The paper presents a very interesting model, but unfortunately it is not evaluated very thoroughly. The only part that is evaluated uses KL-divergence rather than the more widely used perplexity measure. Although the sample output looks good and the paper includes plenty of formulas showing the machinery of the model, concrete conclusions about its performance are hard to draw.