- Christopher Scaffidi, Kevin Bierhoff, Eric Chang, Mikhael Felker, Herman Ng, and Chun Jin
| | |
|---|---|
| Aspect Detection Method | Frequency-based, uses baseline statistics |
| Sentiment Analysis Method | N/A |
This work can be seen as a follow-up to Hu & Liu (2004). It improves on that work by using baseline statistics of word frequencies in English, together with probability-based heuristics, to identify aspects of product categories. Before aspect extraction, Red Opal is given statistics on lemma frequencies in generic text. For this to work properly, that text has to be roughly in the same style as the reviews, so a corpus of conversational English is chosen to derive the statistics from. Additionally, words that are more likely to occur as a non-noun than as a noun are put on an ignore list, since product aspects are restricted to nouns in this work. For each review, the frequency of each noun is then determined, and the probability is computed that the noun would occur that often in generic English. This is based on the following two assumptions:
- It is assumed that the occurrence of lemma $x$ in position $i$ is independent of whether lemma $x$ occurs in any other position $j$. In other words: $P(x_i \mid x_j) = P(x_i)$ for all $j \neq i$.
- It is assumed that the occurrence of lemma $x$ in position $i$ is independent of the position $i$ itself. In other words: $P(x_i) = P(x)$ for all positions $i$.
Of course, these assumptions are false in practice, but performance does not seem to suffer, and they make the problem tractable. Applying some approximation techniques yields the following formula for the probability that lemma $x$ appears $n_x$ times in a review corpus of $N$ words purely by chance, given its baseline probability $p_x$:

$$P(n_x) \approx \frac{(N p_x)^{n_x}\, e^{-N p_x}}{n_x!}$$

If $n_x > N p_x$ and $P(n_x)$ is small, then it is unlikely that $x$ occurred this often just by chance, which means it is probably an aspect. When a word $x$ is not encountered in the baseline corpus, Red Opal uses the average probability as a default value. The framework is extended to bigrams as well, since aspects are often described using two nouns together.
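The scoring step can be sketched as follows. This is a minimal illustration, not the paper's implementation: the Poisson-style surprise function, the `default_prob` fallback, and the `threshold` value are assumptions made for the example.

```python
import math

def poisson_surprise(count, total_words, baseline_prob):
    """Probability of seeing `count` occurrences of a lemma purely by chance,
    using a Poisson approximation with mean N * p_x (an assumption here)."""
    lam = total_words * baseline_prob
    return (lam ** count) * math.exp(-lam) / math.factorial(count)

def score_aspects(noun_counts, total_words, baseline_probs, default_prob,
                  threshold=1e-6):
    """Return nouns whose observed frequency is unlikely under the baseline.

    noun_counts:    {noun: observed count in the reviews}
    baseline_probs: {noun: probability in generic English}
    default_prob:   fallback for nouns missing from the baseline corpus
    """
    aspects = []
    for noun, count in noun_counts.items():
        p = baseline_probs.get(noun, default_prob)
        # Flag the noun only if it occurs MORE often than expected (n_x > N*p_x)
        # AND that excess would be very unlikely by chance.
        if count > total_words * p and poisson_surprise(count, total_words, p) < threshold:
            aspects.append(noun)
    return aspects
```

For example, in a 10,000-word review corpus, a noun like "battery" occurring 30 times against a baseline probability of 0.0005 (expected count 5) would be flagged, while a very common word occurring near its expected rate would not.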
Evaluation & Discussion
The system is evaluated by generating a list of product aspects from the set of product reviews and manually marking each entry in the list with a score of 0, 1, 2, or 3. The marking was done by students who responded to a survey invitation (10 usable responses in total). To compute precision, all entries that were marked with a 2 or a 3 by at least half of the evaluators were counted as correct.
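The correctness criterion can be made concrete with a small sketch. The data layout (one list of per-evaluator scores for each candidate aspect) is an assumption made for illustration, not the paper's format.

```python
def is_correct(entry_scores, min_score=2):
    """An entry counts as correct when at least half of the evaluators
    gave it min_score or higher (the survey used scores 0-3)."""
    votes = sum(1 for s in entry_scores if s >= min_score)
    return 2 * votes >= len(entry_scores)

def precision(marks):
    """Fraction of generated aspect entries judged correct."""
    flags = [is_correct(entry_scores) for entry_scores in marks]
    return sum(flags) / len(flags)
```

With two candidate aspects, one marked [3, 3, 2, 1] and one marked [0, 1, 0, 2], only the first clears the majority threshold, giving a precision of 0.5.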
Unfortunately, recall is not computed, so while the reported precision is quite high (85%–90%), we do not know how many true aspects were missed. I consider that a serious shortcoming, as it is relatively easy to achieve high precision at the cost of very low recall. The work with the panel data is quite interesting, for example the finding that users tend to find bigram aspects more insightful than single-noun aspects, though the number of respondents is rather low (only 10 were used for the analysis). Something I very much like is the inclusion of a complexity analysis, which shows the algorithm runs in O(n) time.
In my opinion, the main contribution is the idea of using baseline statistics when selecting frequently occurring words as aspects. With Hu & Liu (2004), one could get incorrect aspects simply because those words are common in English in general.