Comparative experiments on sentiment classification for online product reviews · EMOTICONES

Authors

Cui, Hang; Mittal, Vibhu; Datar, Mayur

Number of authors

3

Title

Comparative experiments on sentiment classification for online product reviews

Year of publication

2006

Reference (APA)

Cui, H., Mittal, V., & Datar, M. (2006). Comparative experiments on sentiment classification for online product reviews. Proceedings of the 21st national conference on Artificial intelligence - Volume 2, 1265‑1270. https://dl.acm.org/doi/10.5555/1597348.1597389

Keywords

ND

URL

https://dl.acm.org/doi/10.5555/1597348.1597389

Article accessibility

Open access

Field

Natural Language Processing

Content type (theoretical, applied, methodological)

Applied

Method

We conduct experiments on a corpus of online reviews, crawled from the Web, with an average length of over 800 bytes. Such a large-scale data set allows us not only to train language models and cull high-order n-grams as features, but also to study the effectiveness and robustness of classifiers in a simulated Web context.
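
As an illustration of what culling n-grams as features could look like, here is a minimal sketch that counts all word n-grams up to a chosen order in one review. The whitespace tokenizer, the function name `extract_ngrams`, and the `max_n` parameter are assumptions made for this example; this record does not describe the paper's own tokenization or feature selection.

```python
from collections import Counter


def extract_ngrams(text, max_n=3):
    """Count every word n-gram of order 1..max_n in `text`.

    Tokenization is a plain lowercase whitespace split (an assumption,
    not the paper's preprocessing).
    """
    tokens = text.lower().split()
    counts = Counter()
    for n in range(1, max_n + 1):
        # Slide a window of n tokens across the review.
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


features = extract_ngrams("not worth the money", max_n=3)
```

Higher-order n-grams such as "not worth the" keep the negation attached to the words it modifies, which a unigram feature set would lose.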

We study multiple classification algorithms for processing large-scale data. We employ three algorithms: (i) Winnow (Nigam & Hurst 2004), (ii) a generative model based on language modeling, and (iii) a discriminative classifier that employs projection based learning (Shalev-Shwartz et al. 2004).
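
To make the discriminative option more concrete, the sketch below shows the textbook binary Passive-Aggressive update with hinge loss over sparse feature dictionaries. It is a generic illustration, not necessarily the variant used in the paper; the function names `pa_train` and `pa_predict`, the `epochs` parameter, and the toy reviews are all hypothetical.

```python
def pa_score(weights, feats):
    """Dot product between a sparse weight dict and a sparse feature dict."""
    return sum(weights.get(f, 0.0) * v for f, v in feats.items())


def pa_train(examples, epochs=5):
    """Train on (feature_dict, label) pairs with label in {+1, -1}."""
    weights = {}
    for _ in range(epochs):
        for feats, label in examples:
            # Hinge loss: zero when the example is classified with margin >= 1.
            loss = max(0.0, 1.0 - label * pa_score(weights, feats))
            norm_sq = sum(v * v for v in feats.values())
            if loss == 0.0 or norm_sq == 0.0:
                continue  # passive step: no change needed (or empty example)
            # Aggressive step: smallest update that reaches a margin of 1.
            tau = loss / norm_sq
            for f, v in feats.items():
                weights[f] = weights.get(f, 0.0) + tau * label * v
    return weights


def pa_predict(weights, feats):
    """Return +1 (positive) or -1 (negative)."""
    return 1 if pa_score(weights, feats) >= 0.0 else -1


# Toy training set (hypothetical): one positive and one negative review.
toy_examples = [
    ({"great": 1.0, "camera": 1.0}, 1),
    ({"terrible": 1.0, "camera": 1.0}, -1),
]
toy_weights = pa_train(toy_examples)
```

On this toy data the shared word "camera" ends up with a weight near zero while "great" and "terrible" carry the decision, which is the behaviour one would hope for when reviews mix positive and negative vocabulary.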

Use case

Online customer reviews

Objectives of the article

This paper looks at a simplified version of the problem: classifying online product reviews into positive and negative classes. We discuss a series of experiments with different machine learning algorithms in order to experimentally evaluate various trade-offs.

Research question(s)/Hypotheses/Conclusion

Research question(s): Evaluating text fragments for positive and negative subjective expressions and their strength can be important in applications such as single- or multi-document summarization, document ranking, data mining, etc. This paper looks at a simplified version of the problem: classifying online product reviews into positive and negative classes.

Hypothesis(es): We conjecture that those experiments were hindered by the small training corpora, and thus were not able to show the effectiveness of high-order n-grams (n > 3) in discerning subtleties in expressing sentiments.
[...]
One of the main difficulties is that people typically use both positive and negative words in the same review, regardless of the rating score. As such, we hypothesize that a discriminative classifier could gain more strength in differentiating the mixed sentiments. We experimentally compare a discriminative model with a generative model based on language modeling to verify this hypothesis.

Conclusion(s): In this paper, we presented the experiments we have done on sentiment classification using a large-scale data set. Our experimental results show that a discriminative classifier combined with high-order n-grams as features can achieve comparable or better performance than that reported in academic papers. More importantly, this paper shows that sentiment classification can be learned from online product reviews, even with very disparate products and authors. In addition, we have shown that high-order n-grams do help in discriminating the articles' polarity in the mixture context. This observation, based on a large-scale data set, has not been verified before.

Theoretical framework/Authors

Classification of words according to their semantic orientation (Hatzivassiloglou and McKeown, 1997; Turney and Littman, 2003)

Sentiment classification at the article level (Pang et al., 2002; Pang and Lee, 2004)

Classifying polarity of documents (Nigam and Hurst, 2004)

Online product reviews (Dave et al., 2003; Hu and Liu, 2004; Popescu and Etzioni, 2005; Wilson et al., 2005)

Classifiers (Shalev-Shwartz et al., 2004; Manning and Schütze, 1999; Nigam and Hurst, 2004)

Key concepts

Sentiment analysis

Data collected (source type)

We accumulate reviews about electronic products like digital cameras, laptops, PDAs, MP3 players, etc. from Froogle [Google Shopping].

Each review comes with the full text and the rating score by the reviewer. These reviews are crawled from prominent sites, such as cnet.com, ciao.co.uk and shopping.yahoo.com.

Definition of emotions

No definition

Negative, neutral, positive labeling

Scale of experiment (number of accounts)

The size of the whole corpus is around 0.4GB, including a total of over 320k product reviews about over 80k unique products. The average length of the reviews is 875 bytes.

Associated technologies

Passive-Aggressive (PA) Algorithm Based Classifier

Language Modeling (LM) Based Classifier

Winnow Classifier

N-gram models
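
For the language-modeling entry above, a minimal sketch of the general idea: train one language model per class and label a review by whichever model gives it the higher likelihood. This example uses unigrams with add-one smoothing purely for brevity; the n-gram order and smoothing method actually used in the paper are not stated in this record, and the function names and toy reviews are hypothetical.

```python
import math
from collections import Counter


def train_unigram_lm(docs):
    """Return (token counts, total token count) for a list of documents."""
    counts = Counter(tok for doc in docs for tok in doc.lower().split())
    return counts, sum(counts.values())


def log_likelihood(text, model, vocab_size):
    """Log-probability of `text` under an add-one smoothed unigram model."""
    counts, total = model
    return sum(
        math.log((counts[tok] + 1) / (total + vocab_size))
        for tok in text.lower().split()
    )


def classify(text, pos_model, neg_model, vocab_size):
    """Pick the class whose language model assigns the higher likelihood."""
    pos = log_likelihood(text, pos_model, vocab_size)
    neg = log_likelihood(text, neg_model, vocab_size)
    return "positive" if pos >= neg else "negative"


# Toy corpora (hypothetical), two reviews per class.
pos_model = train_unigram_lm(["great camera sharp pictures", "love this great player"])
neg_model = train_unigram_lm(["terrible battery poor screen", "poor build terrible support"])
vocab_size = len(set(pos_model[0]) | set(neg_model[0]))

label = classify("great pictures", pos_model, neg_model, vocab_size)
```

Unlike a discriminative classifier, this generative approach models each class separately and never directly penalizes features that appear in both classes.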

Mention of ethics

ND

Communicative purpose

A large amount of Web content is subjective and reflects people's opinions. With the rapid growth of the Web, more and more people write reviews for all types of products and services and place them online. It is becoming common practice for a consumer to learn how others like or dislike a product before buying, or for a manufacturer to keep track of customer opinions on its products to improve user satisfaction. However, as the number of reviews available for any given product grows, it becomes harder and harder for people to understand and evaluate what the prevailing or majority opinion about the product is.
Sentiment classification, also known as affect or polarity classification, attempts to address this problem by (i) presenting the user with an aggregate view of the entire data set, summarized by a label or a score, and (ii) segmenting the articles/text-fragments into two classes that can be further explored as desired. While many review sites, such as Epinions, CNet and Amazon, help reviewers quantify the positivity of their comments, sentiment classification can still play an important role in classifying documents that do not have explicit ratings.
Often, web sites such as personal blogs carry user reviews that describe personal experiences with a particular product without giving any score. The review comments from these sites are valuable because they cover many more products than formal review sites do.

Abstract

Evaluating text fragments for positive and negative subjective expressions and their strength can be important in applications such as single- or multi-document summarization, document ranking, data mining, etc. This paper looks at a simplified version of the problem: classifying online product reviews into positive and negative classes. We discuss a series of experiments with different machine learning algorithms in order to experimentally evaluate various trade-offs, using approximately 100K product reviews from the web.