SUIT: A Supervised User-Item Based Topic Model for Sentiment Analysis
- Authors
- Li, Fangtao; Wang, Sheng; Liu, Shenghua; Zhang, Ming
- Number of authors
- 4
- Title
- SUIT: A Supervised User-Item Based Topic Model for Sentiment Analysis
- Year of publication
- 2014
- Reference (APA)
- Li, F., Wang, S., Liu, S., & Zhang, M. (2014). SUIT: A Supervised User-Item Based Topic Model for Sentiment Analysis. Proceedings of the AAAI Conference on Artificial Intelligence, 28(1), Article 1. https://doi.org/10.1609/aaai.v28i1.8947
- Keywords
- ND
- URL
- https://ojs.aaai.org/index.php/AAAI/article/view/8947
- DOI
- https://doi.org/10.1609/aaai.v28i1.8947
- Article accessibility
- Open access
- Field
- Natural Language Processing
- Content type (theoretical, applied, methodological)
- Applied
- Method
- We propose a novel Supervised User-Item based Topic model, the SUIT model, which can simultaneously utilize the textual topic and user-item factors for sentiment analysis. In this model, the user-item information is represented in the form of a user latent factor and an item latent factor. We conduct experiments on both a review dataset and a microblog dataset.
- Use case
- ND
- Objectives of the article
- In this paper, we propose a new Supervised User-Item based Topic model, called the SUIT model, for sentiment analysis. It can simultaneously utilize the textual topic and latent user-item factors.
- Research question(s)/Hypotheses/Conclusion
- Research question(s): Probabilistic topic models have been widely used for sentiment analysis. However, most existing topic methods model only the sentiment text and do not consider the user who expresses the sentiment or the item on which the sentiment is expressed.
- Hypothesis(es): Since different users may use different sentiment expressions for different items, we argue that it is better to incorporate the user and item information into the topic model for sentiment analysis. In this paper, we propose a new Supervised User-Item based Topic model, called the SUIT model, for sentiment analysis. It can simultaneously utilize the textual topic and latent user-item factors. Our proposed method uses the tensor outer product of the text topic proportion vector, the user latent factor and the item latent factor to model the sentiment label generalization.
- Conclusion(s): The results demonstrate the advantages of our model. It shows significant improvement compared with supervised topic models and collaborative filtering methods.
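The hypothesis above mentions a tensor outer product of the text topic proportion vector, the user latent factor and the item latent factor. The following is a minimal NumPy sketch of what such a three-way outer product looks like, not the authors' implementation. The vector values, the dimensions, the `weights` tensor and the linear readout in `score` are all assumptions made for illustration; the record does not specify how the paper turns the tensor into a sentiment label.

```python
import numpy as np

def outer3(theta, u, v):
    """Three-way outer product: T[k, a, b] = theta[k] * u[a] * v[b]."""
    return np.einsum("k,a,b->kab", theta, u, v)

def score(theta, u, v, weights):
    """Hypothetical linear readout (an assumption, not taken from the paper):
    a weighted sum over all entries of the outer-product tensor."""
    return float(np.sum(weights * outer3(theta, u, v)))

# Made-up example values, for illustration only.
theta = np.array([0.7, 0.3])       # topic proportions of one text (sum to 1)
u = np.array([1.0, -0.5, 0.2])     # user latent factor
v = np.array([0.4, 0.1, -1.0])     # item latent factor
weights = np.ones((2, 3, 3))       # hypothetical weight tensor

T = outer3(theta, u, v)            # shape (2, 3, 3)
s = score(theta, u, v, weights)    # scalar sentiment score under these assumptions
```

Each entry of `T` couples one topic with one user dimension and one item dimension, which is how a single score can depend jointly on what is written, who wrote it and what it is about.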
- Theoretical framework/Authors
- Sentiment analysis (Liu, 2010; Pang and Lee, 2009)
- Probabilistic topic model (Hofmann, 1999; Blei, Ng and Jordan, 2003; Blei and McAuliffe, 2007)
- Sentiment modeling in an unsupervised framework (Mei et al., 2007; Brody and Elhadad, 2010; Li et al., 2010; Jo and Oh, 201; Titov and McDonald, 2008a; Lin and He, 2009)
- Supervised variants of the Latent Dirichlet Allocation (LDA) model (Blei and McAuliffe, 2007; Simon et al., 2008; Zhu et al., 2009; Wang et al., 2011; Zhao et al., 2010; Titov and McDonald, 2008b)
- Importance of user and item information (Tan et al., 2010; Li et al., 2011)
- Key concepts
- Sentiment analysis
- Data collected (source type)
- We conduct our experiments on two datasets. The first is a movie review dataset. The second is a dataset crawled from a microblog site.
- Review dataset: the first dataset is a collection of movie reviews. Following Pang and Lee's setting (Pang and Lee, 2005), all the review stars are mapped onto a sentiment scale of 1 to 4.
- Microblog dataset: the second dataset is crawled from a microblog site. We first filter out the spam tweets, and then manually annotate the remaining tweets into three categories (0 to 2): negative (0), neutral (1), and positive (2). Negative, neutral and positive labels refer to the sentiment level that the user has assigned to the item. It is also necessary to filter out the spam users who only post advertisements.
- Definition of emotions
- Definition of emotion analysis
- Scale of experiment (number of accounts)
- Review dataset: 15507 reviews with explicit stars. There are 458 users, 4543 products and 15507 reviews in total.
- Microblog dataset: 14 movies, which were released in the last year. After filtering the spam tweets and the spam users, we finally got 387 users and 1299 tweets, which include 403 negative tweets, 431 neutral tweets and 445 positive tweets.
- Associated technologies
- ND
- Mention of ethics
- ND
- Communicative purpose
- These reviews are very useful for general users, who often read product reviews before making the final decision. Companies also hope to track customers' opinions to improve the quality of their products or services.
- Abstract
- Probabilistic topic models have been widely used for sentiment analysis. However, most existing topic methods model only the sentiment text and do not consider the user who expresses the sentiment or the item on which the sentiment is expressed. Since different users may use different sentiment expressions for different items, we argue that it is better to incorporate the user and item information into the topic model for sentiment analysis. In this paper, we propose a new Supervised User-Item based Topic model, called the SUIT model, for sentiment analysis. It can simultaneously utilize the textual topic and latent user-item factors. Our proposed method uses the tensor outer product of the text topic proportion vector, the user latent factor and the item latent factor to model the sentiment label generalization. Extensive experiments are conducted on two datasets: a review dataset and a microblog dataset. The results demonstrate the advantages of our model. It shows significant improvement compared with supervised topic models and collaborative filtering methods.