What's great and what's not: learning to classify the scope of negation for improved sentiment analysis · EMOTICONES

Authors

Councill, Isaac; McDonald, Ryan; Velikovich, Leonid

Number of authors

3

Title

What's great and what's not: learning to classify the scope of negation for improved sentiment analysis

Year of publication

2010

Reference (APA)

Councill, I., McDonald, R., & Velikovich, L. (2010). What’s great and what’s not: Learning to classify the scope of negation for improved sentiment analysis. Proceedings of the Workshop on Negation and Speculation in Natural Language Processing, 51–59. https://aclanthology.org/W10-3110

Keywords

N/A

URL

https://aclanthology.org/W10-3110

Article accessibility

Open access

Field

Natural Language Processing

Content type (theoretical, applied, methodological)

Applied

Method

Creation of a detection system capable of correctly identifying the presence or absence of negation in portions of text that are expressions of feelings.
Towards this end, the authors describe both an annotated negation span corpus and a negation span detector trained on that corpus. The span detector is based on conditional random fields (CRFs) (Lafferty, McCallum, and Pereira, 2001), a structured prediction learning framework common in sub-sentential natural language processing tasks, including sentiment analysis (Choi and Cardie, 2007; McDonald et al., 2007).
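
As an illustration only, the sketch below shows how negation scope detection might be framed as the kind of token-level sequence labeling problem a CRF consumes. It is not the authors' implementation: the cue lexicon, the label names `NEG`/`O`, and the functions `scope_to_labels` and `token_features` are hypothetical, and the features are far simpler than the parser-informed features the paper relies on. The example scope includes the cue itself, following the BioScope convention described in this record.

```python
from typing import Dict, List, Tuple

# Hypothetical cue lexicon, for illustration only (not the paper's cue list).
NEGATION_CUES = {"not", "no", "never", "n't", "without"}


def scope_to_labels(n_tokens: int, scope: Tuple[int, int]) -> List[str]:
    """Label tokens inside the half-open span [start, end) as 'NEG', all others as 'O'."""
    start, end = scope
    return ["NEG" if start <= i < end else "O" for i in range(n_tokens)]


def token_features(tokens: List[str], i: int) -> Dict[str, object]:
    """Build a simple feature dict for token i: word, cue flag, distance to nearest cue on the left."""
    lowered = [t.lower() for t in tokens]
    left_cues = [j for j in range(i + 1) if lowered[j] in NEGATION_CUES]
    return {
        "word": lowered[i],
        "is_cue": lowered[i] in NEGATION_CUES,
        # -1 means no cue at or before this token (an assumed convention).
        "dist_to_left_cue": i - left_cues[-1] if left_cues else -1,
    }


tokens = ["this", "is", "not", "great", "at", "all"]
# Assumed annotation: the scope covers the cue "not" and the word "great".
labels = scope_to_labels(len(tokens), (2, 4))
features = [token_features(tokens, i) for i in range(len(tokens))]

for token, label, feats in zip(tokens, labels, features):
    print(token, label, feats)
```

A sequence model such as a CRF would then be trained on pairs of feature sequences and label sequences like these, so that the label of each token can depend on the labels of its neighbours.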

Use case

N/A

Objectives of the article

The goal of the present work is to develop a system that is robust to differences in the intended scope of negation introduced by the syntactic and lexical features in each negation category. In particular, as the larger context of this research involves sentiment analysis, it is desirable to construct a negation system that can correctly identify the presence or absence of negation in spans of text that are expressions of sentiment.

Research question(s)/Hypotheses/Conclusion

Research question(s): This paper describes an approach to negation scope detection in the context of sentiment analysis, particularly with respect to sentiment expressed in online reviews. The canonical need for proper negation detection in sentiment analysis can be expressed as the fundamental difference in semantics between the phrases “this is great” and “this is not great.” Unfortunately, expressions of negation are not always so syntactically simple.

Hypothesis(es): This paper presents a system for identifying the scope of negation using shallow parsing, by means of a conditional random field model informed by a dependency parser.

Conclusion(s): Results on the standard BioScope corpus compare favorably to the best results reported to date, using a software stack that is significantly simpler than the best-performing approach. Cross-training, in which a model is learned on one corpus and tested on another, suggests that scope boundary detection in the product reviews corpus may be a more difficult learning problem, although the method used to annotate the reviews corpus may result in a more consistent representation of the problem. Finally, the negation system was built into a state-of-the-art sentiment analysis system in order to measure the practical impact of accurate negation scope detection, with dramatic results. The negation system improved the precision of positive sentiment polarity detection by 35.9% and negative sentiment polarity detection by 46.8%. Error reduction on the recall measure was less dramatic, but still significant, showing improved recall for positive polarity of 20.0% and improved recall for negative polarity of 6.6%.

Theoretical framework/Authors

Linguistic negation (Givon, 1993; Tottie, 1991)

Negation and its scope in the context of sentiment analysis (Moilanen and Pulman, 2007; Choi and Cardie, 2008; Danescu-Niculescu-Mizil et al., 2009; Wilson et al., 2005; Nakagawa et al., 2010)

Conditional random field (Lafferty, McCallum, and Pereira, 2001)

CRF in the context of sentiment analysis (Choi and Cardie, 2007; McDonald et al., 2007)

CRF and negation in the context of sentiment analysis (Morante and Daelemans, 2009)

Key concepts

Sentiment analysis

Data collected (source type)

BioScope corpus (Vincze et al., 2008): annotated clinical radiology reports, biological full papers, and biological abstracts. Annotations in BioScope consist of labeled negation and speculation cues along with the boundary of their associated text scopes. Each cue is associated with exactly one scope, and the cue itself is considered to be part of its own scope.

Product Reviews corpus: a novel corpus was developed containing the text of entire reviews, annotated according to spans of negated text. A sample of product reviews was obtained by randomly sampling reviews from Google Product Search and checking for the presence of negation. Each review was manually annotated with the scope of negation by a single person, after achieving inter-annotator agreement of 91% with a second person on a smaller subset of 20 reviews containing negation.

Definition of emotions

No definition

Negative, neutral, positive labeling

Scale of experiment (number of accounts)

BioScope corpus (Vincze et al., 2008): 9 papers and a total of 2670 sentences

Product Reviews corpus: A sample of 268 product reviews was obtained by randomly sampling reviews from Google Product Search and checking for the presence of negation. The annotated corpus contains 2111 sentences in total, with 679 sentences determined to contain negation.

Associated technologies

Conditional random field

Structured prediction learning framework

Mention of ethics

N/A

Communicative purpose

The automatic detection of the scope of linguistic negation is a problem encountered in a wide variety of document understanding tasks, including but not limited to medical data mining, general fact or relation extraction, question answering, and sentiment analysis.

Abstract

Automatic detection of linguistic negation in free text is a critical need for many text processing applications, including sentiment analysis. This paper presents a negation detection system based on a conditional random field modeled using features from an English dependency parser. The scope of negation detection is limited to explicit rather than implied negations within single sentences. A new negation corpus is presented that was constructed for the domain of English product reviews obtained from the open web, and the proposed negation extraction system is evaluated against the reviews corpus as well as the standard BioScope negation corpus, achieving 80.0% and 75.5% F1 scores, respectively. The impact of accurate negation detection on a state-of-the-art sentiment analysis system is also reported.