Sentiment Analysis

Sentiment analysis estimates tone numerically. It works only when the measure matches how tone operates in your corpus.

What it is

In practice, sentiment analysis means one of three families of tools: dictionary methods, supervised classifiers, or LLM-based ratings. Before choosing among them, decide what the score is supposed to measure.

Dictionary methods count terms from a curated lexicon such as LIWC, VADER, NRC, or AFINN. They are transparent and easy to rerun. They struggle with sarcasm and negation, and their word lists often transfer poorly to a new domain.
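
To make the counting idea concrete, here is a minimal sketch of a dictionary scorer. The lexicon, the negator list, and the function names are invented for illustration and are not taken from LIWC, VADER, NRC, or AFINN. The optional flag shows one very crude way to treat negation (flipping a term that directly follows a negator), which real tools handle in more elaborate ways.

```python
import re

# Toy lexicon for illustration only; NOT drawn from any published resource.
TOY_LEXICON = {"good": 1, "great": 2, "bad": -1, "terrible": -2}
# Hypothetical negator list, also for illustration.
NEGATORS = {"not", "no", "never"}


def tokenize(text):
    """Lowercase the text and return runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def dictionary_score(text, handle_negation=False):
    """Sum lexicon weights over the tokens of `text`.

    With handle_negation=True, a lexicon term that directly follows
    a negator has its weight flipped.
    """
    tokens = tokenize(text)
    score = 0
    for i, tok in enumerate(tokens):
        weight = TOY_LEXICON.get(tok, 0)
        if handle_negation and i > 0 and tokens[i - 1] in NEGATORS:
            weight = -weight
        score += weight
    return score


sentence = "The reform is not good"
print(dictionary_score(sentence))                        # plain counting
print(dictionary_score(sentence, handle_negation=True))  # with the negation flip
```

Plain counting scores the example sentence as positive because it only sees "good". The one-token negation window fixes this case but would still miss "not very good" or sarcasm, which is why the result needs checking against the actual corpus.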

Supervised classifiers work best in-domain and require a labeling plan with validation. LLM-based ratings are quick to set up. Their scores can change with the prompt or model version. Treat that route as experimental unless your supervisor has approved it and you can evaluate it properly under the Ethics & AI policy.

The weak point depends on the material. Sarcasm-heavy social media breaks many dictionaries. Classifiers trained on movie reviews fail on policy documents. A thesis needs to show that the chosen measure is valid for the actual texts.
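
One simple way to check validity on the actual texts is to compare the method's output with human labels on a sample. The sketch below assumes numeric scores, three label names ("pos", "neu", "neg"), and a zero threshold; all of these are illustrative choices, and the function names are hypothetical.

```python
from collections import Counter


def score_to_label(score, threshold=0.0):
    """Map a numeric score to 'pos', 'neg', or 'neu' (assumed cut-offs)."""
    if score > threshold:
        return "pos"
    if score < -threshold:
        return "neg"
    return "neu"


def validate(scores, human_labels, threshold=0.0):
    """Return (accuracy, confusion) for scores against human labels.

    `confusion` counts (human_label, predicted_label) pairs.
    """
    if len(scores) != len(human_labels) or not scores:
        raise ValueError("scores and human_labels must be non-empty and equal in length")
    predicted = [score_to_label(s, threshold) for s in scores]
    correct = sum(p == h for p, h in zip(predicted, human_labels))
    confusion = Counter(zip(human_labels, predicted))
    return correct / len(scores), confusion


# Made-up scores and labels, for illustration only.
accuracy, confusion = validate([1.5, -0.5, 0.0, 2.0], ["pos", "neg", "neu", "neg"])
print(accuracy)
print(confusion)
```

The confusion counts are usually more informative than the single accuracy figure, because they show which kind of text the measure gets wrong.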

What you learn in the DH course

In the DH course, the sentiment unit is mainly about validation. Students practice the following.

Comparing dictionary methods and checking where each one breaks
Building a supervised classifier from labeled examples
Handling negation, intensifiers, and other contextual modifiers
Measuring inter-annotator agreement (Cohen’s kappa, Krippendorff’s alpha) on labeled data
Validating sentiment scores against human judgment
Reporting limits without treating the score as self-explanatory
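
As an illustration of the agreement step in the list above, Cohen's kappa for two annotators can be computed by hand from observed and chance agreement, kappa = (p_o - p_e) / (1 - p_e). This is a minimal sketch with a hypothetical function name and made-up labels; for real projects an established statistics package is the safer choice.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and equal in length")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's label proportions,
    # summed over all categories.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    categories = set(counts_a) | set(counts_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        # Both annotators used one identical label throughout; kappa is undefined.
        return float("nan")
    return (observed - expected) / (1 - expected)


# Made-up annotations, for illustration only.
annotator_1 = ["pos", "pos", "neg", "neg"]
annotator_2 = ["pos", "neg", "neg", "neg"]
print(cohens_kappa(annotator_1, annotator_2))
```

Here the annotators agree on three of four items, but half of that agreement is expected by chance, so kappa is noticeably lower than the raw agreement rate.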

What you need to learn first

Preprocessing. Dictionary methods depend heavily on tokenization and lemmatization. See Preprocessing.
Basic statistics. You need agreement metrics, confidence intervals, and a working sense of reliability.
Python or R. Python options include vaderSentiment, nltk, and transformers. R users can start with sentimentr or quanteda.sentiment.
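
For the basic-statistics prerequisite, one common way to attach a confidence interval to a validation accuracy is a percentile bootstrap. The sketch below uses only the standard library; the function name, the resample count, and the fixed seed are illustrative assumptions.

```python
import random


def bootstrap_accuracy_ci(correct_flags, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for accuracy.

    `correct_flags` holds 1 for each item the method labeled correctly
    and 0 otherwise.
    """
    if not correct_flags:
        raise ValueError("correct_flags must not be empty")
    rng = random.Random(seed)  # fixed seed so the interval can be reproduced
    n = len(correct_flags)
    # Resample the items with replacement and recompute accuracy each time.
    stats = sorted(
        sum(rng.choices(correct_flags, k=n)) / n for _ in range(n_resamples)
    )
    lower = stats[int((alpha / 2) * n_resamples)]
    upper = stats[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Made-up outcome: 8 of 10 validation items labeled correctly.
flags = [1] * 8 + [0] * 2
print(bootstrap_accuracy_ci(flags))
```

With only ten items the interval is wide, which is the point: a small validation sample supports only a weak claim about reliability.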

What you can do with it

Chart whether coverage of a policy turned negative after a key event
Compare the tone of government and opposition speeches across a legislative term
Track sentiment toward a country or leader in foreign-language press
Surface high-emotion passages for qualitative close reading
Build a scalar covariate for a topic model or regression
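
The first use case in the list, comparing tone before and after a key event, reduces to grouping document scores by date. The sketch below assumes each document is a (date, score) pair that has already been scored; the function name, dates, and scores are made up for illustration.

```python
from datetime import date
from statistics import mean


def mean_score_around_event(records, event_date):
    """Return (mean before, mean on or after) for (date, score) records."""
    before = [score for day, score in records if day < event_date]
    after = [score for day, score in records if day >= event_date]
    if not before or not after:
        raise ValueError("need at least one document on each side of the event")
    return mean(before), mean(after)


# Made-up records, for illustration only.
records = [
    (date(2020, 1, 1), 0.4),
    (date(2020, 1, 15), 0.2),
    (date(2020, 2, 1), -0.2),
    (date(2020, 2, 10), -0.4),
]
before, after = mean_score_around_event(records, date(2020, 1, 20))
print(before, after)
```

A difference in means is only a starting point; whether it reflects a real shift in tone still depends on the validation and uncertainty checks described earlier.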

Related methods

Preprocessing matters because dictionaries depend on tokens.
Framing Analysis can include sentiment as one dimension of a frame.
Topic Analysis pairs naturally with sentiment when tone varies by theme.
