Introduction
Chances are, if you're going to do a mid-2000s to mid-2010s style sentiment analysis, you're going to use a sentiment dictionary[These are also referred to as sentiment lexicons].[Or, you might be a colleague of mine and publishing an article in PNAS in 2022. Congrats!] When reaching for one of these things, you've got quite a few options. This embarras de richesse should raise the question, Which should I use? I say "should raise the question" because I feel like all too often these dictionaries are presented as issue-free, "sure just go ahead and use it" tools. For anyone who has studied measurement or thought about scale validation that is a truly apoplexy-inducing idea.
Both for your and my future reference, below I (will, upon completion) have a description of 10 or so dictionaries that I've come across quite a few times and found worth investigating why they were invented and if they should ever still be used.1 Regardless of what I say below, remember that you'll probably have to think about your application when choosing a dictionary; there is no one-size-fits-all best choice. Also, this post abstains from high-level thoughts about the entire enterprise of using sentiment dictionaries to get sentiment scores for sentences, utterances, documents, etc. That's a different kettle of fish.2
With that, let's go through these things in alphabetical order.
AFINN
The subtitle of the original publication says it all, "Evaluation of a word list for sentiment analysis in microblogs." In the paper's abstract, the great Dane and presumably the namesake of the lexicon, Finn Årup Nielsen, lays out more explicitly why he created a new sentiment lexicon: "There exist several affective word lists, e.g., ANEW (Affective Norms for English Words) developed before the advent of microblogging and sentiment analysis. I wanted to examine how well ANEW and other word lists performs for the detection of sentiment strength in microblog posts in comparison with a new word list specifically constructed for microblogs." There's AFINN's origin story.3
One of the unique features of the AFINN lexicon is that words are mapped to integers instead of merely {positive, negative}. Here you can see a few words at each value:
library(dplyr)    # pipes and verbs used throughout
library(stringr)  # str_detect(), used further down
library(tidytext) # get_sentiments(); the lexicons themselves arrive via textdata
library(glue)     # glue_collapse()

set.seed(1)
get_sentiments('afinn') %>%
  group_by(value) %>%
  add_count(name = 'count') %>%
  slice_sample(n = 4) %>%
  summarise(
    `no. words` = mean(count),
    words = glue_collapse(word, sep = ', ')) %>%
  arrange(value) %>%
  filter(value != 0)
# A tibble: 10 × 3
value `no. words` words
<dbl> <dbl> <glue>
1 -5 16 motherfucker, bitches, cocksuckers, bastard
2 -4 43 scumbag, fucking, fraudsters, fucked
3 -3 264 moron, destroy, despair, scandals
4 -2 966 animosity, censors, robs, touts
5 -1 309 imposing, unclear, demonstration, uncertain
6 1 208 share, extend, feeling, commit
7 2 448 tranquil, consent, supportive, sympathetic
8 3 172 audacious, classy, luck, gracious
9 4 45 exuberant, wonderful, rejoicing, wowww
10 5 5 hurrah, outstanding, superb, thrilled
And yes, I did set the seed above to avoid randomly showing you certain words.
You might be wondering
- why -5 to 5 and
- how he assigned words to those numbers
Quoting him on the former, "As SentiStrength it uses a scoring range from −5 (very negative) to +5 (very positive)." Convention is powerful. As for the latter, the question of how numbers were assigned to words: "Most of the positive words were labeled with +2 and most of the negative words with −2 […]. I typically rated strong obscene words […] with either −4 or −5." So he was basically winging it.
Another unique feature: the dictionary has 15 bigrams, 10 of which are below.4
set.seed(12)
get_sentiments('afinn') %>%
  filter(str_detect(word, ' ')) %>%
  slice_sample(n = 10)
# A tibble: 10 × 2
word value
<chr> <dbl>
1 cashing in -2
2 no fun -3
3 green wash -3
4 not good -2
5 dont like -2
6 not working -3
7 some kind 0
8 green washing -3
9 cool stuff 3
10 messing up -2
- Size: 2477 entries
- Coverage: In addition to the bog-standard sentiment words, it has words all the cool kids were saying in the late 2000s, early 2010s.
- R Packages: tidytext, lexicon, textdata
- Publication: Here
- Bottom Line: It's been superseded by VADER (below)
Bing (aka Hu and Liu)5
According to the man himself, "This list [i.e., the lexicon] was compiled over many years starting from our first paper (Hu and Liu, KDD-2004)." I'm not sure what the post-publication compilation process was, but the original process is well-described in the original publication. Essentially, they started with adjectives6 with obvious polarity (e.g., great, fantastic, nice, cool, bad, dull) as "seed words" and collected synonyms (and antonyms) of those words, then synonyms and antonyms of those words, and so on, iteratively. To do this, they used WordNet, a chill-ass semantic network. One thing that's nice about the resulting lexicon is that it's topic general. That is, though they developed this lexicon for the specific purpose of determining people's opinions about product features in product reviews, it has a generality beyond that.
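To make that iterative expansion concrete, here's a minimal sketch of the bootstrapping idea. It is not Hu and Liu's actual code: get_related() is a hypothetical stand-in for a WordNet synonym lookup, and the real procedure also propagates polarity, flipping it across antonym links.

# A minimal sketch of seed-word expansion, not Hu & Liu's actual code.
# get_related() is a hypothetical stand-in for a WordNet synonym lookup.
expand_lexicon <- function(seeds, get_related, iterations = 3) {
  lexicon <- seeds
  for (i in seq_len(iterations)) {
    related <- unique(unlist(lapply(lexicon, get_related)))
    lexicon <- union(lexicon, related)  # grow the word list each pass
  }
  lexicon
}

# Toy synonym table in place of WordNet, just to make this runnable
toy_synonyms <- list(great = c('fantastic', 'superb'),
                     fantastic = 'wonderful',
                     nice = 'pleasant')
get_related <- function(w) unlist(toy_synonyms[[w]])

expand_lexicon(c('great', 'nice'), get_related, iterations = 2)
# [1] "great"     "nice"      "fantastic" "superb"    "pleasant"  "wonderful"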
Actually looking at the words, you'll notice there's some weirdness.
head(get_sentiments('bing'), 10)
# A tibble: 10 × 2
word sentiment
<chr> <chr>
1 2-faces negative
2 abnormal negative
3 abolish negative
4 abominable negative
5 abominably negative
6 abominate negative
7 abomination negative
8 abort negative
9 aborted negative
10 aborts negative
First, I'm not sure what "2-faces" is. If you say that it's a solecistic rendering of "two-faced," I'd say probably. In their appropriate alphabetic order, both "two-faced" and "two-faces" appear later in the dictionary. Anyway, you'll notice as well that a lot of the words would reduce to a single lemma if we lemmatized the dictionary. You can think of that as a positive feature of the BING dictionary. It means you don't have to have lemmatized (or stemmed) text. But its inclusion of derived words seems a bit haphazard. The abort-aborted pair is there, but abolish is hanging out alone without its past tense.
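If you're curious how much the dictionary would shrink under lemmatization, here's a quick sketch. I'm reaching for textstem's lemmatize_words() as one option; any lemmatizer would do.

library(dplyr)
library(tidytext)
library(textstem)  # lemmatize_words(); one lemmatizer among several

get_sentiments('bing') %>%
  mutate(lemma = lemmatize_words(word)) %>%
  summarise(
    terms = n(),                 # raw dictionary entries
    lemmas = n_distinct(lemma))  # entries after collapsing to lemmas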
- Size: In the tidytext package the BING dictionary has 6786 terms (matching what his website says, "around 6800 words")
- R Packages: tidytext, lexicon, textdata
- Bottom Line: It's a classic with wide coverage, but not as good as VADER or NRC-EIL.
Loughran-McDonald Master Dictionary w/ Sentiment Word Lists
I'm honestly not sure why this dictionary is included in packages: not because it's bad7, but because it's so (so so so) niche. If you're doing text analysis on financial text (or aware of cool research doing this), please drop me a line and tell me about it.
If you want to learn about it, here's the page.
- R Packages: tidytext, lexicon, textdata
NRC (original)
Saif Mohammad and friends developed a few National Research Council (NRC) of Canada-sponsored sentiment dictionaries. The first of them assigns words not only polarity labels (positive, negative), but emotion labels as well:
tidytext::get_sentiments('nrc') %>%
  count(sentiment, sort = TRUE)
# A tibble: 10 × 2
sentiment n
<chr> <int>
1 negative 3318
2 positive 2308
3 fear 1474
4 anger 1246
5 trust 1230
6 sadness 1187
7 disgust 1056
8 anticipation 837
9 joy 687
10 surprise 532
These eight emotions below negative and positive were theorized by Bob Plutchik to be fundamental, or something.8 I'm going to ignore those emotions in this subsection. I'm also not going to talk too much about this dictionary, because it's superseded by its real-valued successors, the NRC-EIL and NRC-VAD (below).
Before I leave this lexicon, though, one bizarre thing about it: 81 words are both positive and negative (???)
get_sentiments('nrc') %>%
  filter(sentiment %in% c('negative', 'positive')) %>%
  add_count(word) %>%
  filter(n > 1) %>%
  select(-n)
# A tibble: 162 × 2
word sentiment
<chr> <chr>
1 abundance negative
2 abundance positive
3 armed negative
4 armed positive
5 balm negative
6 balm positive
7 boast negative
8 boast positive
9 boisterous negative
10 boisterous positive
# ℹ 152 more rows
As a matter of semantics, I get it for some of these (balm I do not get, though). Practically, if you're doing an analysis with this dictionary, you're probably going to want to remove all these terms before calculating sentiment scores, as sketched below.
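Here's one way to do that pruning, a minimal sketch reusing the same tidytext accessor as above:

library(dplyr)
library(tidytext)

nrc_polarity <- get_sentiments('nrc') %>%
  filter(sentiment %in% c('negative', 'positive'))

# the 81 words carrying both labels...
ambiguous <- nrc_polarity %>%
  count(word) %>%
  filter(n > 1)

# ...and a polarity dictionary with them dropped
nrc_clean <- anti_join(nrc_polarity, ambiguous, by = 'word')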
- R Packages: tidytext, lexicon, textdata
- Size: 5464 words (positive and negative, not counting the words with ambiguous polarity)
- Bottom Line: Superseded by Saif Mohammad's subsequent efforts.
NRC-EIL
This thing's value-added, as my economist friend likes to say, is that instead of a simple "positive" or "negative" value for each sentiment entry, there's a number between -1 and 1. "Real-valued," as measurement-heads say. This is actually extremely important if you're aggregating word-level sentiment into something bigger (which … honestly, email me if you're doing anything other than that). How they got these real-valued polarity scores is actually a pretty interesting methods story if you're into that kind of thing, but I won't go into "MaxDiff scaling" here. One very important thing to note, though: valence/polarity isn't in this dataset. This vexed me for a minute before I realized that it's in the real-valued NRC-VAD (below). So, on the off chance you're looking for the best measurement of Plutchik's eight basic emotions (and that's a very off chance), this is the best place to look.
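As a sketch of what that aggregation looks like, textdata ships an accessor for this lexicon. I'm assuming the columns come back as term, score, and AffectDimension; double-check on your install.

library(dplyr)
library(tidytext)
library(textdata)

eil <- lexicon_nrc_eil()  # assumed columns: term, score, AffectDimension

tibble(id = 1, text = 'I was furious at first, then strangely joyful') %>%
  unnest_tokens(word, text) %>%
  inner_join(eil, by = c('word' = 'term')) %>%
  summarise(intensity = mean(score), .by = c(id, AffectDimension))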
- R Packages: textdata
- Examples: You can see an example of an analysis of Ezra Klein's podcasts here.
- Bottom Line: I'm not sure why it exists, but it's the only lexicon doing what it's doing.
NRC-VAD
Once you have a hammer, everything starts looking like a nail. That's how I explain the existence of this dictionary to myself. Saif Mohammad & Co. found this cool scaling technique and were like, "On what dimensions can we scale more words?" They seem to have stumbled on the idea that the three most fundamental dimensions in concept space are valence, arousal, and dominance. Maybe it's an indictment of my memory, perhaps an indictment of the psychology department at the University of Arkansas, but I managed to graduate summa cum laude with a degree in psychology without ever hearing of this. Regardless of supposed fundamental dimensions in concept space, valence is fundamental, and it's just another word for polarity, which is the main thing people are measuring with sentiment dictionaries.
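If you do want to treat the valence column as a polarity score, the access pattern is short. A sketch, assuming lexicon_nrc_vad() returns Word, Valence, Arousal, and Dominance columns scored 0 to 1 (verify on your install):

library(dplyr)
library(textdata)

vad <- lexicon_nrc_vad()  # assumed columns: Word, Valence, Arousal, Dominance

# recenter valence so negative words fall below zero, like a polarity score
vad %>%
  transmute(word = Word, polarity = Valence - 0.5) %>%
  filter(word %in% c('good', 'bad', 'superb', 'horrid'))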
This also led to my favorite table I've ever seen in an academic article.
Whenever I need to express that something is pure bad valence, I now reach for the phrase "toxic nightmare shit."
- R Packages: textdata
- Bottom Line: Hell yeah. This is a good one.
SOCAL
It's the Semantic Orientation CALculator! Like VADER (below), SOCAL is both a sentiment dictionary and rules for modifying words' sentiment given the context in which they appear. Below, I'll only briefly consider the dictionary part. I recommend reading the publication both for more details on SOCAL itself as well as sentiment analysis generally. In a sea of sentiment articles that are slapdash publications from some "Proceedings of blah blah blah" or "Miniconference on yak yak yak" this one really stands out for its professionalism and thoroughness.9 As for whether or not you should actually use this dictionary, uh, you should keep reading.
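To give a flavor of what those rules do, here's a toy negation rule of my own, purely illustrative and not SOCAL's actual implementation (its real rules handle intensifiers, negation shifts, irrealis markers, and more):

# Toy contextual rule: flip a word's score when the previous token is a negator.
# Purely illustrative; not SOCAL's actual shift-based negation handling.
negators <- c('not', 'never', 'no')

score_tokens <- function(tokens, dict) {
  scores <- dict$y[match(tokens, dict$x)]       # look up each token's score
  negated <- c(FALSE, head(tokens, -1) %in% negators)
  scores[negated] <- -scores[negated]           # flip scores after a negator
  sum(scores, na.rm = TRUE)
}

toy_dict <- data.frame(x = c('good', 'bad'), y = c(1.87, -1.64))
score_tokens(c('not', 'good'), toy_dict)  # returns -1.87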
One fun thing to note about SOCAL's dictionary: it has more entries with spaces than any other dictionary I've seen.
library(lexicon)  # hash_sentiment_socal_google ships with this package

hash_sentiment_socal_google %>%
  group_by(n_gram = str_count(x, ' ') + 1) %>%
  count(n_gram) %>%
  ungroup()
# A tibble: 8 × 2
n_gram n
<dbl> <int>
1 1 2071
2 2 1097
3 3 97
4 4 17
5 5 3
6 6 3
7 7 1
8 8 1
This dictionary has not only a huge bigrams:unigrams ratio, but it has sextagrams, a septagram, and even an octogram! This never happens! Let's look at the n-grams where n > 4
hash_sentiment_socal_google %>%
  filter(str_count(x, ' ') > 3)
Key: <x>
x y
<char> <num>
1: darker and funnier than expected -4.092375
2: every other word is f -1.955614
3: in your face everywhere you turn -5.357512
4: lowbump bump bump bump bump bump bumpbumpbumpbump lowbump 5.622471
5: throw your peanut shells on the floor -5.186874
6: trying to get on top -2.573428
7: type stype suh huh uh huh's -4.281949
8: write an awesome story to sell -6.262021
I, uh, don't really know what to make of these. There's actually a restaurant named "Lambert's Cafe" in Ozark, Missouri where you get peanuts in tin buckets and "throw your peanut shells on the floor" and, unless I'm remembering it wrong, it's something people like about the place.
Speaking of things that are starting to be concerning, the distribution of scores:
library(ggplot2)
library(ggtext)  # element_markdown() renders the markdown in the title

ggplot(hash_sentiment_socal_google, aes(y)) +
  geom_density() +
  labs(
    y = '',
    x = 'Valence Score',
    title = '**What tale tell ye**, ye two thick tails?'
  ) +
  theme(plot.title = element_markdown(face = 'italic'))
No, that density plot isn't glitching. There really are words out there in the extremes:
hash_sentiment_socal_google %>%
  filter(abs(y) > 15)
Key: <x>
x y
<char> <num>
1: almost mafiosio styled 15.61625
2: automatically bestselling 17.60440
3: coming of age isms 17.62518
4: cushion handled 23.45929
5: hop ified -30.16008
6: keyboard crafted 30.73891
7: more than palateable 25.61696
8: oven to stovetop 19.49040
9: piano blessed 30.40257
10: rustic yet contemporary 16.31055
11: slotted spooned 18.19201
12: thick spoked 20.43281
At this point, you might be wondering … what scale is this? And what are the values for our vanilla valence-indicators "good" and "bad"?
hash_sentiment_socal_google %>%
  filter(x %in% c('good', 'bad'))
Key: <x>
x y
<char> <num>
1: bad -1.636520
2: good 1.872093
Ok, that's fine. Maybe. But here's something that probably isn't fine:
hash_sentiment_socal_google %>%
  filter(str_detect(x, 'good|bad'))
Key: <x>
x y
<char> <num>
1: average good 0.4205258
2: bad -1.6365198
3: feel good 1.2984294
4: good 1.8720931
5: good intentioned -4.1537399
6: good natured -1.5298719
7: half bad -3.9237520
Here is where I lost all faith. "Good intentioned" and "good natured" are negative?!
At this point I'm going to call it a day with SOCAL. At some point I might write to the SOCAL authors or Tyler Rinker to see if something has gone wrong.
- Publication: Again, I truly recommend it
- Size: 3290 entries
- R Packages: lexicon
Syuzhet
"The default method, 'syuzhet', is a custom sentiment dictionary developed in the Nebraska Literary Lab. The default dictionary should be better tuned to fiction as the terms were extracted from a collection of 165,000 human coded sentences taken from a small corpus of contemporary novels."
Now, it does include a lot of words I find to be neutral (e.g., "yes", "true"), which you can spot-check below.
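A quick spot-check, assuming the term and score columns are named word and value as in syuzhet::get_sentiment_dictionary() (they may be x and y if your object comes from the lexicon package):

library(dplyr)
filter(key_sentiment_jockers, word %in% c('yes', 'true'))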
We can check out the distribution of the terms in my experimental sideways histogram below:
# ggplot(key_sentiment_jockers, aes(value)) +
# geom_histogram(breaks = seq(-1, 1, by = 0.1), color = 'white') +
# theme(
# panel.grid.major.x = element_blank(),
# panel.grid.minor.x = element_blank(),
# axis.text.x = element_text(margin = margin(t = -10, b = 5)),
# plot.title = element_text(face = 'bold', size = rel(1.3))) +
# scale_y_continuous(breaks = c(500, 1000, 1500)) +
# labs(
# x = 'Jockers/Syuzhet Value',
# y = 'Terms in Dictionary',
# title = 'Distribution of Sentiment Values in Syuzhet Dictionary'
# )
distinct_values <- pull(distinct(key_sentiment_jockers, value))
ggplot(key_sentiment_jockers, aes(value)) +
  geom_bar(width = .07, alpha = 0.8,
           fill = c(viridis::magma(8, direction = -1), viridis::mako(8))) +
  geom_text(stat = 'count', aes(label = after_stat(count)),
            hjust = -0.1,
            position = position_stack(),
            family = 'IBM Plex Sans',
            fontface = 'bold') +  # geom_text() takes fontface, not face
  scale_x_continuous(breaks = distinct_values) +
  theme(
    panel.grid.major.y = element_blank(),
    panel.grid.minor.y = element_blank(),
    plot.title = element_text(face = 'bold'),
    axis.text.y = element_text(margin = margin(r = -20))) +
  coord_flip() +
  labs(x = 'Valence Value in Syuzhet',
       y = 'Counts',
       title = 'Distribution of Sentiment Values in Syuzhet Dictionary') +
  geom_vline(xintercept = 0)
- R Packages: syuzhet, lexicon
- Size: 10,748 words
- Bottom Line: Pending.
VADER
Is VADER evil? Maybe. But it also stands for Valence Aware Dictionary and sEntiment Reasoner.10 Impressively, they found "that VADER outperforms individual human raters" when classifying tweets as positive, negative, or neutral.11 Part (most?) of that impressiveness is due to the "ER" of VADER. Nevertheless, here I'm only considering the VAD part. If you want to check out its rule-based sentiment reasoning, check out its github page or publication.
The lexicon has an impressive 7,500 entries, each with its associated polarity and intensity (-4 to 4). Did they get those intensities just winging it like Finn? Nope. Each potential entry was placed on the -4 to 4 scale by 10 Amazon Mechanical Turk workers.12 The score you do see in the dictionary means that a) raters' scores had a standard deviation of less than 2.513 and b) that the mean rating among the 10 raters was not 0.14
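Here's that screening rule as a minimal sketch, applied to a hypothetical vector of ten ratings (the 8-zeros-2-ones pattern borrows the calmodulin example from the footnotes):

# Hypothetical ratings from 10 Turkers for one candidate entry
ratings <- c(0, 0, 0, 0, 0, 0, 0, 0, 1, 1)  # cf. 'calmodulin' in the footnotes

# keep the entry if raters roughly agreed AND the mean isn't exactly zero
keep <- sd(ratings) < 2.5 && mean(ratings) != 0
keep           # TRUE
mean(ratings)  # 0.2, the score that would appear in the lexicon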
One interesting feature of the lexicon is its inclusion of emoticons.
library(readr)  # read_delim()

vader <- read_delim(
  'https://raw.githubusercontent.com/cjhutto/vaderSentiment/master/vaderSentiment/vader_lexicon.txt',
  delim = '\t', col_names = FALSE) %>%
  magrittr::set_names(c('token', 'score', 'sd', 'scores'))
vader %>%
  filter(str_detect(token, '[A-Za-z1-9]', negate = TRUE)) %>%
  group_by(bin = score %/% 1) %>%
  mutate(y = row_number()) %>%
  ungroup() %>%
  ggplot(aes(bin, y, label = token)) +
  geom_text(check_overlap = TRUE, fontface = 'bold', family = 'IBM Plex Sans') +
  scale_y_continuous(NULL, breaks = NULL) +
  labs(
    x = '',
    caption = 'To reduce the real-valued chaos, I rounded down emoticons\' scores to the nearest integer'
  ) +
  theme(
    panel.grid.major.x = element_blank(),
    axis.text = element_text(face = 'bold'),
    panel.grid.minor.x = element_line(linetype = 2, color = 'grey35')
  ) +
  guides(color = 'none')
I don't want to make this dictionary seem trivial. Its creators validated it using sentences from New York Times editorials as well. It's just not every day that you can make a histogram of emoticons.
- R Packages: As far as I know, it's not in any. You can get it directly from its github repository.
- Side Benefit: This is probably the one that your "pythonista" friends are familiar with, since it's in the nltk library.
Coverage Comparison for AFINN, BING, and NRC
afinn <- get_sentiments(lexicon = 'afinn')
bing <- get_sentiments(lexicon = 'bing')
nrc_emotions <- filter(get_sentiments(lexicon = 'nrc'),
                       !(sentiment %in% c('positive', 'negative')))
nrc_polar <- filter(get_sentiments(lexicon = 'nrc'),
                    sentiment %in% c('positive', 'negative'))
library(ggupset)  # scale_x_upset() comes from here

bind_rows(
  select(mutate(afinn, lexicon = 'AFINN'), word, lexicon),
  select(mutate(bing, lexicon = 'BING'), word, lexicon),
  select(mutate(nrc_polar, lexicon = 'NRC'), word, lexicon)) %>%
  summarise(Lexica = list(lexicon), .by = word) %>%
  ggplot(aes(Lexica)) +
  geom_bar() +
  scale_x_upset(n_intersections = 7) +
  theme_minimal() +
  theme(
    panel.grid.major.x = element_blank(),
    panel.grid.minor.y = element_blank(),
    title = element_text(family = "IBM Plex Sans"),
    plot.title = element_text(face = 'bold'),
    plot.subtitle = element_markdown(),
    axis.text = element_text(family = "IBM Plex Sans")
  ) +
  labs(
    y = 'Set Size',
    x = 'Set',
    title = 'Are Bigger Dictionaries Mostly Supersets of Smaller Dictionaries?',
    subtitle = "*No*, and that's weird"
  )
We can see how many entries each dictionary has:
c(nrow(afinn), nrow(bing), nrow(nrc_polar))
[1] 2477 6786 5626
Here we see a surprising amount of non-overlap. Of Bing's 6786 terms, almost 4,000 do not appear in either of the other two dictionaries. Almost 3,000 of NRC's polarity entries aren't in either of the other two, as well. The third bar indicates that BING and NRC share just over 1,500 words. The long and short of this is that it might be important which dictionary we choose. They have very different coverages.15
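If you'd rather read those overlaps off a table than the upset plot, here's a quick pairwise count sketch using the objects defined above:

# pairwise overlap counts between the three polarity dictionaries
overlap <- function(a, b) length(intersect(a$word, b$word))

c(afinn_bing = overlap(afinn, bing),
  afinn_nrc  = overlap(afinn, nrc_polar),
  bing_nrc   = overlap(bing, nrc_polar))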
Still to do
This page, like the rest of my life, is a work in progress.
- I still have a few dictionaries to add. In Tyler Rinker's lexicon package there are: jockers (and jockers_rinker), emojis (and emojis_sentiment), senticnet, sentiword, slangsd. I've already covered all dictionaries in tidytext and textdata.
- Further comparison of dictionary overlap. 2a. Right now I look at three dictionaries' overlap. These three aren't the best, they're just the first ones I checked out. That's not a principled criterion for selecting dictionaries. I'll redo that section programmatically. 2b. I'm going to lemmatize/stem the dictionaries covered and get their size and overlaps as I did with AFINN, BING, and NRC above. It's possible that sizes are much more similar once you remove a bunch of morphological tangle.
- Inspired by this sentence from Nielsen, "The word list have a bias towards negative words (1598, corresponding to 65%) compared to positive words (878)" I'm also going to see what the respective positive/negative balances are of these dictionaries.
Recommended Usage and Tendentious Postscript
Something a little wild to keep in mind. The way these dictionaries arrived at their codings of words ranges from a lower bound of sensible to downright impressive. That's obviously good. Keep in mind, however, that you're not categorizing words as good or bad. You're using words as features to calculate sentiment at some level of aggregation, and there's a chasmic categorical difference between that task and the (sub-)task of classifying words. Even if the word ratings had been arrived at perfectly, using "the dictionary approach" to measure aggregate sentiment is another task entirely, one that requires separate validation. It would entail having people rate the sentiment of sentences/utterances/paragraphs, taking those as ground truth, and then seeing how well these dictionaries capture that (see the sketch below). As far as I know, only Mr. Finn Årup Nielsen did that for his AFINN dictionary (and to so-so results even within the domain of social media; external validity would be another matter). So, always keep in mind what you're doing. You're throwing a bunch of text into a blender that gives you a read-out of some of that text's ingredients. You're not getting a direct reading of the text's emotional valence.
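Here's a minimal sketch of that validation loop, with two made-up documents and made-up human ratings standing in for a properly labeled corpus:

library(dplyr)
library(tidytext)

# hypothetical ground truth: humans rated each document's overall sentiment
docs <- tibble(
  id = 1:2,
  text = c('what a wonderful, superb day', 'what a dull, bad movie'),
  human_rating = c(4, -3))

scores <- docs %>%
  unnest_tokens(word, text) %>%
  inner_join(get_sentiments('afinn'), by = 'word') %>%
  summarise(dict_score = sum(value), .by = c(id, human_rating))

# across a real corpus you'd then check agreement, e.g.
# cor(scores$dict_score, scores$human_rating)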
Footnotes
For most of them the answer is probably not.↩︎
And I try to keep my distinct kettles of fish in distinct posts (here)↩︎
I think it's also the explanation for AFINN's name. It's a portmanteau of the "A" from "ANEW" and "FINN", this guy's Christian name.↩︎
"some kind" is the only entry in the dictionary with a 0 value.↩︎
I'm not sure why this sentiment lexicon is occasionally referred to as "Bing" when Bing Liu is just one of the two creators, but so it is. Bing Liu has written a textbook on sentiment analysis and his website is a thing of beauty, so I'd say he deserves all good things.↩︎
Yes, only adjectives. Other parts of speech have since been added.↩︎
In fact, it's being actively maintained, which is nice.↩︎
Before researching for this post, I hadn't heard of Plutchik or his theory of emotions. The idea of basic emotions is definitely a thing, however (see Ekman's "An argument for basic emotions"). To see if people are using this guy's ideas, I looked over a few syllabi for psychology of emotion courses and through Michelle Shiota and James Kalat's Emotion textbook. Nothing in the syllabi. He was mentioned in the textbook, but just as a proponent of basic emotion. Using their sole citation to a work of Plutchik's, I went to Google Scholar to see if people are citing this thing. The idea is that, if people are citing it, then it's an active theory; if not, it's scientifically dead. What I found was that it's not being cited that much, but when it is, it's by people doing sentiment analysis. It reminds me a little of the Freud-psychology situation: if you take Freud seriously, you're not taking psychology seriously. Maybe with Plutchik, if you take Plutchik seriously, you're not taking emotion research seriously?↩︎
Yes, I realize I'm valorizing professionalism after just writing "Miniconference on yak yak yak." I may not contain multitudes, but there are at least two wolves inside me (as the saying goes).↩︎
This dictionary is definitely winning the marketing competition. I asked ChatGPT IV to make sentiment dictionary names corresponding to the initialisms SKYWALKER and LEIA and got "Lexicon of Emotion Intensity and Analysis" for the latter. SKYWALKER was rough.↩︎
You should be wondering how it's possible for this metric to be better than human raters at classifying tweets' sentiment if the ground truth of what a tweet's sentiment is comes from humans. Just to be impish, I'm actually not going to answer that question; I'll just say that the word individual in "individual human raters" is very important.↩︎
Not necessarily the same 10 workers for every word!↩︎
If there was no consensus (operationalized as a standard deviation > 2.5), the candidate word didn't make it into the dictionary.↩︎
I wonder if they meant to say that it didn't round to zero. If you had 9 people rate a word at 0 and a single person rate it at 1, the word wouldn't have a mean rating of zero, but … c'mon. Note: After writing that as an obviously ridiculous footnote, I was looking through the dictionary and noticed calmodulin, a word I had never seen. It received 8 zeros and 2 ones.↩︎
This situation should seem strange. I'd argue that your priors should have been closer to, "Bigger dictionaries will be more like supersets of smaller ones than mostly non-intersecting bigger sets." Think about it this way. If you were creating a sentiment dictionary, you'd try to gather all the most important emotion words and score their polarity. If you found out someone had created a bigger sentiment dictionary, you'd assume that either they included more generically obscure words or maybe words unique to a given time or application. In either case, you'd expect their dictionary to contain at least most of your words.↩︎