Welcome
For reasons lost to oblivion, I decided to analyze transcripts of Ezra Klein's podcasts using elementary text analysis. The following was the result.
Interviewing Style
Before even looking at the content of the episodes (which words are said, which sentiments are expressed, which topics are covered), we can look at the sheer volume of speaking.
We can also compare this distribution of how much Ezra speaks as interviewer to how much other interviewers speak. After all, it's hard to really feel what the numbers in the graph above mean without comparing them to the relative talkativeness of other interviewers. Below is a comparison of Ezra with two other interviewers, both of whom have had Ezra on as a guest on their programs: Rob Wiblin (80,000 Hours) and Tyler Cowen (Conversations with Tyler).
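For what it's worth, the share-of-speaking numbers are just word counts per speaker. Here is a minimal sketch of how they could be computed, assuming a hypothetical `transcripts` data frame with one row per speaking turn and `show`, `episode`, `is_host`, and `text` columns (those names are stand-ins, not the ones from the actual analysis):

```r
library(dplyr)
library(stringr)

# Hypothetical input: one row per speaking turn, with columns
# show, episode, is_host (TRUE for the interviewer), and text.
host_share <- transcripts %>%
  mutate(n_words = str_count(text, "\\S+")) %>%   # count whitespace-separated words per turn
  group_by(show, episode) %>%
  summarise(host_share = sum(n_words[is_host]) / sum(n_words),
            .groups = "drop")
```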
This is a little more interesting, I think.
My hypothesis before crunching these data was that Tyler would speak the least (and it wouldn't be close), then Rob, then Ezra. And that turned out to be right.1 After all, Ezra speaks quite a bit because people tune into Ezra's show in order to listen to Ezra. Though Tyler's show does have "Tyler" in the name, he frequently announces that his interviews are the interviews that he wants, not what the listeners might want. And his listener-be-damned approach often results in quick questions and abrupt topic changes. The 80,000 Hours podcast occupies an intermediate position: Rob often puts himself where a listener might be, restating what the speaker said, giving the conventional wisdom against which the interviewee can react, and asking follow-up questions.
Ezra's Words
Alright, so what does this ingenious interviewer from Irvine actually talk about? Among the ways of answering that question, one is to just look at the words he uses. The simplest way of operationalizing "words used" is to look at "unigrams," which result from splitting the unstructured text at every whitespace2, turning each whitespace-separated unit into a row in a dataframe, and counting how many rows each token (word) appears in. Ezra's result is below.3
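In case it's useful, here is roughly what that looks like with tidytext (the toolkit the rest of these analyses lean on); `transcripts` and its `speaker`/`text` columns are hypothetical stand-ins for whatever the real data frame is called:

```r
library(dplyr)
library(tidytext)

ezra_unigrams <- transcripts %>%
  filter(speaker == "Ezra Klein") %>%
  unnest_tokens(word, text) %>%            # one lowercase token per row
  anti_join(stop_words, by = "word") %>%   # drop "the", "and", "of", ...
  count(word, sort = TRUE)
```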
But you can also get slightly more sophisticated and look at contiguous words. So, instead of counting single whitespace-separated words, you count every contiguous run of two or three words. The nice thing about that is you start to see phrases, turns of speech. For example, if you were curious why "people," "lot," and "bit" appear in the unigrams, the bottom-most panel will give you your answer.
Among the bigrams and trigrams, an Ezra Klein Show fan will also be pleased to see Ezra's pat phrases for parts of his podcast: "final question," "books you'd recommend," and "email [is] ezrakleinshow@nytimes.com." I'm not sure if seeing these phrases is interesting, but they are nice proofs of concept for the method. You also see that the bigrams are dominated by pairs of words that, yes, do have a space between them, but are lexically one thing. I'd argue that "past couple" and "minute ago" are the only instances of more-or-less independent words coming together.
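The n-grams come out of the same tokenizer, just with a different `token` argument (again sketched against the hypothetical `transcripts` frame, with the packages loaded above):

```r
ezra_bigrams <- transcripts %>%
  filter(speaker == "Ezra Klein") %>%
  unnest_tokens(bigram, text, token = "ngrams", n = 2) %>%   # n = 3 for trigrams
  count(bigram, sort = TRUE)
```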
Now let's try something potentially more interesting. We can also look at how he uses words differentially. Why would that be useful? Well, just looking at the words he uses might only tell us that he's a guy with a show that covers topics X, Y, and Z, and so uses the words associated with those topics.4 One way we can "control for" topic is to contrast Ezra's words with those of his guests. Abstractly, the idea is that if persons A and B are talking about subject S, the words that differentiate A and B are not going to be those that generically describe subject S. Rather, they will be the words idiosyncratically associated with each interlocutor.
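One way to operationalize that contrast, assuming the same hypothetical `transcripts` frame, is to compute each word's share of Ezra's total words and of the guests' total words and plot the two shares against each other (roughly the frequency-comparison plot from Tidy Text Mining):

```r
library(tidyr)
library(ggplot2)

word_shares <- transcripts %>%
  mutate(role = if_else(speaker == "Ezra Klein", "ezra", "guest")) %>%
  unnest_tokens(word, text) %>%
  count(role, word) %>%
  group_by(role) %>%
  mutate(share = n / sum(n)) %>%   # each word's share of that role's total words
  ungroup() %>%
  select(-n) %>%
  pivot_wider(names_from = role, values_from = share, values_fill = 0) %>%
  filter(ezra > 0, guest > 0)      # keep words both sides use (log scales below)

# Words on the dashed diagonal are used in equal proportions by host and guests.
ggplot(word_shares, aes(guest, ezra)) +
  geom_abline(lty = 2) +
  geom_text(aes(label = word), check_overlap = TRUE) +
  scale_x_log10() +
  scale_y_log10()
```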
This surprised me. Sure, one would expect most words to fall along the diagonal line, which indicates that a word is used in equal proportions by Ezra and his guests, but I really expected this to show more unique usages in the off-diagonal space.
We can also compare the words of The Ezra Klein Show to the words used in introductory political science textbooks. The idea of such an analysis would be to ask, "When Ezra talks politics, which topics (proxied by words) does he dedicate more space to than an introduction to politics does? Which topics does he spend less time on?" From these results one could hope to learn something about which aspects of politics Ezra finds interesting and important, and which less so. I carried out the analysis, and, unfortunately, the result is below:
I call this "analysis" "unfortunate" because it doesn't generate any insight. I mean, what are you really supposed to get from this? One takeaway for the practitioner of text analysis is that you might have to do a decent amount of preprocessing before you can make impactful deliverables. Two useful ways to preprocess raw text here are lemmatizing (or stemming) and using a part-of-speech tagger to extract only nouns. What both of those steps share is the winnowing out (or consolidating) of noise and chaff. In a subsequent post I'll try these feature-space-shrinking techniques out.
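For concreteness, here are sketches of both preprocessing steps, with SnowballC standing in for the stemmer and udpipe for the part-of-speech tagger (my illustrative tool choices, not necessarily what that later post will use); `tokens` and `raw_text` are hypothetical objects:

```r
library(dplyr)
library(SnowballC)
library(udpipe)

# Stemming: collapse "politics" / "political" / "politically" onto one stem.
stemmed_counts <- tokens %>%
  mutate(stem = wordStem(word, language = "english")) %>%
  count(stem, sort = TRUE)

# Part-of-speech tagging: keep only nouns before counting.
ud_model  <- udpipe_load_model(udpipe_download_model(language = "english")$file_model)
annotated <- as.data.frame(udpipe_annotate(ud_model, x = raw_text))
noun_counts <- annotated %>%
  filter(upos == "NOUN") %>%
  count(lemma, sort = TRUE)
```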
Sentiment Analysis
I think most listeners of The Ezra Klein Show have noticed that Ezra is unique among interviewers in his emotionality. Not emotionality in a histrionic sense, but more like "general emotional urgency."5 So, in theory it would be interesting to use text analysis to get at this aspect of The Ezra Klein Show that sets it apart from other shows and uniquely characterizes Ezra as an interviewer. My hunch, however, is that the basic methods of sentiment analysis are not up to the task. In a forthcoming post I'll dive deeper into why (and what methods would yield more insightful results), but for now let's very briefly look at the results of these dictionary-based methods.
As Tolstoy famously noted, if there are bad ways of doing something, they'll be bad in different ways. Let's put that to the test with sentiment dictionaries. There are various "sentiment lexicons" that will limpet onto your text and give you the raw materials for calculating sentiment scores at whatever level of aggregation you choose. Here I've chosen three supposedly general-purpose dictionaries and calculated the sentiment score for each episode of TEKS. As you can see in the table below, their episode-level correlations are very high.
|       | BING | AFINN | NRC |
|-------|------|-------|-----|
| BING  | 1    |       |     |
| AFINN | 0.91 | 1     |     |
| NRC   | 0.83 | 0.85  | 1   |
Nice. Convergent validity. And now I don't have to read Anna Karenina.
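For the curious, the episode-level scores are just a join against each lexicon and a sum. A sketch with tidytext's `get_sentiments()`, assuming a hypothetical `tokens` frame with one row per episode/word:

```r
library(dplyr)
library(tidyr)
library(tidytext)

# AFINN scores each word from -5 to +5; sum the values per episode.
afinn_scores <- tokens %>%
  inner_join(get_sentiments("afinn"), by = "word") %>%
  group_by(episode) %>%
  summarise(afinn = sum(value), .groups = "drop")

# Bing (and NRC, once filtered to its positive/negative entries) is binary,
# so the episode score is positives minus negatives.
bing_scores <- tokens %>%
  inner_join(get_sentiments("bing"), by = "word") %>%
  count(episode, sentiment) %>%
  pivot_wider(names_from = sentiment, values_from = n, values_fill = 0) %>%
  mutate(bing = positive - negative)

# Episode-level correlation between the two dictionaries.
inner_join(afinn_scores, bing_scores, by = "episode") %>%
  summarise(r = cor(afinn, bing))
```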
If we want to be a little more aesthetic, we can plot the sentiment scores over time.
Sure enough, it looks like the three dictionaries' scores (unsurprisingly) correlate strongly at the week level as well. We also see that, by and large, the overall sentiment of episodes is positive.
I've also highlighted the episode each dictionary picks out as the most extreme, both positive and negative. While they disagreed on the most positive (hence the three highlighted positive episodes), there was three-way agreement on the most negative: his interview with Rachel Zoffness, author of The Pain Management Workbook. An episode about chronic pain is, of course, full of words like "pain" that these lexicons score as negative, so it's hardly surprising that the out-of-the-box algorithms would rank it as the most negative. It is disappointing, though. I went back and listened to the episode after finding out it was unanimously declared the most negative and … it isn't actually that negative. The post-October 7th episodes on the situation in Israel and Gaza are much, much more negative. Lesson: convergent validity does not imply construct validity.
Right, so the simple valence continuum (positive/negative) was mostly a flop. How about using a sentiment dictionary that maps words to eight different emotions and seeing which emotions predominate in The Ezra Klein Show? Eight is four times bigger than two, so that sounds promising. Below is what that looks like over time.
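A sketch of how those per-emotion counts could be tallied, again with the NRC lexicon via tidytext, the packages loaded in the earlier snippets, and the same hypothetical `tokens` frame (plus a hypothetical `date` column):

```r
# Keep NRC's eight emotions, dropping its plain positive/negative entries.
emotion_counts <- tokens %>%
  inner_join(get_sentiments("nrc"), by = "word",
             relationship = "many-to-many") %>%   # a word can map to several emotions
  filter(!sentiment %in% c("positive", "negative")) %>%
  count(episode, date, sentiment)

ggplot(emotion_counts, aes(date, n, colour = sentiment)) +
  geom_line()
```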
Ok, this could be interesting. We see that trust is consistently the greatest contributor to the show's positivity and fear to its negativity. Ideally we'd compare these numbers to those of another show. But one way we can kick the tires on this analysis is by seeing what words drive trust's and fear's respective high rankings.
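Kicking those tires amounts to one more join and count: which words account for most of the trust and fear tallies (same hypothetical `tokens` frame as above):

```r
trust_fear_drivers <- tokens %>%
  inner_join(get_sentiments("nrc"), by = "word",
             relationship = "many-to-many") %>%
  filter(sentiment %in% c("trust", "fear")) %>%
  count(sentiment, word, name = "uses", sort = TRUE) %>%
  group_by(sentiment) %>%
  slice_max(uses, n = 15) %>%   # top 15 contributors to each emotion
  ungroup()
```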
I'm just … not sure about this. You can see why dictionary-based scoring isn't a particularly valid way of inferring emotion from word usage. Just looking at the trust side of things, "kind" appears in collocations like "kind of bad" (which gets the valence wrong) and "kind of [entity]," which is not usually valenced at all. It's certainly not always (probably not even usually) an indicator of trust. You can go down the list and find similar (and other!) problems with every other word. I'm honestly not sure how people take this seriously.
Concluding Thoughts
I'm not sure I would have predicted this beforehand, but by far the most interesting plots here are the two looking at simple volume of speaking. Why is this? I think it has to be because it's the only case where the method (counting, in this case) adequately represents the thing it purports to measure. I suppose word counts do that as well, but there's no spice when there's no implicit contrast (as is the case with simple word counts), and when there is a contrast (in those graphs that plot proportions against each other), nothing pops out. Ex ante, it might have been unknowable that that was going to happen.
Acknowledgements and Arigatos
Andrew Heiss's data visualization course inspired (and enabled) me to have nice(ish) looking text on my graphs. If you know some `ggplot2` and are looking to become high-intermediate in your `ggplot2` skills, I highly recommend his course.
The vast majority of the analyses you see here I learned back in the day from Julia Silge and Dave Robinson's Tidy Text Mining with R book. I recommend it to anyone as a jumping-off point for doing text analysis. Most of the analyses here don't go (too far) beyond what you can do with the tools you pick up in that book.
Footnotes
1. I sadly didn't preregister these hypotheses, so you have no reason to believe me.
2. You can also split on other delimiters. Various non-alphanumeric characters would do.
3. For the following two graphs, I recommend right-clicking them and opening them in a new window. The proportions are terrible on this page. My sincerest aesthetic apologies.
4. If we wanted a topic analysis we'd head down to the last section of this page!
5. I'm not sure I like that term (and I don't think it conveys anything to someone unfamiliar with the podcast), but I'll leave it there as a placeholder for now.