Through Their Words: Hillary, Bernie, and their Republican Friends — Text analysis of the first Democratic Debate
October 14, 2015
by James W. Pennebaker and Kayla N. Jordan
Department of Psychology, The University of Texas at Austin
Computer analyses of the language of the candidates in the most recent debate reveal some basic personality and psychological differences. The primary comparisons are between Hillary Clinton and Bernie Sanders as well as the ways they differed from the most likely Republican nominees: Bush, Fiorina, Rubio, and Trump. The main findings include:
- Clinton is much more positive in emotional tone than Sanders. Her levels of positivity are comparable to the most positive Republican, Jeb Bush.
- Sanders is somewhat more of an analytical, formal thinker than Clinton. All the candidates, however, are reasonably similar with the exception of Donald Trump — who appears incapable of thinking in a formal, logical way.
- Sanders and Clinton use language that comes across as authentic and honest in ways similar to Trump. Fiorina, Bush, and Rubio are significantly lower than the others in authenticity.
- Sanders’ language suggests greater clout and more power-awareness than Clinton’s language.
- The male candidates across parties tend to use male-centric language at high rates, with Sanders and Rubio being the most male-centric. For the female candidates, Clinton is balanced in her gender references while Fiorina uses slightly female-centric language.
This is the first of several brief blog posts about the 2016 election. Our plan is to use some sophisticated computer-based text analysis methods to get a better sense of the social and psychological dimensions of the candidates. No platforms or positions discussed here. Just people’s basic thinking, emotional, and interpersonal styles.
The basic system we rely on will be a computer program developed in our lab, Linguistic Inquiry and Word Count, or LIWC (pronounced “Luke”, and available for research purposes at www.liwc.net, or for commercial purposes through www.Receptiviti.com). LIWC analyzes any kind of text and calculates the percentage of words that fall into emotional, cognitive, and another 80 or so dimensions. There are now hundreds of studies in political science, business, psychology, and other disciplines that have used it. For a brief summary of articles, check this link or this one. The most recent version of the program, LIWC2015, has just been released and has a number of dimensions particularly well suited to political campaigns.
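For readers curious about what "percentage of words in a category" means in practice, here is a minimal Python sketch of the general idea. It is not LIWC itself: the tiny word lists are hypothetical stand-ins built from example words mentioned elsewhere in this post, the names `CATEGORIES`, `tokenize`, and `category_percentages` are ours, and the real dictionaries are far larger and handle word stems and other details this sketch ignores.

```python
import re
from collections import Counter

# Hypothetical mini-dictionaries for illustration only. The example words
# come from this post; the actual LIWC dictionaries are not reproduced here.
CATEGORIES = {
    "positive": {"happy", "success", "good"},
    "negative": {"anger", "death", "hurt"},
    "cognitive": {"think", "believe", "realize", "know"},
}

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def category_percentages(text, categories=CATEGORIES):
    """Return, for each category, the percentage of all words that belong to it."""
    words = tokenize(text)
    total = len(words)
    if total == 0:
        # No words at all: report zero for every category.
        return {name: 0.0 for name in categories}
    counts = Counter()
    for word in words:
        for name, vocab in categories.items():
            if word in vocab:
                counts[name] += 1
    return {name: 100.0 * counts[name] / total for name in categories}

sample = "I think we know what success looks like, and it is good."
print(category_percentages(sample))
```

The key design choice is the denominator: each score is a share of all words spoken, so a long-winded speaker and a terse one can be compared on the same scale.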
OK, let’s get serious. There are only two viable candidates for president in the Democratic race right now — Clinton and Sanders. O’Malley made a credible showing but until his ratings reach double digits, we’ll stick with the front runners. Biden may join the team eventually but let’s put him aside since he was a no-show for the October 13th debate.
Although LIWC can analyze a mind-boggling number of dimensions, we will focus on only a handful: emotional tone, thinking style, authenticity, power and clout, and sexism (relative referencing of males to females).
Historically, the American electorate has preferred more upbeat and optimistic candidates to more negative or hostile candidates. The simplest way to measure emotional tone is to count the words that have a positive connotation (such as happy, success, good) or a negative one (anger, death, hurt). Across the board, Clinton was far more optimistic and upbeat than Sanders.
It’s also interesting to compare the emotional states of Clinton and Sanders to those of the most likely candidates on the Republican side: Bush, Rubio, Fiorina, and Trump (sorry to those of you who are rooting for Cruz or Ben Carson or one of the others; it’s not gonna happen). As you can see in the graph, Clinton’s emotional state is similar to that of Jeb Bush, Rubio, and Trump. Sanders and Fiorina are both impressively low.
Note that the Emotional tone index is a weighted score that ranges between 0 (very negative) and 100 (very positive).
The words people use in everyday language can reveal their natural thinking style. There are at least two ways to capture thinking styles that are related to the debates. The first measures people’s natural ways of trying to understand, analyze, and organize complex events. We have devised a metric called the categorical-dynamic index, or the CDI. A high score on the CDI is associated with analytical, formal, and logical thinking. A low score is often associated with more narrative thinking where the speaker is in the here-and-now. The CDI has been found to be related to college grades, measures of intelligence, and various markers of academic success.
The second measure, which we call Cognitive Processing, reflects the extent that people are trying to work out a problem in their minds. For example, if you were asked to describe the most efficient way to get from your home to the City Hall in a particular town about 100 miles away, you would likely use words such as think, believe, realize, know, and other cognitive processing words. However, if you knew precisely how to get there because you had driven that route dozens of times, you would not use cognitive processing words. In other words, if you are still trying to figure out a problem, you use cognitive processing terms; if you are certain that you know “the answer”, you don’t use these words.
In terms of raw intelligence and the ability to think analytically, Sanders scores slightly higher than Clinton. In fact, most of the likely Republican nominees are in the same range. The one who scores far, far below the rest is Donald Trump. Trump shoots from the hip and is guided by his intuition. He is not a logical or analytical thinker.
Of all the candidates, Hillary Clinton scores highest on cognitive processing — meaning she is someone who continues to work through issues as they come up. Bernie Sanders, however, is the lowest of all the candidates. This suggests that he has already thought through his positions and is most entrenched in his beliefs in that he is no longer thinking about them as much as the others.
Over the years, multiple labs have developed algorithms that tap the degree to which people are personal, honest, and authentic. The authenticity measure is made up of words such as I-words (I, me, my), present tense verbs, and other dimensions previously associated with telling the truth. While the Democrats are quite similar to each other in terms of authenticity, they are strikingly higher than the Republican candidates, with the exception of Trump.
Clout and Power-Awareness
There are at least two ways to think about power, status, and clout. The first, which we call clout, is the kind of power that is seen in a strong leader. A person with clout speaks with confidence and a sense of certainty. People who have clout tend to use we- and social words more and I-words, negations, and swear words less. Overall, Sanders uses language that suggests more clout than Clinton. In the last Republican debate, however, Fiorina was far higher than Sanders, and Trump was somewhat lower than Clinton.
The second measure of status is power awareness. That is, to what degree are people aware of people with more or less power than they have? When you walk into a room, to what degree do you notice the relative status of others? The power-awareness measure captures the degree to which people use words such as command, boss, and defeat. As depicted in the table, Sanders is much more power-aware than Clinton and is at the same level as the most power-aware Republican, Marco Rubio. It’s interesting that Trump is by far the least power-aware of all the candidates. In his mind, he already has the most power and there is no reason for him to have to size up other people along this dimension.
Male-Centric Language (A marker of sexism, perhaps?)
The ways people use words tell us where they are paying attention. If a speaker uses a high rate of words such as women, females, she and her, and relatively few references to males, the speaker is simply thinking and talking about women more. Is this sexist language? Maybe, maybe not. Does this speaker always make more references to women than men? If so, the person certainly isn’t paying much attention to people of the male persuasion.
By analyzing references to females and males, we can get a sense of candidates’ natural orientations to women and men. We aren’t warranting that this is a measure of overt sexism but it may be a subtle or implicit signal of gender bias.
Check out the graph. The numbers refer to the percentage of all gender references that are male. Numbers above 50% suggest a male bias; numbers below 50% hint at a female bias. Overall, Hillary Clinton and Carly Fiorina use gendered language very differently than the male candidates. Clinton referred to males and females at similar levels; Fiorina made more references to women than men — partly based on the questions she was asked in the last debate. The men, however, were far more gendered in their language than the women. It wasn’t even close. The two who were most male-centric were Sanders and Rubio.
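The male-centric score described above is simple arithmetic: male references divided by all gender references, times 100. Here is a rough Python sketch under stated assumptions. The word lists `MALE_WORDS` and `FEMALE_WORDS` and the function `male_reference_share` are hypothetical names with small illustrative vocabularies, not the dictionaries used for the graph.

```python
import re

# Hypothetical word lists for illustration only; the dictionaries behind
# the graph are more complete than these.
MALE_WORDS = {"he", "him", "his", "man", "men", "male", "males"}
FEMALE_WORDS = {"she", "her", "hers", "woman", "women", "female", "females"}

def male_reference_share(text):
    """Return the percentage of all gender references that are male.

    Returns None when the text contains no gender references at all,
    since the share is undefined in that case.
    """
    words = re.findall(r"[a-z']+", text.lower())
    male = sum(1 for w in words if w in MALE_WORDS)
    female = sum(1 for w in words if w in FEMALE_WORDS)
    total = male + female
    if total == 0:
        return None
    return 100.0 * male / total

print(male_reference_share("He told his staff that she and her team would lead."))
```

A value above 50 would correspond to the male-leaning side of the graph and a value below 50 to the female-leaning side. Note that the score ignores how often a speaker refers to gender at all; it only captures the balance between the two.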
In the Future