On Tuesday, August 16, Rick Perry was surprised to learn that other politicians did not approve of his statement claiming that it would be “treasonous” if the Federal Reserve Chair printed more money. Oddly, throwing around the word treason is not something that most presidential candidates have historically done. According to the Austin American Statesman:

Perry didn’t back down. “Look, I’m just passionate about the issue,” he said in Dubuque, Iowa. “And we stand by what we said.”

Did you catch that? “And we stand by what we said.” Hmmmm. We? Who exactly is we? This is a classic way that people psychologically distance themselves from what they are saying.

Watch the pronouns.

James W. Pennebaker
Author of the forthcoming book, The Secret Life of Pronouns (NY: Bloomsbury)

By James W. Pennebaker and Raj Persaud

With the final debate on Thursday night, people have heard each of the three candidates spew over 17,000 words – that’s more than the average human says in a full day.  Using computerized text analysis methods, we now have a fairly good picture of how each of the candidates uses language within the debate setting.

Recall that the words people use in everyday speech reflect who they are. Of particular relevance is a group of words called function or junk words.  These almost-invisible words include pronouns (such as you, me, they), articles (a, an, the), prepositions (for, with, to), etc.  The ways people use these words can tell us about speakers’ emotional states, formality, honesty, thinking styles, and other dimensions of how they approach their worlds.

Gordon Brown, David Cameron, and Nick Clegg all have distinctive speaking styles that highlight different parts of their personality.  Some of these differences are obvious; others are not.

Optimism.  People who are upbeat and optimistic tend to use present and future tense verbs, first person plural pronouns (we, our, us), simple words, and words that denote positive feelings and, at the same time, tend to avoid words that express negative feelings.  Although the first debate found Clegg and Cameron to be quite high in optimism, Clegg’s upbeat language has moderated whereas Cameron’s has increased ever since.  For last night’s debate, David Cameron was by far the most upbeat and optimistic followed by Clegg with Brown far behind.

Want an upbeat, optimistic PM?  Vote David Cameron.

Language Markers of Optimism

Marker                 Brown  Cameron  Clegg  Direction
Present tense verbs    11.53    12.55  11.62  High
Future tense            1.26     1.63   1.42  High
We-words                2.98     4.28   2.70  High
Positive emotions       3.08     2.75   3.15  High
Negative emotions       2.20     1.85   1.47  Low
Big words              18.90    16.26  16.69  Low

Note that numbers refer to percentage of total words used by each of the candidates.  So 11.53 percent of all of Brown’s words were present tense verbs.  The Direction column indicates what numbers are associated with high optimism.  That is, optimism is associated with high use of present tense verbs and low rates of negative emotion words.
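For readers who want to see what "percentage of total words" means in practice, here is a minimal Python sketch of that kind of count. The word lists are tiny, hypothetical stand-ins chosen for illustration; they are not the LIWC dictionaries, and the function names are ours.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny word lists for illustration only.
# They are NOT the LIWC dictionaries.
CATEGORIES = {
    "we_words": {"we", "us", "our", "ours", "ourselves"},
    "i_words": {"i", "me", "my", "mine", "myself"},
    "articles": {"a", "an", "the"},
}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def category_percentages(text, categories=CATEGORIES):
    """Return each category's share of all words, as a percentage."""
    words = tokenize(text)
    total = len(words)
    if total == 0:
        return {name: 0.0 for name in categories}
    counts = Counter(words)
    return {
        name: 100.0 * sum(counts[word] for word in vocab) / total
        for name, vocab in categories.items()
    }


sample = "We stand by what we said, and I think the plan is a good one."
print(category_percentages(sample))
```

A real analysis would run over a full debate transcript with much larger dictionaries, but the arithmetic is the same: category hits divided by total words, times 100.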

Honesty.  Over the last 10 years, more than a dozen studies of all kinds have analyzed the language of honesty and deception.  At least five language dimensions are reliably linked with honesty and another 3-4 are associated with lying.  People are more likely to be telling the truth if a) their sentences are longer and more complex, b) they use I-words more (e.g., I, me, my), c) they use bigger words, d) they make more references to time and motion, and e) they use more self-reflective words such as realize, understand, and think.  The best markers of deception are would-should-could verbs, positive emotion words, and you-words.  Averaging across all these dimensions, Gordon Brown comes across as having the most honest language profile.  Clegg comes in a distant second with Cameron close behind Clegg.

Want an honest PM?  Vote for Gordon Brown.

Language Markers of Honesty

Marker                Brown  Cameron  Clegg  Direction
Words per sentence    20.66    17.38  18.98  High
Big words             18.90    16.26  16.69  High
Conjunctions           6.98     5.52   4.99  High
I-words                1.93     2.18   2.70  High
Motion                 2.18     2.32   1.45  High
Time                   4.84     4.16   4.34  High
Insight                1.53     2.02   2.20  High
Would-should           2.14     3.04   2.96  Low
Positive emotions      3.08     2.75   3.15  Low
You-words              1.66     1.87   2.44  Low

Except for average number of words per sentence, all numbers refer to percentage of total words used by each of the candidates. The Direction column indicates what numbers are associated with high honesty.  That is, honesty is associated with a high rate of words per sentence and low rates of would-should-could words.
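The text says the dimensions were averaged but does not say how. One plausible way to combine markers measured on different scales is sketched below: standardize each marker across the three speakers, flip the sign of the "low" markers, and average. This aggregation rule is our assumption, not a description of the authors' calculation, and the function name is hypothetical.

```python
from statistics import mean, pstdev

CANDIDATES = ("Brown", "Cameron", "Clegg")

# (marker, (Brown, Cameron, Clegg), direction associated with honesty),
# copied from the table above.
MARKERS = [
    ("Words per sentence", (20.66, 17.38, 18.98), "high"),
    ("Big words", (18.90, 16.26, 16.69), "high"),
    ("Conjunctions", (6.98, 5.52, 4.99), "high"),
    ("I-words", (1.93, 2.18, 2.70), "high"),
    ("Motion", (2.18, 2.32, 1.45), "high"),
    ("Time", (4.84, 4.16, 4.34), "high"),
    ("Insight", (1.53, 2.02, 2.20), "high"),
    ("Would-should", (2.14, 3.04, 2.96), "low"),
    ("Positive emotions", (3.08, 2.75, 3.15), "low"),
    ("You-words", (1.66, 1.87, 2.44), "low"),
]


def composite_scores(markers):
    """Average direction-adjusted z-scores per speaker (assumed aggregation)."""
    totals = {name: 0.0 for name in CANDIDATES}
    used = 0
    for _label, values, direction in markers:
        mu, sigma = mean(values), pstdev(values)
        if sigma == 0:
            continue  # no spread, so this marker cannot separate the speakers
        sign = 1.0 if direction == "high" else -1.0
        for name, value in zip(CANDIDATES, values):
            totals[name] += sign * (value - mu) / sigma
        used += 1
    if used == 0:
        return totals
    return {name: total / used for name, total in totals.items()}


scores = composite_scores(MARKERS)
for name in sorted(scores, key=scores.get, reverse=True):
    print(f"{name}: {scores[name]:+.2f}")
```

Under this particular rule Brown still comes out on top. The order of the other two is sensitive to how the markers are weighted and combined, so the sketch should not be read as reproducing the ranking reported above.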

Thinking style.  The various text analysis methods find that all three candidates are quite bright.  They do, however, think differently.  One interesting difference in thinking is how people break down a complex problem.  For example, if confronted with a new challenge, one strategy is to reduce the problem into its component parts.  To do this, people generally use concrete nouns (which are reflected in the use of articles) and, the more specific they become, the more they need prepositions and other linguistic devices (such as relativity words) that reflect specific concepts, objects, and things.  We refer to this as analytic thinking.  Another approach is to trace the evolution of problems and project how they will change in the future.  Looking at how events unfold over time requires more verbs, especially past and future tense.  This is often called dynamic thinking.

Gordon Brown is quite analytic in his approach whereas David Cameron is strongly dynamic.  Nick Clegg is midway between the other two on both of these dimensions.  If you would like to see examples of these differences in thinking, read how the three responded to a question about what their party would do to help families pay for housing:

Gordon Brown: …there is a pent-up demand for housing in our country. There are one million more home owners than there were just over ten years ago, so more people are buying their homes. …Shared equity is something that might be considered because that’s a chance to buy up a part of your house, and it’s become a more popular way of doing things and we are able to help finance that and work with the building societies and banks.

David Cameron: I have every sympathy with you because, frankly, today in our country, people who try and work hard and save, and obey the rules, and do the right thing. All too often, they just find hurdle after hurdle put in their way, whereas people who actually don’t play by the rules, who don’t think about saving and don’t think about their behaviour often get rewarded and that’s not right.

Nick Clegg: … this is one of the things that I, along with immigration, actually, that I probably hear about more than anything else as I travel around the country, a lack of affordable housing as I travel round the country. The lack of affordable housing. The people in your situation, but then there are, I think, 1.8 million families, that’s five million people, who are still on the waiting list for an affordable home.

As you can see, Brown is coolly analytic about the problem, evaluating what aspects of the economy may be contributing to the problem.  Cameron traces what people have done and are doing.  He sees the problem more as the action of others in the past and the present.  Clegg actually doesn’t think much about the problem at all but, instead, later talks about how he would fix it.

Want an analytic thinker?  Vote Brown.

Want a dynamic thinker?  Vote Cameron.

Want someone who is somewhere in between?  Vote Clegg.

Caveat: Linking natural language use with social, cognitive, and personality dimensions is a relatively new science.  It’s important to think of it in probabilistic terms.  Our approach is more accurate than flipping a coin but far from 100% accurate.  Also, these analyses are based purely on how the candidates spoke in the debates.  As we’ve seen, once the microphones are thought to be turned off, the candidates may actually talk and think differently from how they might appear on the international stage.  Finally, optimism, honesty, and thinking styles are important qualities of leaders.  Remember that there are dozens of other qualities that contribute to good leadership that we are not measuring here.  Consider these traits just the tip of the iceberg.

Is David Cameron this week’s new heartthrob?

By James W. Pennebaker and Raj Persaud

In the first UK Prime Minister debate on 15 April, Liberal Democrat Nick Clegg wowed the country with his warmth, humility, and charm.  An analysis with the LIWC computerized text analysis program found Clegg’s language during the debate to be distinctively personal, positive and honest compared to Gordon Brown’s academic distance.  David Cameron’s linguistic style was the least distinctive.

Before the last televised debate, Cameron was widely perceived as the heir apparent to the throne.  It is likely that his advisors were warning him that his primary task was to not slip up.  He certainly succeeded.  But given the surge in Liberal Democrat support following Nick Clegg’s unexpected performance last week, the demands on Cameron may have altered. Now, with the crown apparently slipping from his grasp (polling showed Liberal Democrats had pushed the Conservatives into second place), a change in approach may have been forced upon Cameron.

Oh what a difference a week makes in politics… and in linguistic analysis.

Text analyses of the 22 April debate in Bristol suggest a flipping of linguistic roles between Cameron and Clegg.  David Cameron, by using the most I-words, is now giving the impression of being more personal than his two competitors.  He also practically bubbled with upbeat language – the role that Nick Clegg grabbed last week. He has also adopted Clegg’s strategy of using high rates of present tense verbs and not referring much to the past.

In last week’s debate, Nick Clegg used more personal language (more I-words for example), more positive emotion words, and tended to talk in the present tense at the highest rates.  These are strong indicators of psychological immediacy, in other words, he was speaking more in the here-and-now.   These same language markers have also been found to correlate with truthful – as opposed to deceptive – language.

This week, Cameron has passed Clegg in the personal-immediate-honest department.

Last week Cameron scored highest on negative emotion and Clegg the lowest; this week Cameron was lowest on negative emotion. The reason he has been able to pull off this switch is not just that he himself has dropped his use of negative emotion (last week negative emotion words were 1.85% of his total output compared to 1.52% this week), but also that Clegg allowed his use of negative emotion to climb (last week it was 1.35% of his total output compared with 1.61% now).

Cameron didn’t just adjust his game in what would appear to be an adept grasp of where he went wrong last time. Clegg doesn’t appear to have either understood where he went right, or if he did, he has for some reason struggled to keep his eye on the linguistic ball, ensuring he maintained the gap between himself and the others verbally and emotionally.

This could be worrying to Liberal Democrat strategists, as it may suggest that their candidate is going to struggle to maintain the distinctive persona over the next few weeks.

Much of the voter appeal of candidates is in how they communicate rather than just what they communicate.  This was particularly apparent in last week’s remarkable surge in Clegg’s popularity.  From relative obscurity, Clegg was a refreshing new face who spoke in what appeared to be a direct, honest, and upbeat way.  We may not know what he said but he said it so well.

The fluidity of the two challengers is striking in comparison with rock steady Brown.  The prime minister continues to be the least personal of the three in his use of pronouns and I-words.  He displays a thinking style that reflects a natural strategy of organizing complex ideas into highly specific concrete categories more than his opponents (as can be seen with his use of articles and prepositions).  Brown’s way of speaking is predictable, and we may not be able to expect much alteration from him. This could be worrying to Labour strategists who, if Labour continues to bump along at the nadir of third place in many opinion polls, might be hoping for a transformational response from Brown.

It also might suggest that the outcome of this election is going to turn most on how Cameron and Clegg continue to evolve.

Ominously for the Liberal Democrats given what a euphoric week they have just had, Cameron seems to have sidled his way into Nick Clegg’s verbal territory. This could explain the immediate reaction opinion polls which generally have not put Nick Clegg as the overwhelming winner, compared to last time, and instead have either put Cameron marginally ahead or, more generally, put all three leaders at much more level pegging compared to last week.

From a language perspective, just as Cameron has shown a more personable side, the linguistic analysis suggests that Nick Clegg appears to be taking himself a bit more seriously.  This week, he is meaningfully lower than Cameron in personal pronouns in general and I-words in particular.  He also seems to be censoring his own thoughts and feelings compared to last week with a large jump in his use of negations (words such as no, not, never).

As the debates continue it’s not only the differences between the candidates that lend themselves to linguistic and psychological analysis, but also how they adapt to polling reaction and each other. It’s not just where you are that counts in a political campaign, but also your ability to adapt and change as circumstances evolve.

Bottom line: Gordon Brown continues to be Gordon Brown.  Both David Cameron’s and Nick Clegg’s voices are evolving and they may be becoming key influences on each other, with Cameron so far learning the right lessons, while Clegg has at least temporarily apparently lost his grip on the distinctive strategy that previously put him ahead.

Category            Examples          Brown  Cameron  Clegg
Word count                             5863     5846   5963
Big words                             17.77    16.34  17.02
Personal pronouns                      9.89    10.74   9.22
I-words             I, me, my          2.10     3.03   2.52
We-words            We, us             4.32     4.64   3.40
Articles            A, an, the         7.66     6.79   6.78
Verbs               Is, ran           18.44    19.36  17.34
Past tense          Was, ran           2.95     2.50   2.80
Present tense       Am, feel          12.76    13.84  12.66
Future tense        Will, shall        1.42        F   0.89
Adverbs             Very, so           3.58     5.25   5.32
Prepositions        On, to            15.03    13.07  13.85
Conjunctions        And, but           6.28     5.44   5.03
Negations           No, not, never     1.38     1.54   2.21
Positive emotions   Love, nice         2.81     3.71   3.00
Negative emotions   Cry, hate          1.84     1.52   1.61
Anxiety             worry              0.48     0.22   0.25
Anger               kill               0.56     0.58   0.42
Sadness                                0.19     0.14   0.17
Cognitive words     think             18.73    22.02  18.87
Insight             Realize            2.06     2.53   2.53
Cause               Cause              2.52     2.33   2.30
Discrep             Would, should      2.66     3.54   2.80

To learn more about the text analysis work, go to www.psy.utexas.edu/Pennebaker and click on the “Explorations into Language” link.

Dr. Raj Persaud, FRCPsych MSc MPhil, is a consultant psychiatrist and Visiting Gresham Professor for Public Understanding of Psychiatry in London.

A companion piece to this blog can be seen at the New Scientist website: http://www.newscientist.com/blogs/thesword/2010/04/uk-election-psychology-of-the.html

by James W. Pennebaker

For the first time, the UK is trying out an American-style televised debate as part of its 2010 elections.  Thursday night, April 15, the three party leaders, the current Prime Minister Gordon Brown, the Conservative Party’s David Cameron, and the Liberal Democrats’ Nick Clegg, met in Manchester for their first 90 minute debate.  By all accounts, it followed the American script nicely – polite, uninspired, and rather forgettable.

The good news is that all three candidates used a lot of words – which was all I was hoping for.  For the last several years, my colleagues and I have been intrigued by how the words people use in natural conversation reflect their social and psychological states.  Debates are fertile testing grounds because they are not completely scripted and the speakers are public figures who are used to being scrutinized.

As with the American presidential debates in 2004 and 2008, I was able to analyze the word use of the three candidates in order to get a sense of their personality, social, and thinking styles.  Unlike most language analytic methods, the strategy that my colleagues and I rely on focuses on the almost-invisible function or “junk” words that are common in speech – pronouns, prepositions, articles, conjunctions, and auxiliary verbs.  Junk words can be distinguished from the more familiar content words made up of nouns and regular verbs.  Whereas content words tell us what the person is talking about, junk words convey people’s linguistic styles.

Over the last several years, we have found that junk words are powerful correlates of personality, emotional state, and social styles.  For example, we can often tell if people are depressed, honest, arrogant, socially connected, and how they think by looking at their use of junk words.  Interestingly, junk words are processed in the brain differently than most content words and are very difficult to detect in natural conversation.  Consequently, all of our analyses are based on transcripts which are run through our computer program, Linguistic Inquiry and Word Count, or LIWC (pronounced “Luke” – for more information, go to www.LIWC.net).

The 15 April debate was remarkably tame and all three men tended to talk in similar ways.  The most striking differences emerged between Gordon Brown and Nick Clegg, with David Cameron somewhere in between.

Brown, who talked more than the others, used language that was emotionally and psychologically distant.  He was by far the least personal as indicated by his low use of 1st person singular pronouns, or I-words.  Instead of using “I”, he tended to use “we” – a sign of distancing we often see in less adept politicians (John Kerry and Al Gore were both big “we” users).  Consistent with this, Brown also used negative emotion words, especially words that signaled anxiety, at the highest rates.

In comparison, Nick Clegg was far more personal (more I-words), used the most positive emotion words, and tended to talk in the present tense at the highest rates.  All of these language dimensions are markers of psychological immediacy.  That is, Clegg, unlike the other two candidates, presented his arguments in the here-and-now.  It is also interesting that Clegg’s style is the one that is most consistent with telling the truth.  People who are honest and not trying to hide anything use more I-words.

Cameron’s style was the least distinctive.  Like Brown, he was high in negative emotion words, but more angry than anxious.  He tended to be a bit more moralistic (using words like would, should, and could), less specific, with a greater focus on money-related issues.

In terms of thinking styles, Brown’s language was the most complex and interesting.  In comparison with the other two men, Brown was more concrete, focusing on particular objects and things (as indicated by his use of articles such as “a” and “the” – words that are needed with concrete nouns).  Like Obama, he also used more verbs than the other candidates, often a sign of more dynamic thinking.  Compared to Brown, both Cameron and Clegg used relatively more cognitive or thinking words – words such as think, realize, understand, because.  People generally use cognitive words when they are still trying to construct a story.  In other words, Cameron and Clegg are still trying to come up with ways to frame their thinking compared to Brown who already has a story in his head.

These language analyses should not be taken too seriously.  As the debates unfold, we will get a much better sense of the stability of the language use of the three men.  The first debate was a unique setting for all three politicians and, once they become accustomed to the setting, their natural ways of speaking will leak out more.

Category            Examples          Brown  Cameron  Clegg
Word count                             7212     6712   6379
Pronouns                              16.53    17.24  18.03
I-words             I, me, my          2.37     2.62   2.99
We-words            We, us             4.73     4.16   3.15
Articles            A, an, the         7.57     7.05   6.82
Verbs               Is, ran           18.66    18.19  16.95
Past tense          Was, ran           4.44     3.37   2.57
Present tense       Am, feel          11.69    12.44  12.60
Future tense        Will, shall        1.41     1.00   0.88
Adverbs             Very, so           3.81     4.81   5.78
Prepositions        On, to            14.86    13.47  13.39
Conjunctions        And, but           6.28     5.66   4.94
Positive emotions   Love, nice         2.69     2.59   3.04
Negative emotions   Cry, hate          1.83     1.85   1.35
Anxiety             worry              0.53     0.30   0.16
Anger               kill               0.58     0.82   0.56
Cognitive words     think             17.30    19.04  17.87
Insight             Realize            1.91     2.06   2.85
Causal thinking     Cause              2.02     1.85   1.76
Discrepancy         Would, should      2.09     2.83   1.94

To learn more about the text analysis work, go to my website, www.psy.utexas.edu/Pennebaker and click on the “Explorations into Language” link.

by James W. Pennebaker

Most years since George Washington, the President of the United States has addressed the joint sessions of Congress along with leaders in the military, judiciary, and other parts of government in a public speech.  The purpose of the address is to summarize the accomplishments and problems of the nation and to lay out plans and expectations for the coming years.  Although the tone of the State of the Union addresses changes from year to year, the occasion is generally a mixture of a sober analysis and political undertones.

The address is typically written, at least in part, by the president with help from experts, speechwriters, and aides. Nevertheless, it generally reflects the leader’s intentions, values, emotional and thinking styles, and personality.  Unlike the inaugural address, which is delivered to the nation once every four years, State of the Union (SOU) talks are delivered annually to the country’s governing body.  The SOU, then, is a more business-like and detail oriented communication intended to direct Congress to move in specified directions.

As has been discussed elsewhere, the words people use reflect their social and psychological states.  When analyzing people’s communication, it is possible to separate what they are saying from how they are saying it. That is, different words reflect the content of the communication and others reveal the style of the message.  Very broadly, linguistic content is conveyed through the use of nouns, regular verbs, and some adjectives and adverbs.  Language style is apparent through a group of words variously referred to as function, style, or junk words.  These style-related words include pronouns, prepositions, articles, conjunctions, and auxiliary verbs.

Style or function words are quite different from content words in that there are very few of them, they are used at high rates, are processed in the brain differently, and are quite social.  For example, of the 50,000 to 100,000 words most English speakers have in their vocabulary, only about 500 are function words.  Despite the small number of these words, we use them in almost every sentence. In fact, 50-60 percent of all the words we use are style words.  Of particular significance, these style words are social in the sense that they require a shared understanding between speaker and listener.

Over the last several years, multiple studies have found that the analysis of function words can reflect psychological dimensions of speakers.  Laboratory and real world studies indicate that pronouns and other style words predict a speaker’s honesty, social status, emotional state, social connections with others, dominance, and thinking style.  Function words are linked to people’s immediate psychological state within a given context and also can provide a broader view of their personality across situations and time.

SOU addresses are a perfect opportunity to study the psychological features of the nation’s leaders within relatively formal contexts.  Unlike most speeches, SOUs are generally given in the same location, to the same types of dignitaries, at the same time of the year.  Although the speeches themselves have undoubtedly been shaped by others, they continue to reflect the personality and thinking of the president and his staff.

The Current Analyses – with special attention on Obama

All of the SOU addresses from Truman to Obama spanning from 1946 through 2010 were analyzed using the computerized text analysis program LIWC (Pennebaker, Booth, & Francis, 2007).  LIWC analyzes each speech, calculating the rates at which over 70 categories of words are used.  In addition, six broader categories of language are calculated based on previous research.

Social-emotional style.  Many speakers work to establish a close personal relationship between themselves and their audience.  Markers of this warm interpersonal style include the use of personal pronouns, high rates of positive and negative emotion words, and references to other people.  In general, people scoring higher on the social-emotional style dimension are individuals who truly enjoy talking and connecting with others.  As can be seen in Figure 1, there has been a fascinating evolution in social-emotional language over the last 65 years – from very low social-emotional language to the second Bush’s peak.  Obama is reversing this trend.  Not as emotionally or socially detached as Nixon and earlier presidents, his style is comparable to Reagan’s.

Figure 1: Social-emotional style.  Higher numbers reflect use of more personal pronouns, references to other people, and emotional words.

Positive emotionality.  Speakers differ in the degree to which they convey positive and negative feelings in their speeches.  An overall positive emotionality index was computed by subtracting the percentage of negative emotion words from the percentage of positive emotion words.  The higher the number, the more the speaker conveys optimism and the less he uses words that convey feelings of sadness, anxiety, or anger.  As can be seen in the second figure, Eisenhower, Carter, Reagan, and Clinton were consistently the most positive in their SOU addresses. Obama is striking in being the least positive.

Figure 2. Positive emotionality.  The higher the number, the more the person uses positive emotion words relative to negative emotion words.
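The index itself is simple subtraction, as this short Python sketch shows. The per-president SOU percentages are not listed in the text, so the inputs here are the 22 April debate percentages from earlier on this page, used purely as stand-in numbers; the function name is ours.

```python
def positive_emotionality(positive_pct, negative_pct):
    """Positive emotion word percentage minus negative emotion word percentage."""
    return positive_pct - negative_pct


# Illustrative inputs only: (positive %, negative %) from the 22 April debate
# table earlier on this page. These are NOT State of the Union figures.
debate_rates = {
    "Brown": (2.81, 1.84),
    "Cameron": (3.71, 1.52),
    "Clegg": (3.00, 1.61),
}

for speaker, (positive, negative) in debate_rates.items():
    print(f"{speaker}: {positive_emotionality(positive, negative):.2f}")
```

With these debate numbers Cameron has the highest index of the three, which fits the earlier description of his upbeat language that week.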

Complex thinking.  An SOU address requires a certain degree of finesse to be effective.  The president needs to convey complex ideas in ways that a broad audience can understand.  Most issues facing a country – such as health care, national security, immigration – are composed of multiple dimensions that are often difficult to discuss in a simple way.  Since Truman, presidents have varied tremendously in their attempts to talk about large issues in complex ways.  Most opt to define problems simply and propose relatively straightforward solutions.

Function words allow for a nice metric to capture complexity of thinking.  When people are dealing with complicated problems they must acknowledge multiple sides to an issue.  Certain exclusive words – including but, except, without, or – signal that the speaker is making a distinction between what is and what is not included in the idea he is conveying.  Similarly, other word categories such as negations (e.g., no, not, never) and causal words (e.g., because, cause, effect) also reflect more complex thinking.

Figure 3 is a striking graph in suggesting that two presidents have been extraordinarily high in complex thinking – John F. Kennedy and Obama. Nixon and George H. W. Bush are a distant 3rd and 4th.  It is also interesting that both Bush-2 and Clinton are two of the least complex thinkers in their SOU addresses.

As a side note, the complex thinking dimension simply reflects the language that the president uses in the SOU address.  He may actually be a very complex thinker in general so these numbers merely tell us how he is presenting ideas to the congress and the American people.

Figure 3. Complexity of thinking.  The higher the number, the more complex and nuanced the language in the presentation of arguments.

Categorical versus dynamic thinking.  There are multiple ways to break down a complex problem.  Perhaps the most traditional method is to try to categorize the issue.  For example, if asked to evaluate the current economy, a categorical thinker would likely identify the various components, then the subcomponents.  In other words, the categorical thinker sees the first issue in approaching a new task as creating the relevant categories and then breaking down the problem to fit into the boxes that have been constructed.  People who are high in categorical thinking tend to use a high rate of concrete nouns, articles, and prepositions.

A very different approach is called dynamic thinking.  Dynamic thinking involves evaluating a new problem from a historical or developing perspective.  Instead of first evaluating the categories or dimensions associated with the problem, the dynamic thinker tracks how we have arrived at the problem, thereby tracking the problem over time.  If asked to evaluate the economy, the dynamic thinker may start with a point in the past and trace how historical forces have brought us to today’s economy.  Dynamic thinking is generally measured by the high use of verbs.  Interestingly, the more that people use verbs, the less they use nouns – suggesting that people tend to be either categorical or dynamic thinkers but not both.

Figure 4 reveals two fascinating trends.  The first is the evolution of dynamic thinking over time.  In the last 65 years, a striking shift in thinking emerged beginning in the 1980s.  With the election of Reagan, presidents moved from displaying categorical thinking to being more dynamic in the ways they discussed complex issues.  Every president since then has followed this trend.  Obama is striking in being by far the most dynamic and least categorical thinker in the modern presidency.

Figure 4.  Categorical versus dynamic thinking.  Higher scores reflect categorical thinking whereas lower (or more negative) scores indicate dynamic thinking.
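The exact scoring behind Figure 4 is not given here. As a rough illustration of the idea, the Python sketch below assumes a simple unweighted index: article and preposition percentages (the categorical markers) minus the verb percentage (the dynamic marker), so higher means more categorical. Both the formula and the function name are our assumptions, and the inputs are the 22 April debate percentages from earlier on this page rather than SOU data.

```python
def categorical_dynamic_index(articles_pct, prepositions_pct, verbs_pct):
    """Assumed index: higher is more categorical, lower is more dynamic."""
    return (articles_pct + prepositions_pct) - verbs_pct


# Illustrative inputs only: (articles %, prepositions %, verbs %) from the
# 22 April debate table earlier on this page. These are NOT SOU figures.
debate_rates = {
    "Brown": (7.66, 15.03, 18.44),
    "Cameron": (6.79, 13.07, 19.36),
    "Clegg": (6.78, 13.85, 17.34),
}

ranking = sorted(
    debate_rates,
    key=lambda name: categorical_dynamic_index(*debate_rates[name]),
    reverse=True,
)
print(ranking)  # most categorical first
```

Even this crude version puts Brown at the categorical end and Cameron at the dynamic end with Clegg in between, in line with the debate commentary above. A real analysis would presumably standardize the categories before combining them.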

The Language and Personality of Obama’s State of the Union Addresses

Barack Obama thinks and relates to people differently from most of his predecessors. His thinking style is both highly complex and, at the same time, dynamic.  Socially and emotionally, he is surprisingly cool and distant.  The word “cool” is not ill-advised.  In his SOU addresses, as well as his press conferences, he is detached.  His use of both positive emotion and negative emotion words is much lower than recent presidents.  Although his personal pronouns in his SOUs are slightly above average, they are actually quite low when talking informally in interviews or press conferences.  His is the language of the confident leader as opposed to the close buddy.

Obama has now delivered two SOU addresses.  Has his language changed much from a year ago?  Very broadly, no.  If anything, he is becoming more dynamic in his thinking and slightly less positive in his emotional tone.  Overall, however, he maintains a remarkably even style in the ways he talks to his audiences.

References

Chung, C.K., & Pennebaker, J.W. (2007). The psychological functions of function words. In K. Fiedler (Ed.), Social communication (pp. 343-359). New York: Psychology Press.

Pennebaker, J.W. (August 9, 2009).  What is “I” saying? (guest post).  The Language Log. http://languagelog.ldc.upenn.edu/nll/?p=1651

Pennebaker, J.W., Mehl, M.R., & Niederhoffer, K.G.  (2003).  Psychological aspects of natural language use:  Our words, our selves.  Annual Review of Psychology, 54, 547-577.

Slatcher, R.B., Chung, C.K., Pennebaker, J.W., & Stone, L.D. (2007).  Winning words: Individual differences in linguistic style among U.S. presidential and vice presidential candidates.  Journal of Research in Personality, 41, 63-75.

Relevant Websites

http://www.psy.utexas.edu/Pennebaker

http://www.wordwatchers.wordpress.org

http://www.utpsyc.org

http://www.analyzewords.com

Language of the Media — I

November 2, 2008

 

by Vera Vine and James W. Pennebaker

 

An important part of the 2008 election is the language of the mainstream media. Accusations of media bias fly from both sides of the aisle, including the supposed deep-seated liberal (or, sometimes, conservative) bias of television and newspaper reporting. But without a concrete metric for assessing media bias, arguments about it often descend into partisan maneuvering. Our text analysis software program, Linguistic Inquiry and Word Count (LIWC; see liwc.net), can help to quantify some of the media’s language. We focused on three major newspapers: the New York Times, the Washington Post, and the Wall Street Journal.

 

What we did:

 

Overall, 138 news reports were collected, comprising 46 topics covered by each of three newspapers, The New York Times (NYT), The Wall Street Journal (WSJ), and the Washington Post (WP), spanning the period beginning with the formation of the first presidential ticket on August 22, 2008, through the launching of the final week of campaigning on October 27, 2008. These newspapers were chosen because of their independence (each is owned by a different company), large readership, and reputations for influential and exemplary reporting. As of November 2, Obama has been endorsed by the NYT and WP; the WSJ has not endorsed anyone – although it has a conservative reputation.

 

To make comparison possible, news reports were selected so that each news story had counterparts with identical topics and similar dates to the other two newspapers. Thirteen articles from each paper were about Barack Obama’s campaign, thirteen were about John McCain’s campaign, eleven covered the U.S. economic crisis, and nine covered general election news concerning both parties equally (e.g., debates, shifts in polls).

 

What we found: Comparing the campaign coverage within each newspaper:

 

The New York Times:

The NYT articles about the McCain campaign were longer than those about the Obama campaign (on average almost 250 words longer). Pronoun use also differed: the NYT used significantly more impersonal pronouns when covering the McCain campaign, and more “you” when covering the Obama campaign.

 

The Washington Post:

The WP used shorter sentences when covering the Obama campaign than they did with the McCain campaign. When covering Obama, the WP also used more personal pronouns, particularly “I” and “you,” and more verbs. The WP’s coverage of the Obama campaign is also nearly significantly higher on the index of “immediacy,” a factor thought to indicate informal style (Pennebaker & King, 1999).

 

The Wall Street Journal:

The WSJ had the fewest differences between coverage of the two campaigns. Although no differences reached the level of significance, some trends suggest that the WSJ’s language when covering McCain’s campaign contains more negations, more anxiety words, more certainty words (“absolute,” “certainly”), and more exclusive words (“except,” “but”).

 

Taken together, these results suggest that the WSJ may actually be less biased than the NYT and WP in its political news reporting, despite a more conservative reputation. These results may be consistent with another study of media bias conducted by a group of political scientists (Groseclose, T. & Milyo, J. (2005). A Measure of Media Bias. The Quarterly Journal of Economics, 120, 1191-1237).

 

Comparing the mentions of candidates’ names:

 

Not unexpectedly, news stories in all papers said “Obama” and “Biden” more when reporting on the Obama campaign, and “Palin” more when reporting on McCain’s. What is somewhat surprising is that the newspapers referenced McCain much more freely, regardless of which campaign was the focus of the news report, which might suggest a preoccupation with McCain, or a tendency to consider news about Obama in light of McCain’s activities.

 

As for mentions of George W. Bush, considered by many to be the specter haunting this election, there was a trend suggesting higher rates of use of “Bush” when covering Barack Obama’s campaign, but only in the NYT. Obama sought to link McCain to the Bush administration, so perhaps the NYT has more coverage of these Obama talking points than the other newspapers do. Or perhaps this difference suggests that McCain’s attempts to distance himself from Bush may have been somewhat successful.   

 

Comparing the newspapers with each other:

 

Despite the differences in language between the coverage of the campaigns, the overall styles of the three newspapers were fairly similar when news reports on all 4 topics were taken together. When the language did differ, it tended to be in the expected directions based on the papers’ respective areas of expertise. For example, the WSJ articles were the least personal in their writing style, using fewer social words and more quantifiers (e.g., “much,” “fewer”) and impersonal pronouns (“it,” “that,” “those”).  The WSJ language also included shorter sentences, fewer function words (i.e., non-content words including pronouns, prepositions, and particles), less use of “we” and “they,” fewer verbs of almost all types, fewer exclusive words (such as “except,” “but”), and fewer cognitive mechanism words (“think,” “know”).

 

Long considered the “writers’ newspaper,” the WP used longer sentences, more “we,” more present tense and less past tense, fewer quantifiers, and somewhat fewer cognitive mechanism words.

 

The take-away:

 

The emotional tone of the coverage of the two candidates was surprisingly even-handed across all three newspapers.  There was a weak trend suggesting a more personal tone in reporting on Obama’s campaign by the WP and NYT.

 

The next step will be to tease apart the linguistic styles of the reporters. For example, does the more personal and dynamic quality of reporting on Obama come from the language the reporters bring to the table, or from the oratorical style of things Obama is quoted as saying?  This is ultimately the dilemma in understanding any translation: Is the message an accurate account of the original speaker or does it reflect the psychological makeup of the translator?

by Molly Ireland

Most of us can probably recall times when we felt powerfully in sync with a person during a conversation, for better or worse. While in friendly situations synchrony often translates to simultaneous laughter and increased rapport, in less friendly contexts synchrony might take the form of synchronized suspicion and mutual outrage that the other person refuses to bend to our will.

Language Style Matching. In our lab at the University of Texas at Austin, we’ve been studying a specific kind of verbal synchrony which we call Language Style Matching, or LSM. Style matching is measured by comparing the way two sides of a conversation or two texts use function words, like pronouns, articles, and conjunctions. If two people use similar proportions of, for example, personal pronouns, then their LSM score in that individual LIWC category (see liwc.net) will be high. The comprehensive LSM metric we use to assess overall synchrony is the average of nine function word categories’ matching scores. As other entries on this site discuss, the way a person uses function words is often the key to predicting what they’re thinking and feeling at the time and how they are likely to behave in the future. So if LSM for a conversation is high, then odds are good that everybody’s in the same mindset – perhaps even if they don’t agree with or like each other.
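For readers who want to see the arithmetic, here is a minimal sketch of how an LSM score could be computed. The post above does not give the formula, so two things here are assumptions: the per-category matching formula (the form reported in published LSM work) and the labels for the nine function word categories. The input numbers are made up for illustration.

```python
# Hypothetical sketch of a Language Style Matching (LSM) score.
# Assumption: each category's matching score is
#     1 - |a - b| / (a + b + 0.0001)
# where a and b are the two speakers' usage rates (percent of total words).
# The post describes only the averaging step, not this formula.

# Assumed labels for the nine function word categories (illustrative only).
CATEGORIES = [
    "personal_pronouns", "impersonal_pronouns", "articles",
    "auxiliary_verbs", "common_adverbs", "prepositions",
    "conjunctions", "negations", "quantifiers",
]


def category_match(a: float, b: float) -> float:
    """Matching score for one category; 1 means identical usage rates."""
    return 1.0 - abs(a - b) / (a + b + 0.0001)


def lsm(speaker_a: dict, speaker_b: dict) -> float:
    """Overall LSM: the average of the per-category matching scores."""
    scores = [category_match(speaker_a[c], speaker_b[c]) for c in CATEGORIES]
    return sum(scores) / len(scores)


# Invented usage rates (percent of total words), not real data.
interviewer = dict(zip(CATEGORIES, [9.5, 6.0, 7.0, 9.8, 4.5, 13.0, 6.2, 1.7, 2.4]))
candidate = dict(zip(CATEGORIES, [10.2, 6.7, 6.8, 10.3, 4.1, 13.2, 6.6, 1.5, 2.6]))

print(f"LSM = {lsm(interviewer, candidate):.2f}")
```

The small constant in the denominator only guards against division by zero when neither speaker uses a category at all.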

Lab studies. In collaboration with Amy Gonzales and Jeff Hancock at Cornell, we found that in certain cooperative settings LSM is positively correlated with how much group members like each other and can help predict how well a group performs on a task. Recently we analyzed language in a more competitive setting, in transcripts from negotiation studies conducted at the University of Chicago by UT’s Marlone Henderson. Preliminary evidence indicates that LSM is not always a good thing. Higher LSM predicts poorer overall negotiation outcomes for participants who have been experimentally manipulated to approach the negotiation less objectively. For people with objective distance from the negotiation, LSM didn’t predict performance. (In general, objectivity leads to better performance, and being too close to an issue or a negotiation partner leads to poorer negotiation outcomes.)

As with all of the language research discussed here and elsewhere, it’s probably a bad idea to jump to conclusions about what, taken together, these two sets of findings mean for style matching that occurs in real life. But we do know that LSM is a reliable measure of function word synchrony, and we know that function words are themselves reliable predictors of psychology and behavior both in and out of experimental labs. Beyond that, we can safely guess that while style matching often leads to rapport, it can also indicate mutual stubbornness or distrust that, ironically, makes it harder to find common ground. And, more speculatively, to the degree that we can control when and how we style match, good communicators probably know when to follow another’s conversational lead and when to step out of sync.

The Candidates. Using online transcripts from news media websites, I looked at how the presidential candidates match their interviewers’ function word use. Hypothetically, LSM should be highest both when interviewer and interviewee are trying to make each other look good and when the two are at loggerheads. Low LSM might mean that the interviewer is trying to find the truth and the interviewee is focused on misdirection, or vice versa. Here’s how each presidential candidate matched with his interviewers (LSM scores in parentheses; 0 is perfectly out of sync, 1 is perfectly matched):


Barack Obama
1. Larry King (CNN host) (0.93)
2. Katie Couric (CBS anchor) (0.92)
3. Bill O’Reilly (conservative FOX News host) (0.90)
4. Michael R. Gordon and Jeff Zeleny (New York Times staff) (0.90)
5. Amanda Griscom Little (staff for Grist, environmental newspaper) (0.89)
6. Terry Moran (ABC News reporter) (0.89)
7. Chicago Sun-Times staff (0.87)
8. Cathleen Falsani (Chicago Sun-Times religious columnist) (0.86)
9. Jeffrey Goldberg (The Atlantic staff) (0.80)
10. Rick Stengel (TIME magazine editor) (0.76)


John McCain
1. Adam Nagourney and Michael Cooper (New York Times staff writers) (0.94)
2. Sean Hannity (conservative FOX reporter) (0.92)
3. Michael Reagan (conservative talk radio host) (0.91)
4. Pittsburgh Tribune staff (0.91)
5. Peter Jennings (ABC reporter) (0.91)
6. Larry King (CNN host) (0.91)
7. Military Times staff (US Army newspaper) (0.90)
8. Martin Wisckol (Orange Co. online news reporter) (0.90)
9. Pastor Rick Warren (Evangelical minister) (0.90)
10. Larry Kudlow (economist, conservative CNBC host) (0.89)
11. George Stephanopoulos (liberal ABC reporter) (0.89)
12. Tim Russert (liberal NBC reporter) (0.86)
13. Financial Times (British financial newspaper) (0.85)
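As a quick arithmetic check on the two lists above, the mean score for each candidate can be computed directly. This sketch only averages the scores as listed; it says nothing about statistical significance.

```python
from statistics import mean

# LSM scores copied from the two lists above, in the order given.
obama = [0.93, 0.92, 0.90, 0.90, 0.89, 0.89, 0.87, 0.86, 0.80, 0.76]
mccain = [0.94, 0.92, 0.91, 0.91, 0.91, 0.91, 0.90, 0.90, 0.90,
          0.89, 0.89, 0.86, 0.85]

obama_mean = mean(obama)
mccain_mean = mean(mccain)

print(f"Obama:  {obama_mean:.2f} across {len(obama)} interviews")
print(f"McCain: {mccain_mean:.2f} across {len(mccain)} interviews")
```

The averages work out to about 0.87 for Obama and 0.90 for McCain.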

What these numbers might mean. On average, Obama matches slightly less than McCain, although both generally are highly synchronized with their interviewers. This could reflect Obama’s tendency to be more cool-headed and distant than McCain. Interestingly, both McCain and Obama matched with interviewers whose opinions were most diametrically opposed to their own as much as they matched with staunch allies. For example, one of Obama’s highest matches was Bill O’Reilly. The O’Reilly interview was not smooth: both often talked over each other and little headway was made by either side. Perhaps O’Reilly represents one of Obama’s rare failures to step back and regain objectivity when faced with conflict. Here’s an illustration from the September 4th interview (arguing about the success of the surge in Iraq):

SEN. OBAMA: … It has gone very well, partly because of the Anbar situation and the Sunni –
MR. O’REILLY: The awakening, right.
SEN. OBAMA: — awakening, partly because the Shi’a –
MR. O’REILLY: But if it were up to you, there wouldn’t have been a surge.
SEN. OBAMA: Well, look –
MR. O’REILLY: No, no, no, no.
SEN. OBAMA: No, no, no, no, no, no, no.
MR. O’REILLY: If it were up to you, there wouldn’t have been a surge.
SEN. OBAMA: No, no, no, no. Hold on.
MR. O’REILLY: You and Joe Biden — no surge.
SEN. OBAMA: No. Hold on a second, Bill.

McCain matched most with his interviewers from the New York Times, a newspaper frequently cited by conservatives as liberally biased. The New York Times recently officially endorsed Barack Obama for president. McCain also matched very highly (nonsignificantly lower than his most synchronized interview) with Michael Reagan, a radio talk show host who, despite his conservatism, managed to outrage McCain via a telephone interview on January 31st of this year. Here’s an example from that interview:

REAGAN: Senator, Senator, Senator, Senator, Senator…
MCCAIN (talking over Reagan): …well worth talking about as well…
REAGAN: Senator!
MCCAIN: I’m not…
REAGAN: Senator!
MCCAIN: I asked you, Michael, if I could finish, can I finish?
REAGAN: But you did finish–
MCCAIN: Can I Finish? Can I finish? Yes or no?
REAGAN: What else do you have to say?
MCCAIN: Can I finish or not, I mean otherwise…
REAGAN: Go ahead.

Art Graesser, Moongee Jeon, and Zhiqiang Cai, University of Memphis

It is popular these days to analyze the language of candidates.  We use their language as a signal of their persuasive impact, their entertainment value, and eventually their votes.  This makes sense because language is the window to the thoughts and values of the candidates.

One popular recent approach is to analyze the words used by the candidates. Our colleague James Pennebaker at University of Texas has made the persuasive case that pronouns are important.  Another colleague, Jeff Hancock at Cornell University, has made the case that cognitive words signal deception.  These are all valid analyses.  But the point that we wish to make is that it is also important to dig deeper into the language, into sentence composition and the coherence of the message.  It is time to move beyond the word and into deep meaning. 

We have recently analyzed the nomination acceptance speeches of candidates to perform deeper computer analyses of language.  We used Coh-Metrix, the only computer tool free to the public that analyzes language on sentence composition and discourse coherence (that is, how ideas in sentences connect with other sentences in meaning). Coh-Metrix can be accessed via Google.  It was developed at the University of Memphis on a large grant from the Institute of Education Sciences to analyze the language and coherence of textbooks (with Danielle McNamara, Art Graesser, and Max Louwerse).

We analyzed the nomination acceptance speeches of the four nominees: Obama, McCain, Biden, and Palin.  We selected these speeches because they were all on an even playing field on importance and potential impact on the voters.  It is also perfectly obvious that the speeches are products of speech writers.  So we don’t know whether the conclusions are products of the nominees or their speech writers.  However, it is the candidate that is ultimately responsible for the messages.  We used Coh-Metrix to see how they are different.  So what did we learn?

Length Matters

There were differences among the acceptance speeches in the length of the speech and of the sentences.  The length of the speeches was approximately 3500 words, with the two presidential speeches about 50% longer than those of the vice-presidential nominees. The length of the sentences is an important consideration.  Obama was the obvious leader on this dimension.  His mean sentence length was approximately 20 words, whereas the rest of the pack averaged about 15 words.  We know that the grade level of messages is determined by the length of sentences and the length of words: longer words and sentences translate to a higher grade level (that is, greater difficulty).  We found that Obama was the leader in the grade level of the message according to the Flesch-Kincaid scale of readability (the most popular and accepted measure of readability of messages).  Obama’s speech had a 10th-grade level whereas the rest of the pack had a 7th-grade level.
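To make the grade-level idea concrete, here is a rough sketch of the standard Flesch-Kincaid grade formula, which combines words per sentence with syllables per word. The formula itself is the published one; the syllable counter and the sentence splitter below are crude stand-ins chosen for this illustration, so the output will not match Coh-Metrix or any other tool exactly.

```python
import re


def count_syllables(word: str) -> int:
    """Crude heuristic: count runs of consecutive vowels, with a minimum of 1."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text: str) -> float:
    """Grade = 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59."""
    # Assumption: a sentence ends at any run of '.', '!' or '?'.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


short_text = "The cat sat. The dog ran."
long_text = ("The extraordinarily complicated legislative proposal generated "
             "considerable disagreement among representatives.")

print(flesch_kincaid_grade(short_text))
print(flesch_kincaid_grade(long_text))
```

Longer sentences raise the first term and longer words raise the second, which is why the second sample scores far higher than the first.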

Content Words Don’t Matter Much

Pennebaker made the case that pronouns and function words are important indicators that differed among candidates.  True enough.  But what about content words?  These are nouns, main verbs, and adjectives.  We found that content words did not differ much among candidates.  Consider the 4 measures in the figure below – clearly no differences among candidates. There were no differences in the words’ concreteness, imageability, familiarity, and age of acquisition (defined as the age when most people learned the words) according to Coltheart’s MRC Psycholinguistic Database.  We suspect that nominees are coached on the words they use, which might explain why there is an even playing field on selection of content words in their speeches.

Sentences Differ Somewhat Among Candidates

Let’s go beyond the words into sentences.  We analyzed the syntactic complexity of sentences and the noun-phrase complexity.  These are shown below.  The syntactic complexity was approximately the same, except that McCain’s speech was a bit lower.   Palin’s noun-phrase complexity showed a slight advantage, measured as the number of adjectives that modify the nouns. The “hockey moms” and the “six pack dads” are richer noun-phrases than those of the male candidates.  

We found that the Democrats had a higher incidence of questions in their speeches than the Republicans. However, the presidential candidates had a higher incidence of negations.  Questions and negations are flags of uncertainty, openness, skepticism, and other dimensions of complexity.

Coherence of the Messages Differs among Candidates

Coh-Metrix analyzes the coherence of messages on dozens of measures.  Each measure analyzes the extent to which ideas are connected to each other logically and conceptually.  One measure assesses the extent to which adjacent sentences have common content words. The other measure assesses the extent to which adjacent sentences are semantically related.  The latter measure is based on latent semantic analysis, a statistical computation that is based on hundreds of dimensions of meaning (developed by Landauer, Dumais, and Kintsch).  The two presidential candidates had more coherent messages than the vice-presidential candidates on these coherence measures.
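The first of these two measures is simple enough to sketch. The code below is not the Coh-Metrix implementation; it is a toy version, under our own assumptions, that reports the share of adjacent sentence pairs with at least one content word in common. The short function word list and the sentence splitter are illustrative stand-ins, and the latent semantic analysis measure is not attempted here.

```python
import re

# Assumption: a tiny illustrative list of function words to ignore.
# A real analysis would use a much fuller list or a part-of-speech tagger.
FUNCTION_WORDS = {
    "a", "an", "the", "and", "or", "but", "of", "to", "in", "on", "for",
    "with", "is", "are", "was", "were", "be", "it", "that", "this", "we",
    "i", "you", "he", "she", "they", "our", "will", "not",
}


def content_words(sentence: str) -> set:
    """Lowercased words in the sentence that are not in the function word list."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    return {t for t in tokens if t not in FUNCTION_WORDS}


def adjacent_overlap(text: str) -> float:
    """Share of adjacent sentence pairs sharing at least one content word."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    pairs = list(zip(sentences, sentences[1:]))
    if not pairs:
        return 0.0
    shared = sum(1 for a, b in pairs if content_words(a) & content_words(b))
    return shared / len(pairs)


connected = "We need jobs. Jobs build families. Families build towns."
disconnected = "We need jobs. The sky is blue. Cats sleep often."

print(adjacent_overlap(connected))
print(adjacent_overlap(disconnected))
```

In the first sample every sentence picks up a content word from the one before it; in the second, no adjacent pair shares one.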

 
 
 

 

 

Conclusions

So what might we conclude from all of this?  One conclusion is that there is much more going on than words.  It is easy to think about words because they are simple, easy to train, and sometimes flashy.  However, we live in a complex world of ideas and solutions to complex problems.  It is important to also consider levels of language and discourse that move beyond the word and into deeper levels of meaning.

A second conclusion is that the nomination speeches of the presidential candidates are a notch above those of the vice-presidential candidates.  They are longer and more coherent, perhaps with the coaching of the speech writers.  It will be interesting to see how the unprepared discourse segments differ among candidates.  This will be our next question, with the assistance of Coh-Metrix.

A third conclusion is that the complexity of Obama’s language tends to rise to the top.  The speech length, sentence length, grade level, sentence syntactic complexity, noun-phrase complexity, questions, negations, and coherence were all at the top or among the top-two nominees. 

We will continue to analyze the speech of the four nominees.  Stay tuned.

 

 

by James W. Pennebaker

The third and final debate produced language patterns that were remarkably similar to the other two debates.  As before, McCain was slightly more personal and emotional than Obama.  McCain also used more future tense verbs.  Obama used words that suggested he was more cognitively complex, with longer words and more complicated sentences. In addition, he tended to use more exclusive words and tentative words (e.g., perhaps, maybe), which can also signal looking at the world from different perspectives.

As discussed in a previous blog, we have also found evidence to suggest that McCain and Obama have different thinking styles.  Whereas McCain tends to be more categorical in his thinking,  Obama is more fluid or contextual in the ways he approaches problems.  Categorical thinking involves the use of concrete nouns and their associated articles (a, an, the) and suggests that the person is approaching a problem by breaking it down into its component parts and attempting to put it in meaningful categories.  Fluid or contextual thinking involves a higher rate of verbs and associated parts of speech (such as gerunds and adverbs).

There were also a few departures in language use by the two candidates compared to their earlier debates.  Obama, for example, used more 1st person singular pronouns than his opponent for the first time in any debate we’ve analyzed.  This may be due, in part, to the fact that McCain used his “my friends” only once. Obama also used more achievement words than McCain, a category that has typically been reliably high for McCain.

Using the LIWC computer program, the differences in language usage between the two candidates, by category, in the third debate were as follows:

Category                      Examples              McCain   Obama    Interpretation
Word count                                          6596     7339     Obama talks more
Words per sentence                                  13.83    18.39    Obama longer sentences
Big words (over 6 letters)                          17.77    18.72    Obama bigger words
Personal pronouns                                   10.22    9.22     McCain more personal in general
   1st person singular        I, me, my             2.99     3.08
   1st person plural          We, our               2.71     3.05
   2nd person                 You, yours            1.91     1.39     McCain more pointed
   3rd person singular        He, she, her          1.33     0.63     McCain more reference to others
   3rd person plural          They, them            1.27     1.08
Indefinite pronouns           It, those             6.67     7.67     Obama more vague
Articles                      A, the                6.76     6.24     McCain more categorical thinking
Verbs                         Walk, went            15.74    16.65    Obama more fluid or contextual
Auxiliary verbs               Is, have              10.29    10.40
   Past tense                 Was, gave             3.35     2.68     McCain talks about things in the past
   Present tense              Am, is                10.01    12.06    Obama more present oriented
   Future tense               will                  1.39     0.91     McCain more future oriented
Common adverbs                Very, really          4.05     4.39
Prepositions                  To, for, of           13.22    13.35
Conjunctions                  And, or, whereas      6.55     6.21
Negations                     No, not, never        1.52     1.61
Quantifiers                   Much, few             2.59     3.08
Numbers                       Six, 12               1.65     1.72
Social references             Friend, we, talk      11.75    10.19    McCain more references to others
Overall emotion words         Happy, hurt, kill     5.43     5.01     McCain more emotional
   Positive emotions          Happy, nice           3.79     3.61
   Negative emotions          Sad, nasty, bad       1.65     1.43
      Anxiety, fear           Worry, scared         0.12     0.14
      Anger                   Angry, hate           0.59     0.31
      Sadness                 Depressed, cry        0.32     0.27
Cognitive mechanisms          Think, should         17.39    17.89
   Insight                    Realize, know         1.73     2.04
   Causal                     Because, reason       1.43     2.13     Obama more causal reasoning
   Discrepancy                Would, could          2.11     2.00
   Tentative                  Maybe, perhaps        1.52     2.15     Obama perspective difference
   Certainty                  Absolute, certainly   1.46     1.50
   Inhibition                 Blocked, stop         0.64     0.60
   Inclusive words            With, and             6.78     6.13     McCain over inclusive
   Exclusive words            Except, but           2.24     2.47
Relativity                    Times, going, over    11.61    12.11
   Motion                     Went, fly             1.65     2.02
   Space                      Area, under           5.81     5.78
   Time                       Hour, clock           3.70     4.05

Content Categories
Work                          Job, paycheck         3.99     4.86     Obama more references to work
Achievement                   Try, succeed          1.82     2.58     Obama higher in achievement words
Leisure                       Games, tv             0.47     0.44
Home                          Garage, yard          0.53     0.41
Money                         Cash, debt            3.21     3.19
Religion                      God, church           0.06     0.08
Death                         Dead, cemetery        0.09     0.01

Debate language

October 15, 2008

 

by James W. Pennebaker

I will try to post the language variables of tonight’s third debate as soon as the transcripts are available.  In the meantime, several comments posted in the last 24 hours point to some misunderstandings:

Speech writers, trainers, and natural language.  Some people have noted that we can’t determine if the language used by the candidates reflects their speech writers or the candidates themselves.  Indeed, that is why we try to analyze only unscripted language.  Debates are particularly good for this.  Yes, all the candidates slip into canned phrases with some frequency but, in general, they are likely using more of the words they would naturally use than not.

Some of our other analyses compare the ways candidates talk in one-on-one interviews with debates as well. In general, candidates are fairly consistent in the ways they use words across these contexts.  Obama and Biden tend to be more consistent than McCain and Palin but the differences are not striking.  See the previous posts by Molly Ireland on this topic.

First person singular pronouns: I versus my versus me.  The use of first person singular is psychologically fascinating.  When people are engaged in everyday normal conversations, they use the word “I” at quite high rates (about 6% of all words) compared with “me” (about 0.5%) or “my” (about 0.7%).  A couple of people have noted that McCain obviously uses first person singular pronouns at such high rates because of his use of “my friends”. 

It’s true that McCain has been using “my” at higher rates than Obama across the two debates, but he has also been using “I” at elevated frequencies.  This has been true for both candidates in interviews and debates for the entire election season.  The average pronoun use for the two candidates across the first two debates (as a percentage of total word usage) is as follows:

Candidate   “I”     “me”    “my”
McCain      2.67    0.19    0.48
Obama       1.86    0.14    0.15

It should be noted that the relative rates of all the pronouns were virtually identical from the first to the second debate except for McCain’s higher use of “my” in the second debate (0.20 in Debate 1 and 0.71 in the second).
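For anyone curious how numbers like these are produced, each one is simply a word's share of all the words a speaker used. The sketch below is our own minimal illustration, not the LIWC program: the tokenizer is a simple regular expression, the sample sentence is invented, and contractions such as “I’m” are not counted as “I” here.

```python
import re


def word_rates(text: str, targets: list) -> dict:
    """Return each target word's share of all words, as a percentage."""
    # Normalize curly apostrophes, then split into lowercase word tokens.
    words = re.findall(r"[a-z']+", text.lower().replace("’", "'"))
    total = len(words)
    if total == 0:
        raise ValueError("text contains no words")
    return {t: 100.0 * words.count(t) / total for t in targets}


# Invented sample text, not a quotation from either candidate.
sample = "I think my friends know me. I believe we can fix this."
rates = word_rates(sample, ["i", "me", "my"])

for word, rate in rates.items():
    print(f"{word}: {rate:.2f}% of all words")
```

On a twelve-word sample the rates are far higher than anything in the table; over the thousands of words in a full debate transcript, the same calculation produces figures on the scale shown above.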

What does it mean if a person uses high rates of I versus me?  One of the founders of modern psychology, William James, made the strong assertion that the use of “I” implied a self in control whereas the use of “me” suggested the self was being acted upon by others.  This, of course, makes perfect sense.  Empirically, it’s probably wrong.  People who are depressed, lower in status, and lower in self-esteem consistently use “I” at higher (not lower) rates than non-depressed, high status, and self-assured people.  Ironically, the use of “me” doesn’t seem to be related to any of these qualities — or any qualities that we have studied so far.

It would be misleading to think that the use of “I” always signals depression and low self-esteem.  People who use “I” tend to be more honest and are often more socially sensitive.  They are more likely to say “I think it’s cold outside” instead of “It’s cold outside.”  People who say phrases such as “I think” and “I believe” are subtly indicating that they are aware that other perspectives exist and that theirs is only one of many.
