Spinning the Election
David Skillicorn and Ayron Little, School of Computing, Queen's University.
With the Canadian election coming up, we thought we'd apply the techniques we've developed for analysing how mental states leak into speech and text to the party leaders themselves. How are they doing? Can we trust them? Here's a look at how software analysis can help you decide.
James Pennebaker, from the University of Texas at Austin, developed a model for deception in text. The model is based on the relative frequency of certain kinds of words.
The Pennebaker model predicts that deceptive text will be marked by:
- a decreased frequency of first-person singular pronouns ("I", "me", "my");
- a decreased frequency of exclusive words ("but", "except", "without");
- an increased frequency of negative-emotion words ("hate", "worthless", "enemy"); and
- an increased frequency of action verbs ("go", "carry", "run").
We applied this model to a large collection of emails to and from Enron employees in the three and a half years before the collapse of the company. We discovered that, while the model detects deception, it also detects other kinds of unusual text. The best way we know to categorise what the model detects is spin -- text whose apparent meaning does not reflect the true beliefs of the person saying or writing it.
Politicians, above all, serve multiple masters and may find themselves saying things that are not quite what they would say if they could speak their minds. We've applied this model to speeches given by the three English-speaking party leaders: Paul Martin, Stephen Harper, and Jack Layton. We rank their speeches by how well they fit the model -- by how much spin the speeches contain.
We collected 20 speeches by Stephen Harper (numbered 1 to 20); 20 speeches by Paul Martin (numbered 21 to 40); and 10 speeches by Jack Layton (numbered 41 to 50). Since the model was developed for English texts, we could not apply it to speeches by the Bloc.
The table below lists the speeches in increasing order of spin. Speeches are colour coded: blue for Conservative, red for Liberal, and orange for NDP. You can click on the links to see the text of each speech -- see if you agree with its spin rating.
Average spin score per speech: Liberals: 124; Conservatives: 73; NDP: 88.
To produce these results, we counted the frequency of 88 words derived from Pennebaker's deception model, and divided the counted frequencies by the length of the speech in which they appeared (so that long speeches did not appear more spinful just because they were longer). The columns of the matrix were zero-centred, and those columns corresponding to words where a reduced frequency is significant were negated (reversed around the origin). A singular value decomposition was used to create a perceptual space for both speeches and words.
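The pipeline described above can be sketched in a few lines of numpy. This is a hypothetical illustration only: the word list, the speech counts, and the choice of which columns count as reduced-frequency markers are placeholder data, not the authors' actual 88-word model.

```python
import numpy as np

rng = np.random.default_rng(0)
n_speeches, n_words = 50, 88  # 50 speeches scored against 88 model words

# Placeholder raw counts and speech lengths (stand-ins for the real data).
counts = rng.integers(0, 20, size=(n_speeches, n_words)).astype(float)
speech_lengths = rng.integers(500, 3000, size=n_speeches).astype(float)

# 1. Normalise counts by speech length so long speeches don't look
#    more spinful just because they are longer.
freqs = counts / speech_lengths[:, None]

# 2. Zero-centre each column (word).
centred = freqs - freqs.mean(axis=0)

# 3. Negate the columns where a *reduced* frequency is the significant
#    signal, so that "more spin" always points away from the origin.
#    (Which columns these are is an assumption here.)
reduced_cols = np.arange(0, 30)
centred[:, reduced_cols] *= -1

# 4. Singular value decomposition: rows of U scaled by the singular
#    values give speech coordinates in the perceptual space.
U, S, Vt = np.linalg.svd(centred, full_matrices=False)
k = 2  # keep two dimensions for plotting
speech_coords = U[:, :k] * S[:k]

# 5. Spin score: distance of each speech from the origin.
spin_scores = np.linalg.norm(speech_coords, axis=1)
ranking = np.argsort(spin_scores)  # speeches in increasing order of spin
```

The same decomposition also yields coordinates for the words themselves (from the right singular vectors), which is what makes a matching word-space plot possible.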
This figure shows the perceptual space for speeches.
The further a point is from the centre point (origin) of this plot, the more spinful the speech. The spin scores in the table above are based on these distances. Notice, though, that directions matter too: most of the highest-spin speeches lie in the left-hand region of the plot.
We can learn about the role of the words in the deception model by considering a perceptual space of words, like this:
Words that appear close together play a similar role. Words that are plotted in different directions play a different kind of role. Notice that the four different kinds of words in the deception model don't appear in four different directions, indicating that the model is describing something more than just four distinct aspects of deception. As before, the further a word is plotted from the centre of the plot, the more important it is.
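The word space described above comes from the same SVD, via the right singular vectors. The sketch below is self-contained with random placeholder data standing in for the real centred matrix; the `similarity` helper is our own illustrative addition, not part of the authors' method.

```python
import numpy as np

rng = np.random.default_rng(1)
centred = rng.standard_normal((50, 88))  # speeches x words, already centred

U, S, Vt = np.linalg.svd(centred, full_matrices=False)
k = 2
word_coords = Vt[:k].T * S[:k]  # one row of coordinates per word

# Words far from the origin matter most in the model.
importance = np.linalg.norm(word_coords, axis=1)

# Words plotted in similar directions play similar roles; cosine
# similarity of their coordinate vectors captures this.
def similarity(i, j):
    a, b = word_coords[i], word_coords[j]
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```

Two words with similarity near 1 sit in the same direction from the origin and play a similar role; similarity near -1 means opposed roles.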
We've used these ideas in areas such as organizational behaviour, fraud detection, detecting deception, and detecting potential terrorist communication. You can find out more at David Skillicorn's web page.
David Skillicorn is a Professor at Queen's. Ayron Little did this work as a fourth-year student, and went on to look at deception in testimony to the Gomery Commission in her Master's research. We do not yet completely understand these models of word use, so the results here should be taken with a grain of salt. An earlier version of this page had the speeches in the reverse order.