Wednesday, November 9, 2011

Presidential Candidate Poll: How to Fail in Visualizing the Results

How difficult can it be to produce a simple bar graph out of poll results? For the Finnish media corporation MTV3, it seems to be very challenging. As I am currently studying, among other things, information visualization at the University of Michigan, I could not resist the temptation to analyze some of the problems in one of their visualizations.

On Wednesday, November 9, MTV3.fi published an article "Niinistö still superior candidate, others far behind" (translation mine), with the most recent poll results. The poll was conducted by telephone by Research Insight Finland. The exact number of people interviewed was not revealed in the article, but it was "a bit more than one thousand" (N=1002, as revealed in the other graph attached to the article). The margin of error was ±3.1 %.
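As a quick sanity check, the reported margin of error can be reproduced from the sample size. The sketch below assumes simple random sampling, a 95 % confidence level (z = 1.96) and the worst-case proportion p = 0.5; the article does not say how the figure was actually computed, so these are my assumptions.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of the confidence interval for a proportion.

    Assumes simple random sampling; p=0.5 is the worst case and
    z=1.96 corresponds to a 95 % confidence level (both assumed).
    """
    return z * math.sqrt(p * (1 - p) / n)

moe = margin_of_error(1002)
print(f"±{moe * 100:.1f} %")  # ±3.1 %
```

With N=1002 this gives the same ±3.1 % that the article reports, so the figure is at least consistent with the stated sample size.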

The original graph is below. It has been copied from the article, so if it is fixed later, those revisions will not be visible here.

General


Just a couple of observations about the graph in general. An image cannot contain links, so the text "suurenna kuva" ("magnify the image") is not a link. It simply should not be there. The title of the graph is differentiated from the other one(s) only by the addition of "Alue" ("Area") at the end. Why not "alueittain" ("by area")? Or, if they are using some formal way of dividing the country into regions (see below), they could state it here.

Bars and data


The scale of the graph goes from 0 to 100. What are the units? We can guess they are supposed to be the percentage of respondents choosing a specific option in the poll... What is the point of extending the scale up to 100 %? The most popular choice is still less than 50 % and because of this scale, the differences between the less popular candidates are impossible to see. The labels for the bars have no decimals, so Paavo Arhinmäki's popularity data per province is just "1, 1, 1, 2, 1". What's even more worrying: the bars are based on results after rounding them to the nearest integer, not on the actual values!
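To illustrate what rounding before plotting does to small values, here is a minimal sketch. The unrounded numbers are invented for illustration only (the article does not publish them); they are merely chosen so that they round to the "1, 1, 1, 2, 1" shown in the graph.

```python
# Hypothetical unrounded percentages, NOT the real poll data.
hypothetical = [0.6, 1.4, 1.2, 1.6, 0.9]

# What the graph shows: each value rounded to the nearest integer.
plotted = [round(value) for value in hypothetical]
print(plotted)  # [1, 1, 1, 2, 1]

# Values that end up drawn as identical bars of height 1.
same_bar = [v for v, shown in zip(hypothetical, plotted) if shown == 1]
ratio = max(same_bar) / min(same_bar)
print(f"Bars drawn as equal can differ by a factor of {ratio:.1f}")
```

In this made-up case, two bars of identical height hide values that differ by more than a factor of two. Whether such differences mean anything given the margin of error is a separate question, but the graph should not be the one deciding that.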

It could be said that the bars fail to give any real insight into the results because of these mistakes. But wait! There is an even more severe error below them. In the poll there were ten options, but there are only nine labels in this graph. The labels are evenly spread, so they are also misaligned. The names of the least popular candidates are especially difficult to associate with the related bars - does Sari Essayah have a popularity of 1 % or 19 %? The mistake: they forgot to put a label for "undecided"!
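The misalignment can be worked out with simple arithmetic. The sketch below assumes that the ten bar groups and the nine labels are each spread evenly across the same axis width, which is my reading of the image rather than something the article states.

```python
N_GROUPS = 10  # options in the poll, i.e. groups of bars
N_LABELS = 9   # labels actually printed under the axis

def center(index, count):
    """Center of slot `index` when `count` slots share a unit-width axis."""
    return (index + 0.5) / count

def label_offsets(n_groups=N_GROUPS, n_labels=N_LABELS):
    """How far each label sits to the right of its own bar group,
    measured in bar-group widths."""
    return [
        (center(j, n_labels) - center(j, n_groups)) * n_groups
        for j in range(n_labels)
    ]

offsets = label_offsets()
for number, offset in enumerate(offsets, start=1):
    print(f"label {number}: {offset:.2f} group widths to the right")
```

Under these assumptions the first label is only 0.06 group widths off, but the drift grows steadily, and the ninth label ends up 0.94 group widths to the right, almost directly under the tenth group of bars. That is why the last names are so hard to match to their bars.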

Also: if the question is "Who would you vote for if the elections were held right now?", the answer must be in partitive form - Sauli Niinistöä, Timo Soinia, Paavo Väyrystä (his name is misspelled in the labels, with a lowercase v!) and so on. "En ketään ylläolevista" ("None of the above") is in the correct form, but there are no names above it (only on the left), so which names does it refer to? As this poll was conducted by telephone, how did the interviewees know which names were above this option and which were in other directions?

Legend


In the legend on the right, the explanations for the five bars of different color are given. For the purposes of this graph, the results of the poll were divided geographically along the administrative provinces - lääni in Finnish. Here lies one problem: there are no provinces in Finland any more! In this graph, the division is (probably) based on the provinces in use between 1997 and 2009 - except that there were six of them, not four. Three of them are named in the legend, and all three are misspelled:

  • Eteläsuomen Lääni (Province of Southern Finland - should be spelled as Etelä-Suomen lääni)
  • Länsisuomen Lääni (Province of Western Finland - should be spelled as Länsi-Suomen lääni)
  • Itäsuomen Lääni (Province of Eastern Finland - should be spelled as Itä-Suomen lääni)
  • Pohjois-Suomi (Northern Finland - there has never been a province called this, this is probably Provinces of Oulu and Lapland combined)
  • (The Province of Åland Islands is completely missing, probably combined with the first one)

Also, "Total" is not Finnish. It should be "Yhteensä".

I am, of course, only listing the mistakes I spotted. I think the colors are alright, and after correcting these mistakes, it would not be a bad bar plot at all. Maybe photos of the candidates would make it easier to associate a candidate with the bars.

2 comments :

  1. There are more bars than choices.

  2. Classic issue of stuffing too much data into a single graph. Fail.
