This week we are going to explore some dos and don’ts of data visualization as you prepare for your final projects. Edward Tufte is widely considered one of the world’s leading data visualization gurus, and has been called everything from the “Leonardo da Vinci of data” to the “Galileo of graphics.” Tufte will be our guide as we think through what good visualizations say and how bad data displays can lie, distort, or even undermine your intended argument.
The Minard Map
It may well be the best statistical graphic ever drawn.
—Edward Tufte, The Visual Display of Quantitative Information (1983)
The Classic Discussed
- Michael Sandberg’s deep dive on the DataViz History blog
- Another discussion of the classic by John Corbett
Static variants (Do they add anything?)
- Kraak’s variants from his 2013 book Mapping Time
Interactive variants (Does it help to be able to manipulate the map?)
- A custom-coded instance at http://www.masswerk.at/minard/
- An instance using the Neatline plugin for Omeka
- And the developer’s discussion of his process on the Scholars’ Lab blog
Questions
- Why is this considered such a landmark visualization, if not the best ever?
- What are the key features that make it stand out?
- How would you improve on it, if you were to take a stab?
Keeping it Honest: How Not to Lie with Pictures
This may well be the worst graphic ever to find its way into print.
—Edward Tufte, The Visual Display of Quantitative Information (1983)
We’ve already discussed how not to lie with maps, but it is just as easy to lie with other visualizations. One of the biggest issues that Tufte stresses in his seminal work is how to stay honest with infographics. One of the easiest errors to make, for instance, is to scale the radius of a circle, or the height of a two-dimensional shape whose width grows along with it, in direct proportion to the data. Because area grows with the square of those lengths, the reader sees a far larger difference than your data actually warrants.
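To make the circle problem concrete, here is a minimal Python sketch. The two data values and the helper names (`naive_radius`, `honest_radius`) are hypothetical, chosen only for illustration. It compares the areas you get when the radius tracks the data directly with the areas you get when the radius tracks the square root of the data.

```python
import math

def circle_area(radius):
    """Area of a circle with the given radius."""
    return math.pi * radius ** 2

def naive_radius(value):
    """Misleading: radius proportional to the data value."""
    return value

def honest_radius(value):
    """Area-true: radius proportional to the square root of the value."""
    return math.sqrt(value)

# Hypothetical data: the second value is twice the first.
small, large = 10.0, 20.0

# Ratio of drawn areas under each scaling rule.
naive_ratio = circle_area(naive_radius(large)) / circle_area(naive_radius(small))
honest_ratio = circle_area(honest_radius(large)) / circle_area(honest_radius(small))

print(f"data ratio:            {large / small:.2f}")
print(f"area ratio (radius):   {naive_ratio:.2f}")   # grows with the square of the data ratio
print(f"area ratio (sqrt):     {honest_ratio:.2f}")  # matches the data ratio
```

Under these assumptions, doubling the data roughly quadruples the drawn area when the radius is scaled directly, while the square-root rule keeps the drawn area in step with the data.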
- Explore this gallery of images illustrating “The Lie Factor”
- Think about how the literal measurements of the images contradict the implicit argument the graphic is trying to make.
- Explore some more of Michael Friendly’s gallery of the Best and the Worst of Statistical Graphics
- What mistakes did you not think of before that you might want to avoid?
- What examples might you like to emulate for your own projects?
- Why?
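Tufte defines the “Lie Factor” as the size of the effect shown in the graphic divided by the size of the effect in the data, where each effect is the relative change between two values. The sketch below, in Python, applies that definition to made-up numbers; the function names and the sample measurements are assumptions for illustration, not figures taken from the gallery.

```python
def effect_size(first, second):
    """Relative change from the first value to the second."""
    return (second - first) / first

def lie_factor(data_first, data_second, graphic_first, graphic_second):
    """Effect shown in the graphic divided by the effect in the data.

    A value near 1 means the graphic represents the data faithfully;
    values well above 1 mean the graphic exaggerates the change.
    """
    return effect_size(graphic_first, graphic_second) / effect_size(data_first, data_second)

# Hypothetical case: the data double (10 -> 20), but the drawn
# areas quadruple (100 -> 400 square units).
exaggerated = lie_factor(10, 20, 100, 400)

# Same data, drawn with areas that also double (100 -> 200).
faithful = lie_factor(10, 20, 100, 200)

print(exaggerated, faithful)
```

When you measure a graphic from the gallery, substitute the printed lengths or areas for the graphic values and the numbers they claim to represent for the data values.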
Tools
Google Motion Charts (Gapminder)
One of the most impressive data visualization breakthroughs of recent years was Hans Rosling’s invention of Gapminder: an application that really unleashed the “fourth dimension” of time and allowed data to be animated in an immediately understandable and powerful way. His TED talk below illustrating global health data with the tool is legendary.
Google bought the technology and made it available for all to use as Motion Charts.
Resources
We’ve already explored some visualization environments, but here are two more very impressive tools to check out:
Choose one and check it out to see what people are creating. Download or join a service and see if you can create something.
- Upload some of your own data, or download data sets from around the web
- One interesting source is JSTOR for Research, which aggregates the scholarly literature housed by JSTOR