As everyday computing power increases, data becomes more available, and more researchers become familiar with powerful programming languages like Python and R, data visualization will continue to become a larger and larger part of social science. I hope that not too far in the future, pages like this one will seem quaint (maybe they already do) and visualization will be a standard part of methodological training in many social science graduate programs. We are not quite there yet, though. Here, I have compiled a list of resources for making visualization work easier.

Books and the like

Kieran Healy’s book is a wonderful reference and large sections of it are available free online.

The “Data visualization” chapter of R for Data Science is very helpful. So is the chapter on “Graphics for communication.”

Although not on visualization per se, Matt Salganik’s book on social science in the digital age is also a great resource for those interested in “big data” and the like.

ggplot2

Anyone who programs in R knows ggplot2, which is part of the tidyverse family of packages. If you do not use R but want to learn, it helps to know that ggplot2 is the de facto foundation of most data visualization in R. It is flexible and powerful. Once you begin to understand it, you will deeply appreciate it, but, as with R in general, the learning curve can be a bit steep. ggplot2 supports everything from visualizing spatial data, to using forest (i.e. dot-and-whisker) plots to visualize regression results, to generating alluvial plots to show flows and time trends.

For example, I get a kick out of the alluvial plot below, which I made with ggplot2. It shows the rank and proportion of survey respondents who mentioned different issues as “the most important problem” in the United States from 1939 to 2015. Data are taken from the Roper Center Most Important Problem Dataset using all supplied sources, not just the Gallup data. “Environment” is highlighted in dark gray. Click the image for a larger view (it’s big!).

“Most important problem,” United States, 1939-2015.

Color schemes

colorbrewer – Ideal for generating low-n color schemes.

iwanthue – Ideal for generating high-n color schemes (e.g. 10+ colors). Hosted and developed by the MediaLab at SciencesPo, which is also home to a number of other useful tools like Table 2 Net, for drawing graphs (networks) from tables of data.

htmlcolorcodes – A simple site for looking up HTML hex color codes.

8-digit hex codes – Rather than designating an “alpha” (a transparency factor) for a plot, it can sometimes be useful to specify transparency directly in specific color codes, e.g. when you want to highlight a specific line or trend in a plot with many lines and trends. You can do this with 8-digit hex codes. (See also here.)
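As a rough illustration of the 8-digit idea, here is a minimal Python sketch that appends a two-digit alpha channel to an ordinary 6-digit hex code, producing a string of the form #RRGGBBAA. The helper name with_alpha, the base color, and the series labels other than “Environment” are all placeholders I made up for the example; they are not part of any plotting library.

```python
def with_alpha(hex_color: str, alpha: float) -> str:
    """Append an alpha channel to a 6-digit hex color, giving #RRGGBBAA.

    `with_alpha` is a hypothetical helper written for this example.
    alpha runs from 0.0 (fully transparent) to 1.0 (fully opaque).
    """
    digits = hex_color.lstrip("#")
    if len(digits) != 6:
        raise ValueError("expected a 6-digit hex color such as '#1B9E77'")
    int(digits, 16)  # raises ValueError if the string is not valid hex
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    # Scale alpha to 0-255 and write it as two uppercase hex digits.
    return f"#{digits.upper()}{round(alpha * 255):02X}"


# Placeholder series labels: one highlighted line, the rest faded.
series = ["Economy", "Environment", "Crime"]
colors = {
    name: with_alpha("#4D4D4D", 1.0 if name == "Environment" else 0.2)
    for name in series
}
print(colors)
```

The resulting strings can be passed wherever a plotting tool accepts a color. Both R and matplotlib read #RRGGBBAA strings, so the faded and highlighted lines can share one base color and differ only in their last two digits.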