Over the last few months, I've gotten a handful of emails asking how I collect and visualize data for the posts on Intersectional Analyst! Through these emails, I learned that a lot of folks felt data visualization was outside of their skill set because they didn't have a coding background. Luckily, there are a number of user-friendly resources for visualizing data sets. Many of them don't require any coding experience, and they aren't too tricky to figure out as long as you have a handle on how basic spreadsheets work! I wanted to share some of the resources I've used and/or come across in the last few years here.
Scraping Data
Data scraping is the process of having a computer take data from one place and make it readable in another format or location (e.g. taking data off a table on a website and converting it into a .csv file that you can read in Excel or a stats program like R).
Tabula: useful for extracting table data from PDFs. (Note to anyone sharing data publicly: avoid putting it in a PDF! Data in PDFs is not very machine-readable, so opt for a spreadsheet or .csv file!)
Data Scraper for Google Chrome: takes data from the web and puts it in a downloadable table (.csv). This may not work for all webpages, and you will need to clean up the dataset manually after you import it into a program like Google Sheets. This is the tool I used to scrape data from the NYT online recipes database for the Food, Race, and Power piece.
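If you're curious what a scraper is roughly doing behind the scenes, here is a minimal Python sketch that reads a table out of a snippet of HTML and turns it into .csv text. To be clear, this is not how Tabula or Data Scraper actually work, and it isn't the code behind any post on this blog. The names TableParser, table_to_csv, and SAMPLE_HTML are made up for illustration, the sample table contains placeholder data, and real webpages are usually much messier than this.

```python
import csv
import io
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Hypothetical helper: collects the text of each table cell, row by row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # the row currently being built
        self._cell = None   # text pieces of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_csv(html):
    """Return the table rows found in `html` as CSV text."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


# Placeholder data, not taken from any real website.
SAMPLE_HTML = """
<table>
  <tr><th>Recipe</th><th>Cuisine</th></tr>
  <tr><td>Lentil soup</td><td>Made-up example</td></tr>
  <tr><td>Rice, plain</td><td>Made-up example</td></tr>
</table>
"""

if __name__ == "__main__":
    print(table_to_csv(SAMPLE_HTML))
```

Notice that the cell "Rice, plain" comes out wrapped in quotes in the .csv text, because it contains a comma. Small details like that are part of why scraped data usually needs some manual cleanup once it's in a spreadsheet.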
Visualizing Data & Information
During my grad school days, I mostly used R to plot my data. R is a free and powerful program, but it does require writing some code. For data sets that don't need that much computational processing, it's possible to create aesthetically pleasing and interactive charts with a number of free, user-friendly platforms. In fact, most of the interactive plots on this blog are made using Google Sheets.
Google Sheets: free, online spreadsheet program that allows you to create interactive charts that you can embed on websites
Plot.ly: another spreadsheet program that allows you to create interactive graphics. It can be used with R and other data processing programs as well. I've never used this one before, but have heard good things about it!
Carto: useful for plotting spatial/map data
Canva: has pre-made templates for creating different types of infographics and charts
Tableau: has a number of capabilities but features are limited for free users.
Timeline.js: an open source tool developed by the Knight Lab at Northwestern University for making visually engaging, interactive timelines. I haven't used this resource yet but it looks awesome and I hope to soon!
Places to Explore Examples of Data Journalism
Data Journalism Awards 2016 Shortlist - I first sifted through the data journalism awards in 2014, during my first (unhappy) year of grad school. I was so inspired by how data visualization and analysis could be used in such a socially impactful way, and how I could blend my quantitative interests with storytelling! The data journalism awards are a great place to explore the different ways data can be used for reporting and exploring issues more deeply!
Also check out ProPublica, which does high-quality investigative reporting, often driven by data; FiveThirtyEight (if you're interested in following sports and politics in particular); and The Pudding (from Polygraph)!
Places to Explore Examples of Information Visualization
Flowing Data and Information is Beautiful
Have you made something cool with these resources? I'd love to hear about it!