All posts by Amanda Hickman

Spreadsheet Skills

See Russell’s posts on spreadsheets and pivot tables for a complete walkthrough of our in-class exercises.

We looked at formulas to find the =AVERAGE() and =MEDIAN() of a range of cells, working with MLB salary data.
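If you want to double-check a spreadsheet's answer outside the spreadsheet, here is a minimal Python sketch of what =AVERAGE() and =MEDIAN() compute. The salary figures are invented for illustration; they are not the MLB data we used in class.

```python
from statistics import mean, median

# Hypothetical salaries (made-up numbers, not the class data set).
salaries = [500_000, 750_000, 1_200_000, 2_000_000, 25_000_000]

average_salary = mean(salaries)    # the same calculation as =AVERAGE()
median_salary = median(salaries)   # the same calculation as =MEDIAN()

print(f"Average: {average_salary:,.0f}")
print(f"Median:  {median_salary:,.0f}")
```

With numbers like these, a single very large salary pulls the average well above the median, which is why it is worth looking at both.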

We downloaded Google’s data on flu-related searches. We used data > text to columns ... to break the text block into columns, and then used some Excel formulas to find =MAX() values.
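The same two steps, splitting text into columns and then finding the largest value, can be sketched in Python. This is only an illustration: the column names and numbers below are invented, and the real download may be laid out differently.

```python
import csv
import io

# Hypothetical comma-separated text (invented values, not Google's figures).
raw_text = """week,searches
2013-01-06,5100
2013-01-13,6400
2013-01-20,5900
"""

# Split each line into named columns, roughly what text to columns does.
rows = list(csv.DictReader(io.StringIO(raw_text)))

# Find the row with the largest "searches" value, the equivalent of =MAX().
peak = max(rows, key=lambda row: int(row["searches"]))

print(peak["week"], peak["searches"])
```

The values come out of the file as text, so the sketch converts them with int() before comparing them.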

We didn’t use the auto filter button to set up filters that let us sort the data by a single column, but we could have. We also didn’t use paste special > transpose to shift rows into columns, but we could have, if we wanted to.
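For the curious, sorting by a single column and transposing a table look roughly like this in Python. The table is a made-up example, and this is a sketch of the idea rather than what Excel does internally.

```python
# Hypothetical table: a header row followed by data rows (invented values).
table = [
    ["team", "payroll"],
    ["A", 90],
    ["B", 210],
    ["C", 150],
]

data = table[1:]  # everything except the header row

# Sort by a single column (payroll, largest first).
by_payroll = sorted(data, key=lambda row: row[1], reverse=True)

# Transpose: each row of the original becomes a column of the result.
transposed = [list(column) for column in zip(*table)]

print(by_payroll)
print(transposed)
```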

We used data Matt got from iCasualties.org to play with pivot tables, in Google Spreadsheets and Excel.
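A pivot table is a grouped summary: pick one field for rows, one for columns, and add up a value for every combination. Here is a minimal Python sketch of that idea. The field names and counts are invented placeholders, not the iCasualties.org data.

```python
from collections import defaultdict

# Hypothetical records (invented values for illustration only).
records = [
    {"year": 2009, "region": "North", "count": 3},
    {"year": 2009, "region": "South", "count": 1},
    {"year": 2010, "region": "North", "count": 4},
    {"year": 2010, "region": "North", "count": 2},
]

# Rows are years, columns are regions, cells hold the summed counts.
pivot = defaultdict(lambda: defaultdict(int))
for record in records:
    pivot[record["year"]][record["region"]] += record["count"]

for year in sorted(pivot):
    print(year, dict(pivot[year]))
```

The two 2010 records for the same region collapse into one cell, which is the work the spreadsheet's pivot table does for you.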

Homework Week 2 (Due Sep 18)

You have two assignments: we want you to answer a handful of specific questions using data that Slate published alongside How Many People Have Been Killed by Guns Since Newtown? and we want you to download the CDC’s data on firearm homicides and find something of your own in that data. If you’re smart, you’ll also use the extra week to start thinking about stories you’d like to work on for your team assignments and figuring out what kind of data is available for those stories.

How to Find Data

It isn’t hyperbole: journalists today have access to more data than ever before. And we have better tools available to digest and display that data. So where can you look for reliable data?

Some basic strategies that work pretty well:

  • Google it. That’s never a bad place to start and it only takes a second.
  • Figure out who should have the data. Who might have it? Is this information that only the NYPD or the IRS can collect? The Departments of City Planning, Buildings, Housing, Finance and Taxation all keep tabs on who owns property in New York City, where that property is located and what it can be used for. If you know who ought to have the numbers you’re looking for, you can start your search by asking them.
  • Look at recent reporting about the subject. Who has been releasing reports? Who has been cited in stories? Go ask them for data, or ask them for help finding it.
  • Wikipedia is a fantastic resource. Don’t be afraid of it. Most information there comes with a citation — don’t take some Wikipedia author’s word for it, but do look at the source they cited and confirm that the numbers are there.
  • Look for think tanks and aid organizations that specialize in the issue you’re interested in.
  • Ask a librarian.

Know your sources

You can get data anywhere, so it is up to you to decide whether or not you’re working with reliable data. You should know where your sources are coming from — do they have an agenda that can help you understand how they’re framing the data they put out? You can roughly guess who is behind NRA Institute for Legislative Action, but what about Law Center to Prevent Gun Violence? Don’t assume that a think tank is reliable just because it kind of feels professional.

Provenance

It is also up to you to know where your data is coming from. Did the organization hire a research firm to conduct a comprehensive study? Or did they post a little box on their website asking visitors how they feel?

Be skeptical: an advocate (or government agency) insisting that these numbers mean something doesn’t make it so.

Where to look?

The Journalism School’s Research Center maintains an excellent roundup of guides, many of which will point you to great data sets. Check out the census, business and crime guides in particular.

NICAR’s database library is a great resource. So is the “data sources” tag on Amanda’s Tumblr.

And … I started a wiki page with notes from this week: https://github.com/amandabee/cunyjdata/wiki/Where-to-Find-Data

Welcome!

Welcome to Data-driven Interactive Storytelling, Jour72312, or Data Viz! We’re looking forward to starting the semester and meeting each one of you.

Before our first meeting, please watch the following chapters of Journalism in the Age of Data, a video report on data visualization by Geoff McGhee at http://datajournalism.stanford.edu/

  • Chapter 2: Data Vis in Journalism
  • Chapter 3: Telling “Data Stories”
  • Chapter 6: Exploring Data

The total running time for these three chapters is 23 minutes.

Syllabus

Take some time to view Geoff McGhee’s Knight Fellowship video report on Data Journalism (http://datajournalism.stanford.edu/). It’s about 54 minutes, and it’s available in many different formats for viewing on desktop or mobile. It’s a nice overview of the current discipline.

Additional suggested reading:
+ Introduction to the Data Journalism Handbook
+ Paul Bradshaw’s How to be a data journalist on the Guardian blog
+ Alastair Reid’s post on How to: get started in data journalism

Also, start thinking about what you might want to do for your first project. I know, the course hasn’t even started and you may not have a clue where to start. That’s okay. But if there’s some data that you’re interested in, or you have some ideas that have a basis in data that you want to explore, bring them to the first class.

NY Times on Economic Mobility and Location

In Climbing Income Ladder, Location Matters (New York Times; July 22, 2013)

A study finds the odds of rising to another income level are notably low in certain cities, like Atlanta and Charlotte, and much higher in New York and Boston.

Kevin Quealy wrote a short post (not a how-to, though) about this story. Some of the charts, especially the one on Charts’n’Things, are fairly complex. I think the team (and it was a big team — check the credits) did a great job of organizing the data and making it approachable, though it bothers me that the map uses inconsistent buckets.

They use a combination of HTML, JavaScript and D3 to build this out.