After a couple of weeks running around the country, which included Fluent, visits with some folks at Matter and CIR and friends at CFA, and time with family, I’m back in the swing of things over at AxisPhilly. I’ve been thinking about how we come up with ideas and stories to pursue, and how we come up with ideas about data.
Here’s a pretty common scenario I see in open data:
Someone sent us this awesome data set! Uh, now let’s do something with it(?)
Or maybe your agency is interested in opening its data. Or maybe you’re a concerned citizen demanding more transparency. But data without a good question is quite useless.
The key thing to remember here is to find the right question first, rather than letting the data dictate the question.
To illustrate these scenarios, I’ll talk about weevils, because hopefully I won’t have to do weevil data stories anytime soon (although WNYC got close).
Say someone gave us a data set that was a count of the weevils for the past five years. Do we have questions that we want to ask about weevils? Does it matter to our audience? How is this relevant to their daily lives? Do gardeners have super strong opinions about weevils? Does the city spend an inordinate amount of resources on weevil abatement, and are those techniques efficient enough to help the weevil/community relationship? How do weevils feel about this? (Okay, not the last one, but you get the idea.)
When you start FROM the data set, the key is to step back from the data set itself and ask good questions.
If the data helps answer those questions, great. But mapping or visualizing a data set without a good question isn’t a good start. Mapping or visualizing can, however, help you think of good questions. Are there interesting clusters in the data? Why are they there? Are they easily explained, or would explaining them help someone?
Do you work for an agency that’s pursuing open data? How are you engaging the community around your data, getting them invested in it, and asking good questions? If you make stories or tools around open data, are you creating “explorers,” or are you asking good questions that make the data greater than the sum of its parts?