Data, data everywhere, but what does it all mean?
At Willow, we often work with organizations that collect a lot of data about their customers, members, or donors, and the rise of software that makes this data collection easier means there’s simply more data being collected than ever before.
Popular narratives about the power of data and its ability to unlock previously hidden insights generate a lot of enthusiasm for existing company or organization databases, but it’s often difficult to know how to make the best use of all this data.
More is not necessarily better.
When we need to rely on internal client data in our own work, one of the first things we do is ask what we believe are critical questions about the existing database.
When we’re reviewing a database, we start by examining the origins of the data, and—given those origins—who and what is represented by that data.
For example, data that is automatically generated via a transaction is different from data entered by people, who may interpret similar or even identical events differently. For the latter, we have to be cautious about the consistency of the data. For the former, we have to understand the limits of what has been (or can be) gathered.
In addition, a company or organization may have decades of data, but over time, the information collected and how it is prioritized may have changed.
These initial considerations become particularly important when we’re trying to look at changes over time, because mismatched data points can make for faulty comparisons.
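One way to spot mismatched data points before comparing periods is to check how completely each field was filled in from year to year. The snippet below is a minimal sketch in plain Python, not a description of our actual tooling. The record layout and the field names (`year`, `gift_amount`, `email`) are hypothetical and stand in for whatever a real database contains.

```python
from collections import defaultdict

def coverage_by_year(records, fields):
    """Return {year: {field: share of records with a non-empty value}}."""
    totals = defaultdict(int)
    filled = defaultdict(lambda: defaultdict(int))
    for rec in records:
        year = rec["year"]
        totals[year] += 1
        for field in fields:
            # Treat None and empty strings as "not collected".
            if rec.get(field) not in (None, ""):
                filled[year][field] += 1
    return {
        year: {f: filled[year][f] / totals[year] for f in fields}
        for year in sorted(totals)
    }

# Hypothetical donor records, invented purely for illustration.
records = [
    {"year": 2005, "gift_amount": 50, "email": ""},
    {"year": 2005, "gift_amount": 20, "email": None},
    {"year": 2020, "gift_amount": 75, "email": "a@example.org"},
    {"year": 2020, "gift_amount": None, "email": "b@example.org"},
]

coverage = coverage_by_year(records, ["gift_amount", "email"])
for year, shares in coverage.items():
    print(year, shares)
```

A field that is rarely filled in for early years but almost always present in later ones is a sign that a comparison across those years may be measuring a change in record-keeping rather than a change in behavior.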
Sometimes the data you have is a reflection of what’s available or easy to collect, rather than being complete and representative.
An illustrative example of this is seen in discussions about which college majors translate to the highest wages. This seems like a large but relatively straightforward question, yet the big-picture, headline news on these issues often obscures how much data is missing or hidden in these calculations. For example, when wage-to-major data only reflects the first 2–4 years post-graduation, does that truly reflect the average economic outcomes for certain majors?
The estimates may come from the best available data, but what if that data is not sufficient or representative?
And of course, if we’re going to assess the value of a college degree, shouldn’t we also be considering the non-monetary benefits that the overall experience conveys?
Data-driven decision-making that’s driven by incomplete data is likely to result in less-than-ideal decisions.
Ultimately, this is the question we want to help our clients answer, and as an independent research company, we can provide perspective that may be difficult to come by for those enmeshed in the day-to-day operations.
Sometimes, data can surprise us with its insights. Other times, there are gaps that make the data seem disappointing. We have worked with a number of clients who were hoping that the answers to critical questions could be gleaned from their existing data, only to realize that the underlying “why” of a phenomenon cannot be derived from data that is designed to measure only “what” is happening.
(See our previous blog post on “Why We Still Talk to People” for more on this.)
No matter what, an initial inventory of the data helps point the way forward. It may suggest that some new data points should be collected, or that existing data should be organized in a different way.
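As a rough illustration of what an initial inventory can look like, the sketch below summarizes each field in a set of records: how often it is filled in, how many distinct values it holds, and which values are most common. It is an assumption-laden example in plain Python rather than a prescribed method, and the member records and field names (`member_id`, `status`, `join_reason`) are made up.

```python
from collections import Counter

def inventory(records):
    """Summarize each field: fill counts, distinct values, most common values."""
    fields = sorted({key for rec in records for key in rec})
    summary = {}
    for field in fields:
        # Keep only values that were actually recorded.
        values = [rec[field] for rec in records if rec.get(field) not in (None, "")]
        summary[field] = {
            "filled": len(values),
            "missing": len(records) - len(values),
            "distinct": len(set(values)),
            "top_values": Counter(values).most_common(3),
        }
    return summary

# Hypothetical member records, invented purely for illustration.
members = [
    {"member_id": 1, "status": "Active", "join_reason": ""},
    {"member_id": 2, "status": "active", "join_reason": None},
    {"member_id": 3, "status": "Lapsed"},
]

report = inventory(members)
for field, stats in report.items():
    print(field, stats)
```

In this made-up data, `status` shows three distinct values because "Active" and "active" were entered differently, which is the kind of inconsistency that suggests reorganizing existing data. `join_reason` is never filled in, which points to a data point that would need to be collected some other way.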
In other cases, we recommend conducting primary research to help fill in the gaps.
An organization’s database is most definitely an asset that can be used to its advantage. The first step is to fully understand what you’ve got.