At Willow Research, we love data. In fact, it’s the core of our business.
But as fans of data and what it can tell us, we’re also compelled to advise caution about what data can’t tell us, particularly when it comes to the ongoing excitement over “Big Data.”
Popularized by Michael Lewis's "Moneyball," the story of how the Oakland A's built a winning team despite their financial constraints, "Big Data," or data mining, has long been part of market research. Using aggregated information to better understand patterns of consumer behavior is a staple of the industry.
However, the popularization of Big Data as a way to understand behavior has led to some misapprehensions about what this kind of data means and what it can do. There is nothing magical about Big Data; in fact, by itself, in the absence of judgment, data can't do anything.
Here are some things we think any business or organization should keep in mind as it decides how to use data.
One of the criteria by which schools of higher education are judged is the graduation rate. But it can be difficult to discern how much of the graduation rate is determined by what’s happening at the institution, and how much is due to the particular cohort of students and external factors that these students may be dealing with.
We would expect a community college to have lower graduation rates than an elite institution like Harvard. Community college students are often less academically prepared than students who attend four-year institutions. More importantly, they’re also more likely to be confronting issues such as caregiving responsibilities and economic challenges, including food insecurity and homelessness.
Insisting that a non-selective institution bring its graduation rates up to match those of a highly selective institution ignores the importance of inputs. It also disregards the mission of community colleges and less selective institutions in providing opportunities for a broader base of students.
We also know that judging institutions based on aggregated data about graduation rates—such as the U.S. Department of Education’s College Scorecard—often misses cases where students attend community colleges never intending to graduate. Some plan to transfer after a year. Others are looking for a handful of courses as part of their continuing education needs.
To judge institutions like community colleges by a number that doesn’t reflect their mission and doesn’t capture the full scope of outcomes can obscure important information when it comes to moving forward and improving.
The point here is that Big Data is excellent at giving answers to easy-to-quantify questions, which can lure end users into fixating on a small number of easy-to-define outcomes (i.e., graduation rates)—outcomes that may not reflect the actual goals of the organization. Without careful consideration, Big Data can entice organizations down the wrong path.
Big Data can also tempt us to believe that an algorithm making predictions and decisions may eliminate the problem of human biases. After all, it’s based on hard facts!
Unfortunately, we have seen plenty of evidence that human bias is extremely difficult to eliminate, because AI learns from pre-existing data, and that data already has those prejudices baked in.
Take Amazon. The behemoth developed an exciting AI-based screening tool that would automatically review incoming resumes to find the most qualified candidates. The tool taught itself which qualities to look for based on an existing pool of resumes received over the prior ten years—the vast majority from men. The result was an algorithm that reinforced and intensified a bias in favor of male applicants, discarding potentially highly qualified women because they were less likely to fit the already-existing patterns. After an attempt to realign the algorithm failed to improve its selections, the tool ultimately had to be scrapped.
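The mechanism behind this failure is simple enough to sketch. The toy example below is our own illustration, not Amazon's actual system: it scores resumes by how often their terms appeared in a historical pool of successful resumes. If that pool skews male, terms common on women's resumes simply never earn weight, so equally qualified candidates score lower.

```python
from collections import Counter

# Hypothetical historical pool of "successful" resumes, skewed the way
# a decade of male-dominated hiring would skew it (our assumption).
historical_hires = [
    ["java", "rugby", "led team"],
    ["java", "chess", "led team"],
    ["python", "rugby"],
    ["java", "golf"],
]

# Weight each term by its frequency in the historical pool.
term_weights = Counter(t for resume in historical_hires for t in resume)

def score(resume):
    """Sum the historical weight of each term; unseen terms count zero."""
    return sum(term_weights[t] for t in resume)

# Two candidates with the same core qualification ("java"):
# terms absent from the biased pool contribute nothing to the score.
print(score(["java", "led team"]))
print(score(["java", "women's chess club"]))
```

The second candidate is penalized not for lacking qualifications but for not resembling past hires, which is exactly how a pattern-matching screener reinforces the bias in its training data.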
More unsettling, in an in-depth investigation, ProPublica demonstrated how software meant to predict the risk of recidivism among those convicted of crimes not only taught itself severe racial bias but also missed important criteria for predicting which individuals were most likely to reoffend. In short, the algorithm, learning from existing conviction data, ended up being both very racist and very wrong.
When all-too-human biases are built into the underlying data pool, algorithms quite logically come to faulty conclusions.
We were reminded of this issue when reading a recently published Financial Times article on "Big Data" in legal work.
One of the data applications described in the article examines the decisions of a single judge in a single court, estimating the likelihood that the judge will allow class action suits to go forward and comparing this judge to the other 670 US district judges.
This judge allowed those cases to go forward 51% of the time, but this figure is based on only 37 examples. Given the extreme variety in cases, in lawyers, and in presentations, combined with this relatively small number, it would be difficult to say that past data is truly predictive of any individual outcome. This is a situation where the small numbers and hidden variables make relying on the odds suggested by the data highly problematic.
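The fragility of that 51% figure is easy to quantify. A rough sketch in Python (the 19-of-37 split is our assumption, the integer split closest to the reported 51%) shows how wide the uncertainty around a proportion estimated from 37 cases really is:

```python
import math

def proportion_ci(successes, n, z=1.96):
    """Normal-approximation 95% confidence interval for a proportion."""
    p = successes / n
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin

# 19 of 37 rulings allowed to proceed, roughly the reported 51%
low, high = proportion_ci(19, 37)
print(f"observed: {19/37:.0%}, 95% CI: {low:.0%} to {high:.0%}")
```

The interval runs from roughly 35% to 67%, which is consistent both with a judge who blocks most class actions and with one who allows most of them. A point estimate from a sample this small tells you far less than it appears to.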
Similarly, some trial consultants are turning to Big Data as a tool for jury selection. It’s a tantalizing application: use demographic and other characteristics to predict how a prospective juror might vote on a particular case. It is certainly true that a person’s demographic characteristics tell us something about them. However, while statistical analysis is a powerful tool for understanding patterns and predilections, it cannot predict how a particular and unique individual will respond to a specific set of facts and circumstances. Though it can tell you about a prospective juror’s potential tendencies, Big Data is no substitute for a probative voir dire and an attorney’s good judgment.
When we studied the healthcare industry, we found that Big Data is being utilized more and more by health systems and insurance companies, often to uncover characteristics that make a patient at greater risk for chronic illness (and therefore, greater cost). Patients flagged as high-risk are targeted for specific intervention—but what kind of intervention and how will individual patients respond? Even as they’re increasingly reliant on this data, payers aren’t certain of its effectiveness, because individuals don’t often behave exactly like the composite average suggests they would.
Then there are diagnostic tools, which are becoming more and more sophisticated. But when it comes to complex medical situations, even IBM's Watson struggles; when put to the test in hypothetical scenarios, it had a hard time recommending proper treatments for patients with cancer. Left to its own judgment, the algorithm actually posed a danger to patients; trained, knowledgeable doctors were needed to prevent harm.
When the outcome of each individual case matters, human expertise and judgment are still necessary.
Big Data is an exciting field that can uncover innovative and useful patterns, but it is not without its dangers. Relying too much on the data by itself is a mistake that can have profound consequences.
At Willow, we spend our time thinking deeply about what data says, what it doesn’t say, and whether we need to challenge our own assumptions.