Avoid the Adventurist Trap: The Cost of Poor Data Quality
In this second blog post of our “Taking Data for Granted” series, I want to talk about the cost of a “set it and forget it” culture. If you’re landing on this blog post, I encourage you to read my previous one first. Otherwise, let’s get straight into it.
What Does a “Set It and Forget It” Culture Cost?
As we learned in the previous post, proving the cost is easier than proving the value, and so we move on to the cost of a “set it and forget it” culture. Later on, we’ll talk about the value and relate it to a concept I call “set it and check it.”
Devaluation of Your Data
The first cost is the devaluation of your data due to data quality issues and data disruption. Most of us are used to seeing gaps in our analytics data, whether in transactions and revenue or in visits and visitors, but we shouldn’t be. These disruptions are extremely costly because the insights drawn from the data are less reliable than before, so we have to make concessions or lower our confidence level. As a data professional, would you trust data with lots of inconsistencies and anomalies? These are types of statistical error, and they reduce our ability to make inferences about the entire population.
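To make the idea of a “gap” concrete, here is a minimal sketch of how you might flag days with no recorded activity. It assumes you can export a daily count of a metric such as transactions; the function name `find_gaps` and the sample figures are hypothetical, not taken from any particular analytics tool.

```python
from datetime import date, timedelta


def find_gaps(daily_counts, start, end):
    """Return the dates in [start, end] with no recorded activity.

    daily_counts maps a date to a metric such as transactions. A day that is
    absent from the mapping, or present with a count of zero, is reported.
    """
    gaps = []
    day = start
    while day <= end:
        if daily_counts.get(day, 0) == 0:
            gaps.append(day)
        day += timedelta(days=1)
    return gaps


# Hypothetical export: 3 June is missing entirely and 4 June recorded zero.
transactions = {
    date(2020, 6, 1): 120,
    date(2020, 6, 2): 115,
    date(2020, 6, 4): 0,
    date(2020, 6, 5): 131,
}

gaps = find_gaps(transactions, date(2020, 6, 1), date(2020, 6, 5))
print(gaps)
```

A zero is not always an error (a quiet day on a low-traffic site can be genuine), so treat the result as a list of dates to investigate rather than a verdict.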
Prevent Statistical Bias
This leads to statistical bias, particularly if you have a high-pressure culture within your organization where people are expected to deliver regardless of issues that may arise – and this is all too common.
Your analyst’s or data scientist’s job is already more difficult because of statistical error. If you also put them under pressure to deliver insight in spite of data quality issues, they may introduce bias into their analysis simply to keep their job and look good. They might not have the confidence to stand up and say “I can’t do this analysis because the data is bad,” especially if there’s another team relying on it.
Cure Your Blame Culture and Empower Your Analysts
Both of these are amplified by an internal blame culture, again, all too common. In my experience, there’s so much blame culture in analytics, particularly around people’s opinions on analytical methodology and bad code. This only happens because our data is more critical than most of us give it credit for, and analysis is way harder than most of us appreciate. So when it isn’t right and things go wrong, people get annoyed and angry. You must cure your blame culture and empower your analysts.
The Financial Cost
The next cost is the straight-up financial cost of fixing data quality issues. Last year, fixing data quality problems was the primary focus for one of our clients, and it took us almost an entire year, and many hours, to resolve!
Much of this cost arises in the second stage of the statistical modeling process, data collection, which is where your data integrity is most at risk of failing and where data quality issues are most likely to occur. This is where the “set it and forget it” mindset rears its ugly head again, due in part to the assumption that, once the implementation has been completed and validated, it’ll never break. This is a lie, a fallacy, so let me lift the veil.
Unfortunately, everything breaks over time, and it’s not always a fault in the implementation. Browsers release new versions, users change their behavior, and your website itself changes. Your analytics implementation should be an ever-changing thing to keep pace with the velocity of everything around it. If you think that you can just do it once and then leave it alone, you’re wrong.
Our implementations are nothing but code. We program our analytics tools directly in JavaScript, through a tag management system, or with an SDK for mobile apps. Even when you configure your analytics implementation visually in a tag management system, it ultimately gets compiled down to code and, just like a mechanical machine, it will break over time without maintenance.
But unlike a mechanical machine such as a car, this isn’t due to wear and tear, but rather obsolescence. Instead of your analytics implementation “breaking down,” what happens more often is all the technology around it is constantly changing and being upgraded, and this may cause incompatibility issues and expose new bugs in the code that didn’t exist previously. We recently discovered this exact problem, when one of our clients was collecting a user ID into an eVar, and it stopped working because, at some point, the cookie being parsed got changed to an HttpOnly cookie, which isn’t accessible via JavaScript.
Also, just look at the Google Chrome release notes. At the time of writing, the first page alone is filled with releases from June 2020. Releases, upgrades, and changes are constantly happening to the technology around us, so it’s no wonder that our implementations “break down.” It’s no one’s fault, but it can be avoided with regular data quality audits, which build a level of data quality assurance and trust within your organization.
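A failure like the user ID example above tends to surface as a sudden drop in how often a field is populated, which is something a regular audit can check for. The following is a minimal sketch, assuming you can export hits as rows of key-value pairs; the `user_id` field name, the sample rows, and the 0.2 threshold are all hypothetical and would need tuning for real data.

```python
def fill_rate(rows, field):
    """Share of rows in which `field` holds a non-empty value."""
    if not rows:
        return 0.0
    filled = sum(1 for row in rows if row.get(field) not in (None, ""))
    return filled / len(rows)


def fill_rate_dropped(baseline_rows, current_rows, field, max_drop=0.2):
    """True when the fill rate fell by more than max_drop (absolute) versus baseline."""
    drop = fill_rate(baseline_rows, field) - fill_rate(current_rows, field)
    return drop > max_drop


# Hypothetical samples: the user ID is mostly present last week, mostly empty this week.
last_week = [{"user_id": "u1"}, {"user_id": "u2"}, {"user_id": ""}, {"user_id": "u4"}]
this_week = [{"user_id": ""}, {"user_id": None}, {"user_id": ""}, {"user_id": "u9"}]

if fill_rate_dropped(last_week, this_week, "user_id"):
    print("user_id fill rate dropped sharply; check the collection code")
```

Run on a schedule against each important field, a check like this turns “set it and forget it” into a routine review, and it catches silent breakages before someone else finds them in a report.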
The Erosion of Trust
The erosion of trust is the final cost of a “set it and forget it” culture that I want to talk about. It’s also the hardest to quantify, but probably the most severe. Trust in data is like trust in anything: easy to lose and hard to gain.
A big factor in trust comes from executives and senior management because they need reliable reports in order to make informed business decisions. It’s your job to uphold the highest standards of data quality so you can have the most impact with your insights. Don’t underestimate the impact of regular reporting!
But it doesn’t matter if you do a great piece of analysis and find an optimization that could save millions of dollars. If you have data quality issues, people won’t trust your insights – and maybe you shouldn’t either. Similarly, it doesn’t matter if you always deliver your report on time for the Monday trading meeting. If a data quality issue is uncovered by someone else, it can undermine all the reports you’ve provided in the past.
Save Time and Money, and Improve Your Data Quality
To conclude, the costs of poor data quality are:
- Devaluation of data
- Statistical bias
- Blame culture
- Financial impact
- Erosion of trust
In the next part of this series on data quality, we’re going to explore the value of something I call “set it and check it” culture. Stay tuned.