How To Tackle Today’s Data Quality Challenges
If your data quality is suspect, your analytics program may be doomed, too. Here’s a look at today’s data quality challenges and how to tackle them.
Today’s information consumer is bombarded by messages via mobile phone, television, the Internet, email blasts, text messages, billboards, product placement, and more. Enterprise IT organizations face a similar challenge: more data coming in from more channels than ever before.
That change has made the practice of ensuring data quality more challenging than it has ever been, according to data analytics and architecture evangelist Karen Lopez. Lopez is a senior project manager and data architect at the consulting firm InfoAdvisors, and will speak on data quality at the upcoming Interop ITX conference.
Data quality has gotten more complex, according to Lopez. It used to be that data quality professionals handled well-bounded events, such as periodic changes to postal code data. Now, with data coming in from Twitter feeds and other fast-moving sources, the data itself changes much more frequently.
“Now we have data from many different sources, and the data from those sources is changing in format, is changing in meaning,” Lopez said in a recent radio interview with UBM Tech’s All Analytics. “In order to make that work, we have to know what has changed.”
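To make that concrete (the interview doesn’t prescribe a specific technique), here is a minimal Python sketch of one way to know what has changed: compare each incoming record against the schema the pipeline was built for, and flag missing fields, type changes, and newly appearing fields. The field names, expected types, and sample record are all hypothetical.

```python
# Minimal sketch of schema-drift detection for incoming records.
# The expected schema and the sample record below are hypothetical.

EXPECTED_SCHEMA = {
    "user": str,
    "posted_at": str,   # ISO-8601 timestamp carried as a string
    "text": str,
    "retweets": int,
}

def find_drift(record: dict) -> list[str]:
    """Describe how a record deviates from the schema the pipeline expects."""
    problems = []
    for field, expected_type in EXPECTED_SCHEMA.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            problems.append(
                f"type change in {field}: expected {expected_type.__name__}, "
                f"got {type(record[field]).__name__}"
            )
    for field in record:
        if field not in EXPECTED_SCHEMA:
            problems.append(f"new field appeared: {field}")
    return problems

# Example: the source silently renamed 'retweets' and added 'lang'.
incoming = {"user": "k_lopez", "posted_at": "2017-04-01T12:00:00Z",
            "text": "Data quality matters.", "shares": 42, "lang": "en"}
for issue in find_drift(incoming):
    print(issue)
```

Running the example flags the missing field and the two new ones, which is exactly the kind of “what has changed” signal Lopez describes needing before the data can be trusted.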
Executives typically understand the data quality issues that can occur with their own data sets, but when it comes to outside data sets they may overlook potential complexities. For instance, an executive may treat a list of all the countries in the world as unchangeable. But what about countries that have since ceased to exist, or a country that splits in two in the future?
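One common way to handle that kind of “stable” reference data is effective dating: each value carries the date range during which it was valid, so countries can appear, split, or dissolve without corrupting history. A minimal sketch, not drawn from the interview, with an illustrative table:

```python
# Sketch of effective-dated reference data: each country row carries
# the date range during which it was a valid value.
from datetime import date

COUNTRIES = [
    # (name, valid_from, valid_to); None means still valid
    ("Czechoslovakia", date(1918, 10, 28), date(1992, 12, 31)),
    ("Czech Republic", date(1993, 1, 1), None),
    ("Slovakia", date(1993, 1, 1), None),
    ("Sudan", date(1956, 1, 1), None),
    ("South Sudan", date(2011, 7, 9), None),
]

def valid_countries(as_of: date) -> list[str]:
    """Return the countries that were valid values on a given date."""
    return [name for name, start, end in COUNTRIES
            if start <= as_of and (end is None or as_of <= end)]

print(valid_countries(date(1990, 6, 1)))   # includes Czechoslovakia
print(valid_countries(date.today()))       # includes its successor states
```

Modeled this way, a 1990 transaction can still validate against Czechoslovakia while new records validate against its successors, instead of one flat list quietly breaking both.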
“For the most part, 20 years ago, organizations sourced and used their own data,” Lopez said. “Now we have data lakes, data swamps, data hoarding. People are throwing data at the business and the business is finding it and falling prey to the myth that all data is self-defined.”
This increase in data types and data volumes has made the job of ensuring data quality more complex. So have the tools: the products for ensuring data quality, managing data, and storing data have multiplied in number and complexity. Lopez remembers a time when there were five or six DBMSes to know; now there are open source databases, cloud technologies, and big data platforms.
“There are now literally hundreds of thousands of versions and products and distributions that vary on their feature sets,” Lopez said.
Data professionals need to think through these issues at the start of a new project, according to Lopez.
Lopez’s session at Interop ITX, “Data Quality: Five Mistakes that Could Sink Your Data Ship at Launch,” can help attendees prepare to ensure their projects’ success. Lopez brings significant field experience to the presentation, having worked extensively on problem projects over the last year and helped organizations correct course when projects have gone wrong.
Check out the audio of Lopez’s interview with UBM Tech here, and be sure to join us at Interop ITX in May for details on how to make sure your data quality ship launches right the first time.