In today’s episode, we speak with Susan Walsh and learn why organizations struggle to create and maintain high-quality data, and the steps she takes to resolve data issues. Susan Walsh has nearly a decade of experience fixing dirty data and founded The Classification Guru. Susan is a specialist in data classification and data cleansing. She is passionate about helping you find the value in cleaning your dirty data, and she raises awareness of the consequences of ignoring data issues through her blog, webinars, and speaking engagements.
Top 3 Value Bombs:
- One reason organizations struggle with having high-quality data is that they treat data as a cost rather than an investment.
- You must be able to trace the lineage of your data back to its source of truth; this is critical to resolving data quality issues.
- Don’t just fix bad data; identify the root cause and resolve the issue there.
Why data quality is important:
- Data needs to be as accurate as it can be; it’s all about the details.
- If decisions are made from bad data, they can quickly hurt the organization.
You created a creative acronym called COAT, and you say that your data should always have its COAT on. Can you break down what COAT represents?
- Consistent – Use the same definitions and terminology (e.g., what is the definition of a customer or a supplier; imperial vs. metric units).
- Organized – How do you want to use the data? Can you categorize it appropriately?
- Accurate – Does the data match the source of truth and the intended use?
- Trustworthy – Can the data be trusted to make sales, marketing, and buying decisions?
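The Consistent and Accurate checks above can be automated. Below is a minimal sketch, using hypothetical supplier records and a hypothetical source-of-truth lookup (none of these names come from the episode), of flagging COAT-style issues in a batch of records:

```python
# Hypothetical source of truth: canonical supplier names keyed by supplier ID.
SOURCE_OF_TRUTH = {"S001": "Acme Ltd", "S002": "Globex Corp"}

# Hypothetical records pulled from a downstream system. Note the mixed
# units (weight_lb vs. weight_kg) and the non-canonical name spelling.
records = [
    {"id": "S001", "name": "ACME LIMITED", "weight_lb": 22.0},
    {"id": "S002", "name": "Globex Corp", "weight_kg": 10.0},
    {"id": "S003", "name": "Initech", "weight_kg": 5.0},
]

def coat_issues(records, truth):
    """Flag Consistency and Accuracy problems in a batch of records."""
    issues = []
    for r in records:
        # Consistent: every record should use the same (metric) unit.
        if "weight_lb" in r:
            issues.append((r["id"], "inconsistent unit: weight_lb"))
        # Accurate / Trustworthy: the name must match the source of truth.
        canonical = truth.get(r["id"])
        if canonical is None:
            issues.append((r["id"], "no source-of-truth entry"))
        elif r["name"] != canonical:
            issues.append((r["id"], f"name mismatch: {r['name']!r} vs {canonical!r}"))
    return issues
```

Running `coat_issues(records, SOURCE_OF_TRUTH)` flags the pound-based weight, the name mismatch on S001, and the untraceable S003 record; a real pipeline would route such flags to a review queue rather than silently correcting them.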
Why do so many organizations struggle with having high-quality data?
- They don’t value data as an investment but rather a cost.
Who in the business is ultimately responsible for ensuring high-quality data?
- Everyone. Organizations should have SMEs for the various lines of business (LOBs) and functions who report to a governance committee.
What are some of the common blind spots organizations face when it comes to cleaning data?
- Not fixing the root cause of data quality issues. If you correct the data but not the underlying cause, the problem will almost certainly recur.
In your experience, are there particular data issues that are more difficult to correct than others?
- Not understanding where the data is coming from; you have to be able to trace it back to its source. Knowing the original source of truth for the data is extremely important.
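One lightweight way to make data traceable, sketched here with hypothetical field names (not something described in the episode), is to stamp every record with lineage metadata at ingestion time so its origin can always be recovered:

```python
from datetime import datetime, timezone

def tag_lineage(record, source_system):
    """Return a copy of the record annotated with where it came from."""
    tagged = dict(record)
    tagged["_lineage"] = {
        "source": source_system,
        "ingested_at": datetime.now(timezone.utc).isoformat(),
    }
    return tagged

# Hypothetical usage: tag a record as it arrives from an ERP extract.
row = tag_lineage({"id": "S001", "name": "Acme Ltd"}, "erp_extract")
```

With a `_lineage` stamp in place, a questionable value can be traced back to the system that produced it, which is where the root-cause fix belongs.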
How is AI changing the data quality landscape?
- RPA scanning technology is advancing quickly and can be more accurate than a human process.
- A human process is still needed to review how data is classified and grouped. It takes a lot of effort to train AI, and the technology is not yet sophisticated enough to understand context and make the correct decision.