To help BRIDGE THE STORY GAP between Data and The Business, we are excited to share excerpts from the book ‘TELLING YOUR DATA STORY – Data Storytelling for Data Management’ by our knowledge partner, Scott Taylor, The Data Whisperer.
The Story of Structured Data – 4Cs of Master Data
Data management program leaders must have governance over—and market expertise on—the universal business data that exist throughout the value chain. Seemingly tactical activities, such as normalizing naming conventions, applying consistent identification keys and codes, and correcting hierarchical assignments, are the foundational data building blocks for achieving a successful data management program. The data management practice is the central point of management and control for a unified company nomenclature.
To find the value data has to offer, it must be structured. It must align across disparate sources so that you can extract and distill the most useful and relevant information. It is the differentiator between a flood of unstructured and disparate information and a standardized and structured data source that everyone can trust. What makes Big Data “big” is its lack of structure. Most, if not all, standardized executive and activity reporting comes from structured data.
Structured Data Works Harder Than Unstructured Data
Applying structure is difficult and time-consuming. It can be fraught with debates about the structure of the structure itself. That means you need to do the hard work of defining and gaining consensus on important terms like customer, brand, and market. But without those basics, the rest is a mess.
Structured data in the form of master, reference, and metadata is the most important data any organization has. It is the data in charge of your business. Since it is about your relationships and brands, is there anything more important in your business? No. So the logic follows that the data about those relationships and brands is your most important data.
The Importance of Structured Data
All the other data is about structured data. Until you have the foundational structure set, the common definitions established, and the processes in place to govern both of those, you will be tossed and turned in a sea of disparate data. The structure of your data, or lack of it, causes many issues with reporting and analytics. The usual complaints about data include:
- Why don’t we have a list of top customers?
- This hierarchy is wrong.
- Look at all these duplicates!
- These products are missing.
- What do we mean by this market?
Some in the analytics community may disagree and believe that unstructured and semi-structured data hold the most valuable insights. Unstructured data offers loads of analytical promise, insight, and value – but you need structured data first for your organization. An insurance company getting flooded with mobile-phone pictures of accidents needs to have machine learning and artificial intelligence processes to interpret these unstructured inputs. If their machine learning algorithms don’t recognize a front left fender as a front left fender, the service will not be accurate. Applying structure to unstructured data is how value is released. But before you can do the cool stuff, you need to do that hard work. As Pink Floyd (and your mother) reminds us – You can’t have your pudding if you don’t eat your meat. Structured data is all meat.
The Basics Still Matter
Despite all the clamor and celebration of unstructured and semi-structured data, the basics are still the basics. While data scientists spin their graphs and search for analytical needles in their big data haystack, the business needs answers. Answers to questions like:
- How many customers do we have?
- Are sales up?
- Have we increased market share?
- Where do we deploy media?
- Which partners are the most effective?
If structured data is about your relationships and your brands in the beginning, where do you begin this beginning? How do you structure these relationships? How do you start to codify these relationships? And most importantly, in the context of Telling Your Data Story, can you convince your business stakeholders to take this seriously?
The 4Cs: Code, Company, Category, and Country
Let’s go back to the subject of most data: relationships and brands. Relationship and brand data are most often organized by segmentation, aligned by hierarchy, and viewed by geography. A simple way to describe the basic structure needed for your relationship and brand data is The 4Cs: Code, Company, Category, and Country.
- A CODE lets you know something is unique
- A COMPANY lets you know who owns it
- A CATEGORY lets you know what kind of relationship it is
- A COUNTRY lets you know where it is
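As a minimal sketch of the list above, a master record can carry one field per C. The class name `MasterRecord`, the helper `missing_4cs`, and the sample values are illustrative assumptions, not something the book prescribes.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MasterRecord:
    """Hypothetical master record with one field per C."""
    code: str      # CODE: unique identifier for the entity
    company: str   # COMPANY: code of the owning (parent) entity
    category: str  # CATEGORY: kind of relationship or segment
    country: str   # COUNTRY: where the entity is located


def missing_4cs(record: MasterRecord) -> List[str]:
    """Return the names of any 4C fields that are blank."""
    fields = ("code", "company", "category", "country")
    return [name for name in fields if not getattr(record, name).strip()]


# Made-up sample records for illustration only.
outlet = MasterRecord(code="C-001", company="P-100", category="Grocery", country="US")
incomplete = MasterRecord(code="C-002", company="", category="Grocery", country="")
```

A check like `missing_4cs(incomplete)` gives a quick list of which of the four basics a record still lacks.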
Code – Is it unique?
Every record about a relationship or a brand in a database has a code somewhere. You need some form of unique identifier: a customer code, a record ID, a product code. In a database, the rule is “I have a code. Therefore, I am.” Once a code is put on a record, it “exists” in that database. You need a code to make sure it is unique. But since every system has its own set of codes, you probably have more than one code for the same entity across your multiple workflows, departments, and regions.
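One common way to handle multiple codes for the same entity is a crosswalk that maps each system's local code to a single master code. The sketch below assumes a hypothetical in-memory `CROSSWALK` table with made-up system names and codes; a real program would keep this mapping in a governed store.

```python
# Hypothetical crosswalk: (source system, local code) -> master code.
CROSSWALK = {
    ("crm", "CUST-7781"): "M-0001",
    ("erp", "100245"): "M-0001",   # same customer, different system code
    ("billing", "AC-33"): "M-0002",
}


def master_code(system, local_code):
    """Return the master code for a local code, or None if it is unmapped."""
    return CROSSWALK.get((system, local_code))


def count_unique_entities(records):
    """Count distinct master codes among (system, local_code) pairs."""
    resolved = {master_code(system, code) for system, code in records}
    resolved.discard(None)  # unmapped codes are not counted
    return len(resolved)
```

With this mapping, three system records resolve to two real entities, which is the kind of answer "How many customers do we have?" depends on.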
Company – Who owns it?
You need to know whom an entity belongs to through a hierarchical structure, often known as a parent/child relationship or family tree. A hierarchy has multiple levels, from the local branch, divisions, subsidiaries, all the way up to an ultimate global parent. Bill-to, ship-to, plan-to, sell-to are all part of the hierarchy. The bigger the relationship, the more complicated the hierarchy.
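A parent/child hierarchy can be sketched as a mapping from each entity code to its parent's code, with the ultimate global parent found by walking upward. The `PARENT_OF` table and the function `ultimate_parent` are hypothetical names used only for illustration.

```python
# Hypothetical family tree: child code -> parent code (None marks the top).
PARENT_OF = {
    "BRANCH-12": "DIV-3",
    "DIV-3": "SUB-1",
    "SUB-1": "GLOBAL-1",
    "GLOBAL-1": None,
}


def ultimate_parent(code, parent_of):
    """Walk up the hierarchy until an entity with no parent is reached."""
    seen = set()
    while parent_of.get(code) is not None:
        if code in seen:
            # A loop in the family tree is a data-quality error.
            raise ValueError(f"cycle in hierarchy at {code}")
        seen.add(code)
        code = parent_of[code]
    return code
```

Rolling every local branch up to its ultimate parent in this way is what lets sales and reporting be totaled at any level of the family tree.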
Category – What is it?
You need to know what kind of thing you are dealing with, especially if you don’t have much of a relationship with it. Categories define TAMs (total addressable markets), enable segmentation, and are the denominator for penetration analysis. Category attributes determine targeting. You try to find likely prospects based on industry, segment, or sub-segment.
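To illustrate the category as the denominator for penetration analysis, the sketch below divides the number of existing customers in each category by the number of addressable entities in that category. The function name and the sample market data are assumptions made up for this example.

```python
from collections import Counter


def penetration_by_category(market, customers):
    """Share of each category's addressable entities that are customers.

    market: list of (code, category) pairs for every addressable entity.
    customers: set of codes you already sell to.
    """
    total = Counter(category for _, category in market)
    won = Counter(category for code, category in market if code in customers)
    return {category: won[category] / total[category] for category in total}


# Made-up market of six entities across two categories.
market = [
    ("M-1", "Grocery"), ("M-2", "Grocery"), ("M-3", "Grocery"), ("M-4", "Grocery"),
    ("M-5", "Pharmacy"), ("M-6", "Pharmacy"),
]
customers = {"M-1", "M-5", "M-6"}
```

The result is only as trustworthy as the category assignments: a miscategorized entity changes both the numerator and the denominator.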
Country – Where is it?
Following along our alliteration with Cs, you also need country and some form of geography. Geography has a hierarchy too: region, province, city, and zip or postal code. The media market, sales market, and measurement market are different configurations of geography depending on your use case. Similar to a category, geography determines sales assignments and media placements. Agreeing on a standard definition of the market will clear up lots of confusion between departments when you simply ask, “How am I doing in New York Metro or All Major Markets or EMEA?”
Imagine how well your data would flow if you knew you had unique records (code), and that every one of them had a full and updated hierarchy (company), complete segmentation (category), and consistent geographic location information (country and market). This creates a common language between departments about simple but vital elements of your business relationships: WHERE something is, WHAT it is, WHO owns it, and that it is UNIQUE.
If you can confidently determine where something is, what kind of thing it is, who owns it, and that it is genuinely unique, and you leverage those definitions with your business stakeholders, you can more easily manage your relationships across your company.
Your Essential Business Vocabulary
The 4Cs are also the basis of your essential business vocabulary. I will cover the importance of vocabulary in a later chapter. The 4Cs represent the characters in your data story. Entities, Hierarchies, Segments, Geographies. Depending on your business dynamics, you might call them:
- Outlet, Account, Channel, Market
- Item, Supplier, Sector, Region
- Product, Brand, Segment, Market
- Matter, Client, Type, Office
- Consumer, Household, Demographic, Metro Area
4Cs for Better Data Integration and Decision-Making
Once the data on your relationships and brands is structured and standardized, it can harmonize and integrate better into your processes, methodologies, and workflows between your systems, regions, and go-to-markets, as well as externally within an ecosystem. As you try to gain a holistic view of your relationships and anticipate future needs, applying these 4Cs will align you to your data objectives more quickly. It will also give you the structure and scalability required for your enterprise data journey. All sorts of data problems go away.
Think of all the data projects and analytics efforts that depend on entity uniqueness, standardized hierarchies, consistent segmentation, and precise geographies. Data scientists waste time deduplicating entities, determining hierarchies, and trying to reconcile segmentations and geographies. They call it munging and wrangling.
How many executive decisions are made upon customer counts, family trees, prospect targeting, and market coverage? Many of your data efforts will fall back on disambiguation, hierarchies, segmentation, and geographies: code, company, categories, and countries. Problems with authenticating identity and determining uniqueness vanish with consistent management of identifiers.
Let’s just take a short Zen moment and think about the potential clarity this brings to your data.
Take a moment—deep breath.
It feels better just thinking about it.
In part 6 of this series, we will explore telling data stories to executives: how to formalize and organize your data story by uncovering your company vision and exposing the data challenges that, in many cases, are hidden in plain sight.
Excerpted with permission from Technics Publications from ‘TELLING YOUR DATA STORY – Data Storytelling for Data Management’ by Scott Taylor, The Data Whisperer & Infocepts Partner.