Almost 27% of B2B marketers admit that insufficient data has cost them 10% or more in annual revenue.
This highlights a significant issue faced by most marketers today: poor data quality. Incomplete, missing, or poor-quality data can have a huge impact on the success of your marketing processes, because almost all departmental processes at a company, and sales and marketing in particular, are heavily fueled by organizational data.
Whether it is a complete, 360-degree view of your customers, leads, or prospects, or other information related to products, service offerings, or address locations, marketing is where it all comes together. This is why marketers suffer most when a company does not employ proper data quality management frameworks for continuous data profiling and data quality fixing.
In this blog, I want to bring attention to the most common data quality problem and how it impacts your critical marketing processes; we will then look at a potential solution for this problem, and finally, we will see how we can establish it on a continuous basis.
So, let’s get started!
Biggest Data Quality Problem Faced By Marketers
Poor data quality causes a long list of issues for marketers, but having delivered data solutions to 100+ clients, the most common data quality issue we have seen people face is:
Attaining a single view of core data assets.
This issue surfaces when duplicate records are stored for the same entity. In marketing, an entity is usually a customer, lead, prospect, product, location, or anything else that is core to the performance of your marketing activities.
The Impact Of Duplicate Records On Your Marketing Processes
The presence of duplicate records in datasets used for marketing purposes can be a nightmare for any marketer. When you have duplicate records, the following are some serious scenarios that you can run into:
- Wasted time, budget, and effort – Since your dataset contains multiple records for the same entity, you may end up investing time, budget, and effort multiple times in the same customer, prospect, or lead.
- Unable to facilitate personalized experiences – Duplicate records often contain different pieces of information about an entity. If you run marketing campaigns using an incomplete view of your customers, you may end up making them feel unheard or misunderstood.
- Inaccurate marketing reports – With duplicate data records, you might end up giving an inaccurate view of your marketing efforts and their return. For example, you emailed 100 leads but only received responses from 10. It could be that only 80 of those 100 were unique and the remaining 20 were duplicates, in which case the real response rate is higher than the one you reported.
- Reduced operational efficiency and employee productivity – When team members fetch data for a certain entity and find multiple records stored across different sources, or gathered over time in the same source, it acts as a huge roadblock to employee productivity. If this happens often, it noticeably impacts the operational efficiency of the entire organization.
- Unable to perform correct conversion attribution – If you have recorded the same visitor as a new entity every time they visited your social channels or website, it will become almost impossible for you to perform accurate conversion attribution, and know the exact path the visitor followed towards conversion.
- Undelivered physical and electronic mail – This is the most common consequence of duplicate records. As mentioned earlier, each duplicate record tends to contain a partial view of the entity (this is why the records ended up as duplicates in your dataset in the first place). For this reason, certain records could be missing physical addresses or contact information, which can cause mail to fail delivery.
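To put numbers on the reporting scenario above, here is a minimal Python sketch. The figures (100 emails, 80 unique leads, 10 responses) come from that example; the function name `response_rate` is hypothetical, and the sketch assumes each of the 10 responses came from a different lead.

```python
def response_rate(responses: int, recipients: int) -> float:
    """Return the response rate as a percentage of recipients."""
    if recipients <= 0:
        raise ValueError("recipients must be positive")
    return 100.0 * responses / recipients


emails_sent = 100   # records in the mailing list, duplicates included
unique_leads = 80   # distinct leads once the 20 duplicates are removed
responses = 10      # assumed to come from 10 different leads

reported = response_rate(responses, emails_sent)  # what the report shows
actual = response_rate(responses, unique_leads)   # what really happened

print(f"Reported: {reported:.1f}%  Actual: {actual:.1f}%")
# prints: Reported: 10.0%  Actual: 12.5%
```

In this case the duplicates make the campaign look weaker than it was, which is exactly the kind of distortion that leads to wrong budget decisions.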
What is Entity Resolution?
Entity resolution (ER) is the process of determining when references to real-world entities are equivalent (same entity) or not equivalent (different entities). In other words, it is the process of identifying and linking multiple records to the same entity when the records are described differently and vice versa.
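As a rough illustration of what "equivalent references" can look like in practice, the Python sketch below compares two records. It is only one possible approach, not a prescribed method: the field names (`name`, `email`), the helper names, and the 0.85 similarity threshold are all assumptions made for this example. It uses `difflib.SequenceMatcher` from the Python standard library for the name comparison.

```python
from difflib import SequenceMatcher


def normalize(value: str) -> str:
    """Lowercase, trim, and collapse internal whitespace."""
    return " ".join(value.lower().split())


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def same_entity(rec_a: dict, rec_b: dict, threshold: float = 0.85) -> bool:
    """Decide whether two records refer to the same entity."""
    # An identical email (after normalization) is treated as conclusive.
    email_a, email_b = rec_a.get("email"), rec_b.get("email")
    if email_a and email_b and normalize(email_a) == normalize(email_b):
        return True
    # Otherwise compare names and accept only close matches.
    return similarity(rec_a.get("name", ""), rec_b.get("name", "")) >= threshold


a = {"name": "Jon Smith", "email": "JON@EXAMPLE.COM "}
b = {"name": "Jonathan Smith", "email": "jon@example.com"}
c = {"name": "Acme Corp"}
d = {"name": "acme  corp."}
e = {"name": "Jane Doe"}
f = {"name": "Robert Brown"}

print(same_entity(a, b))  # same email, described differently
print(same_entity(c, d))  # no email, but near-identical names
print(same_entity(e, f))  # clearly different entities
```

The first two pairs are judged to be the same entity and the third is not. A real dataset would need more attributes and a threshold tuned against known duplicates.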
Implementing Entity Resolution Into Your Marketing Datasets
Having seen the dreadful impact of duplicates on the success of your marketing activities, it is imperative to have a simple, yet powerful, method for deduplicating your datasets. This is where entity resolution comes in. Simply put, entity resolution is the process of identifying which records belong to the same entity.
Depending on the complexity and the state of quality of your datasets, this process can contain a number of steps. I am going to take you through each step of this process so that you can understand what exactly it entails.
Note: I will use the generic term ‘entity’ while describing the process below. But the same process is applicable and possible for any entity involved in your marketing process, such as customer, lead, prospect, location address, etc.
Steps In The Entity Resolution Process
- Collecting entity data records residing across disparate data sources – This is the first and most important step of the process, where you identify where exactly the entity records are stored. This can be data coming from social media ads or website traffic, or data typed in manually by sales reps or marketing staff. Once the sources are identified, all records must be brought together in one place.
- Profiling combined records – Once the records are brought together in one dataset, it is time to understand the data and uncover the hidden details about its structure and content. Data profiling statistically analyzes your data and finds out whether data values are incomplete, blank, or follow an invalid pattern or format. Profiling uncovers these and similar details and highlights potential data cleansing opportunities.
- Cleaning and standardizing data records – An in-depth data profile gives you an actionable list of items for cleaning and standardizing your dataset. This can involve filling in missing data, correcting data types, fixing patterns and formats, and parsing complex fields into sub-elements for better data analysis.
- Matching and linking records belonging to the same entity – Now your data records are ready to be matched and linked, so that you can finalize which records belong to the same entity. This is usually done by implementing industry-grade or proprietary matching algorithms that perform either an exact match on uniquely identifying attributes or a fuzzy match on a combination of attributes of an entity. If the results from the matching algorithm are inaccurate or contain false positives, you may need to fine-tune the algorithm or manually mark the affected records as duplicates or non-duplicates.
- Implementing rules for merging entities into golden records – This is where the final merge happens. You probably don’t want to lose data about an entity stored across records, so this step is about configuring rules to decide:
  - Which record is the master record, and which records are its duplicates?
  - Which attributes from the duplicates do you want to copy over to the master record?
Once these rules are configured and implemented, the output is a set of golden records of your entities.
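The five steps above can be sketched end to end in a few lines of Python. This is a simplified illustration under stated assumptions, not a production implementation: the two sources (`crm`, `web_forms`), the sample records, the field names, and the rules chosen here (exact match on the standardized email, the earliest-collected record as master, blanks filled from duplicates) are all hypothetical choices made for the example.

```python
from collections import defaultdict

FIELDS = ("name", "email", "phone")

# Step 1: collect records from separate (hypothetical) sources in one place.
crm = [
    {"name": "Jane Doe", "email": "Jane.Doe@Example.com ", "phone": ""},
    {"name": "Sam Lee", "email": "sam.lee@example.com", "phone": "555-0100"},
]
web_forms = [
    {"name": "J. Doe", "email": "jane.doe@example.com", "phone": "(555) 0199"},
]
records = crm + web_forms


def profile(rows):
    """Step 2: count blank values per field."""
    return {f: sum(1 for r in rows if not r.get(f, "").strip()) for f in FIELDS}


def standardize(row):
    """Step 3: trim whitespace, lowercase emails, keep only digits in phones."""
    return {
        "name": " ".join(row.get("name", "").split()),
        "email": row.get("email", "").strip().lower(),
        "phone": "".join(ch for ch in row.get("phone", "") if ch.isdigit()),
    }


def match(rows):
    """Step 4: link records that share an exact email after standardization."""
    groups = defaultdict(list)
    for index, row in enumerate(rows):
        # Records without an email cannot be matched here, so each stays alone.
        key = row["email"] or f"no-email-{index}"
        groups[key].append(row)
    return list(groups.values())


def merge(group):
    """Step 5: treat the first record as master and fill its blanks from duplicates."""
    golden = dict(group[0])
    for duplicate in group[1:]:
        for field in FIELDS:
            if not golden[field]:
                golden[field] = duplicate[field]
    return golden


print(profile(records))
clean = [standardize(r) for r in records]
golden_records = [merge(g) for g in match(clean)]
for record in golden_records:
    print(record)
```

With this sample data, the two Jane Doe records collapse into one golden record that keeps the CRM name and picks up the phone number from the web form, while Sam Lee passes through unchanged. In practice you would replace the exact email match with whatever combination of exact and fuzzy matching suits your data.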
Establish An Ongoing Entity Resolution Framework
Although we went through a simple step-by-step guide for resolving entities in a marketing dataset, it is important to understand that this should be treated as an ongoing process at your organization. Businesses that invest in understanding their data and fixing its core quality issues are set for far more promising growth.
For quicker and easier implementation of such processes, you can also provide data operators, or even marketers at your company, with easy-to-use entity resolution software that can guide them through the steps mentioned above.
In conclusion, a duplicate-free dataset plays a crucial role in maximizing the ROI of marketing activities and strengthening brand reputation across all marketing channels.