The importance of data cleansing

By Luigi Raw, Senior Data Engineer
5 minutes to read

In order to create unified and effective marketing campaigns, we have access to a wide range of client data.

This helps us gather the insights we need to make informed decisions. But with this access, and the introduction of GDPR, we have to be able to filter through, identify, and safely store data, which is why data cleansing is so important.


What is data cleansing?

The obvious starting point is to answer the question “What is data cleansing?” The technical answer is:

“Data cleansing is the process of altering data in a given storage resource to make sure that it is accurate and correct.”

To put this in plain English, data cleansing is ensuring the data you hold is useful and fit for purpose. This may involve modifying data, removing unneeded or unused data, and aggregating data from various sources.
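As a rough illustration of those three activities (modifying, removing, and aggregating), here is a minimal sketch in Python. The record layout, the `cleanse` function, and the `keep_fields` parameter are all hypothetical names chosen for this example; real data will need rules of its own.

```python
def cleanse(records, keep_fields=("email", "country")):
    """Return one tidy record per email address.

    Hypothetical helper: normalises emails (modify), drops fields that
    are not in keep_fields (remove), and merges duplicate entries that
    arrive from several sources (aggregate).
    """
    merged = {}
    for record in records:
        email = (record.get("email") or "").strip().lower()
        if not email:
            continue  # skip entries we cannot identify
        clean = merged.setdefault(email, {"email": email})
        for field in keep_fields:
            value = record.get(field)
            # keep the first non-empty value seen for each field
            if field != "email" and value and field not in clean:
                clean[field] = value.strip()
    return list(merged.values())


# Two made-up sources holding overlapping data about the same person.
crm = [{"email": " Ann@Example.com ", "country": "UK", "fax": "0123"}]
web = [{"email": "ann@example.com", "country": ""}, {"email": None}]

print(cleanse(crm + web))
# [{'email': 'ann@example.com', 'country': 'UK'}]
```

The unused `fax` field is dropped, the two spellings of the same email collapse into one record, and the entry with no email is discarded.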

Our focus throughout this article is on the security of, and access to, data: ensuring only the relevant data is captured, stored, and used.


Why is data cleansing important?

From contact details, to browser information, through to device specifications, the type and volume of data available to us is constantly growing. This can make it difficult to sift through and find useful, insightful data, especially when it is mixed with irrelevant or obsolete entries.

A data cleanse will help you easily interpret data so you can make informed business and marketing decisions. It also ensures that people who might not be as experienced or analytical will be able to make sense of the data.


The “use it or lose it” mindset

When data cleansing, a ‘use it or lose it’ mindset can be useful to decide what data is needed to achieve your goals. This mindset can be adopted throughout the data capture, storage, and reporting journey too.

At the data capture stage, data cleansing can help you identify what data is critical compared to ‘nice to have’ data. This can inform what you need to capture through on-site forms (e.g. contact, registration, and checkout forms), which can in turn reduce form sizes and improve the overall user experience.

For example, a registration form may request email, telephone number, and address. But if communication is only via email, the other details are ‘nice to haves’. Similarly, when a full address is requested only to verify the country of origin, the rest of the address is immediately redundant.
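The registration form example can be sketched as a simple allowlist applied at the point of capture. This is only an illustration: `REQUIRED_FIELDS` and `minimise_submission` are hypothetical names, and the assumption that email is the only field needed comes from the example above rather than from any particular system.

```python
# Assumption for this sketch: email is the only channel we communicate on,
# so it is the only field worth capturing.
REQUIRED_FIELDS = {"email"}


def minimise_submission(form_data, required=REQUIRED_FIELDS):
    """Keep only the fields we actually need from a submitted form."""
    missing = set(required) - set(form_data)
    if missing:
        raise ValueError(f"missing required fields: {sorted(missing)}")
    # Anything outside the allowlist is never stored.
    return {name: form_data[name] for name in required}


submission = {
    "email": "ann@example.com",
    "telephone": "01234 567890",
    "address": "1 High Street",
}
print(minimise_submission(submission))
# {'email': 'ann@example.com'}
```

Once a list like this exists, it also doubles as a prompt to remove the unused inputs from the form itself.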

This approach should also apply to data storage; you should only retain data that is pertinent to servicing the customer. Under the ‘data minimisation’ principle in Article 5(1)(c) of the General Data Protection Regulation (GDPR), personal data should be adequate, relevant, and limited to what is necessary for the purpose it is processed for. More information on this can be found on the Information Commissioner’s Office (ICO) website.

If you filter and store the data correctly, it should already be in a satisfactory format. If not, then data cleansing should start further up the funnel to ensure all data is correctly gathered and retained.

Once you have this process implemented, the next stage of data cleansing is access segmentation or “need to know” reporting.


The “need to know” reporting method

Depending on what type of business you’re in, certain pockets of data may not be required by all areas of the business in order to get their jobs done. For example, a paid media marketer will find a list of products sold, number of sales, and revenue useful for making decisions, but access to personal details such as name and address isn’t necessary.

This is again covered by GDPR. While the example above might seem extreme, it’s important that data is properly stored and sorted in order for you to remain compliant. The ‘integrity and confidentiality’ principle in Article 5(1)(f) states that data should be processed in a manner that ensures appropriate security, including protection against unauthorised or unlawful access and against accidental loss, destruction, or damage.

Trying to police data access in a company where everyone can access everything would be near impossible and potentially damaging.
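One way to picture “need to know” reporting is as a per-role list of permitted fields, applied before any report is produced. The sketch below is illustrative only: the role names, the field names, and the `report_for` helper are assumptions made for this example, and a production system would normally enforce this in the database or reporting tool rather than in application code.

```python
# Hypothetical mapping of roles to the fields they need to do their jobs.
ROLE_FIELDS = {
    "paid_media": {"product", "quantity", "revenue"},
    "customer_service": {"name", "address", "product"},
}


def report_for(role, orders):
    """Return the orders with only the fields this role is allowed to see."""
    allowed = ROLE_FIELDS.get(role, set())  # unknown roles see no fields
    return [
        {field: value for field, value in order.items() if field in allowed}
        for order in orders
    ]


orders = [
    {
        "name": "Ann Smith",
        "address": "1 High Street",
        "product": "Kettle",
        "quantity": 2,
        "revenue": 59.98,
    }
]

print(report_for("paid_media", orders))
# [{'product': 'Kettle', 'quantity': 2, 'revenue': 59.98}]
```

The paid media marketer gets the sales figures they need, while names and addresses never reach the report.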


The final word

An unavoidable consequence of holding data is that it makes businesses a target for data breaches. Many types of attack can cause a breach, including phishing, hacking, and even internal threats.

32% of UK businesses have experienced a cyber breach or attack in the last 12 months, so it’s clear this is a widespread problem. Data cleansing is just one part of the data security model, but it is a strong starting point for protecting sensitive information.

For more information, the ICO website has a useful checklist on data security that can help you quickly assess your policy.
