The EU GDPR: How to Know What You Don’t Know

Here’s a little challenge for you: can you list how many departments there are within your business?  How about the number of teams that sit within each department?  If that seems too easy, then how about listing the number of databases held by each team?  And if you really want a stretch, how about taking a guess at the number of data points your business holds on individuals?

It’s likely that everybody would know (or, in the case of a large corporate, could find out) the answers to the first two.  The last two can be almost impossible to discover manually.

Some would argue that it’s easy to count the databases within a business, but in the course of our work we have found that many organisations hold terabytes of unknown data – something we reflect on in our whitepaper “GDPR – Why It’s About More Than Legislation”.

For this blog post, we’re going to focus on just one element – that of unknown data.

The Data That You Know About

Let’s say an organisation has a team for each of the following functions: HR, Finance, Marketing, Sales, Operations and Customer Service.  Each of these teams is likely to have its own master data source.  The set-up could be as straightforward as an SAP ERP system, with each team having a discrete line-of-business app or database, plus company-wide infrastructure for email and collaboration software.  In this scenario, every interaction leaves a digital marker, and so every piece of data and its movement can be tracked.

If your organisation only holds data that it knows about, then you should be fine when an individual asks you, under the GDPR, to disclose or delete the information you hold on them.  Except that you’ve probably got the following:

Data That You Don’t Know About

What the above example doesn’t include are the data repositories that many organisations have but either don’t think about or don’t know exist.  These include, but are not limited to:

  • Decommissioned servers that are still holding data
  • Duplicated databases from campaign activity / mergers / roll-outs of new software
  • Data that has been wilfully misused
  • Data shared with a third party as part of a service-delivery contract
  • Emailed data that has been shared innocently or to avoid corporate process
  • Development servers that are not considered as part of the company’s live data estate

All of the above introduce risk and cost to an organisation.  The risk is that confidential information could be leaked, lost, or accessed by unauthorised persons.  The costs come from data breaches that result in litigation, plus the remediation needed to fix the weakness in the network or governance process.

Pinning Down Unknown Data

Whilst you may have unknown data, it won’t take teams of consultants or outrageous cost to locate it within your organisation and neutralise the risk it poses.  At Exonar, we’ve developed a platform that uses Big Data and Machine Learning to track down, identify and classify data, wherever it might be hiding.  We have helped clients to find and retrieve data containing passwords, personally identifiable data points and company-sensitive information.  We’ve also helped them to find terabytes of duplicated information.  As part of this process, they’ve reduced cost and avoided risk.  What is perhaps more important to them as organisations is that they have flushed out what was previously ‘unknown’.
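To make the idea of tracking down duplicated and personal data a little more concrete, here is a minimal sketch in Python.  It is purely illustrative and is not how the Exonar platform works: the function name scan_directory and the simple email pattern are our own assumptions for the example.  It walks a folder, groups files with identical content by hash, and flags any file containing something that looks like an email address.

```python
import hashlib
import re
from collections import defaultdict
from pathlib import Path

# Deliberately simple pattern for email-like strings (an assumption for
# illustration; real personal-data detection needs far more than this).
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def scan_directory(root):
    """Return (duplicate file groups, files containing email-like strings)."""
    files_by_hash = defaultdict(list)
    files_with_emails = []

    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        # Reads each file fully into memory, which is fine for a sketch
        # but not for very large files.
        content = path.read_bytes()

        # Files with identical content share the same SHA-256 digest.
        files_by_hash[hashlib.sha256(content).hexdigest()].append(path)

        # Decode leniently so that binary files do not raise errors.
        text = content.decode("utf-8", errors="ignore")
        if EMAIL_PATTERN.search(text):
            files_with_emails.append(path)

    duplicates = [paths for paths in files_by_hash.values() if len(paths) > 1]
    return duplicates, files_with_emails
```

Pointing a function like this at a forgotten file share or a development server would give a first, rough picture of how much duplicated and potentially personal data is sitting there.  A real data estate would also need database, email and third-party sources covered, which a short script like this does not attempt.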

Better Business as Usual

Organisations that have a firm handle on all of their data assets not only have a more stable platform for managing the customer experience, they also have greater knowledge of their overall business.  At a time when businesses are awash with data, the ability to identify it and make it meaningful has an impact that reaches far beyond GDPR compliance, but compliance is a good place to start.

Exonar are experts in helping businesses to uncover unknown data, reducing risk and cost.  To find out how we can help you, get in touch.