May 14th 2020 – By Anna Cotton

The power of data discovery

Originally published in Global Data Review.

Exonar started life as a forensic data discovery company, helping organisations which had suffered a data breach track their information. It’s now developed those capabilities to help clients extract the value in their data. Chief Executive Danny Reeves spoke with GDR about the importance of understanding data in order to extract its value.

I’d like to start with data as an asset, a term that is used more and more. What are the kinds of practical, concrete changes that need to be made for an organisation to start seeing data as an asset rather than a liability?

The best way for me to answer that is to describe the fundamentals of data discovery, because the vast majority of organisations don’t actually know what data they’ve got. A lot of organisations are applying data policy governance, and some of them are pretty good at it, and this is enabling them to build policies, to educate, manage people and processes. But the horse has already bolted because they’ve got this back-end of massive structured and unstructured data, and they usually don’t really know much detail about what is actually there.

Organisations often employ various resources and consultants to go and give them a report on what data they’ve got. But then, as you can imagine, that data is out of date the minute the report is produced. And there are very few systems that I am aware of on the market today which make a good job of continual data asset monitoring. And what I mean by that is, first of all, understanding what data you’ve got and then monitoring what you do with it and making sure that it continues to report and monitor what’s going on in the meantime – and therefore how effective your policy and governance processes are. For me that’s fundamental.

In my experience and in the experience of people we talk to, when they’re completely frank and honest about what they’ve got, I haven’t yet found one person that can honestly say they know what they’ve got in detail, at depth. And if you think about that even further on, you can have the best business intelligence and analytics systems and processes in the world, but if you still aren’t feeding in enough of the detailed data, then they’re only going to be giving you output which is based on the data they’ve got.

Do you have an example?

I spoke to an organisation that manages loyalty schemes. This organisation manages some of the largest loyalty schemes of the largest retailers on earth, so they’re absolutely a data business and they have a whole bunch of data scientists who create outputs to inform their customers of sentiment and behaviours and all that sort of stuff. And I said, “That’s great, so you obviously understand all of your data at depth and at scale.” And the answer I got was: “Yes, we do.” And it was followed by “However…”, and what they said was that a lot of their data goes to the data science department, and there’s a whole bunch of things the data scientists do with it, and they don’t necessarily have clear visibility or tracking of all of that data. So it’s almost a dataset within a dataset. So even organisations that fundamentally survive on the basis of using data as an asset still admit that there are whole areas of their data estate where they don’t have a good understanding, a clear understanding of what’s in there. That goes right back to my first point, which is that organisations don’t really know what they’ve got or how to find it, how to identify what’s in it and how to use that easily.

How do the GDPR principles of data minimisation and purpose limitation interplay with data discovery? If you’ve helped a client unlock their data, should there be someone in the organisation urging caution about how they use it?

That links into the data policy, governance and education that I mentioned earlier. We’ve seen the importance of DPOs [data protection officers] and people like that, because of their ability to understand and apply policy to what organisations do. What we’re now seeing is that work moving beyond policy and box-ticking.

There’s also the concept of “zero trust”. It’s a fairly emotive term, but you can look at it in a number of ways. There’s the obvious negative sense of zero trust, which is “don’t trust anybody, expect everybody to be nefarious or destructive or have bad intent”. Another way to look at it is to not trust that your policies and education are going to give you an absolute response, an absolute capability. People have got their own jobs to do, they’ve got their own area in the business which is not about data governance, it’s about creating revenue or driving the business forward or whatever. So where we see the debate rise up is really about applying that zero trust concept, which says that all your data discovery needs to be continuous. Once you’ve applied a bunch of changes like moving data or deleting duplicate data, you still have to monitor what’s going on, on a day-to-day basis. I guarantee that if you remove all of your duplicated data, then by the following day, if you’re a big organisation, a bunch of that duplicated data is all over the place again – and that’s because you can’t trust that your policies and DPOs have absolutely educated the workforce to the point where they’re not going to make any mistakes. It’s just impossible. It’s never going to happen.

Now we’ve seen the data issues rise up to the chief information officers and chief technology officers who are saying: “Okay, well how can we apply the tools which monitor this, as easily as possible, so we can record that against our policies?” That’s the thing that organisations and enterprises are becoming more focused on, because they realise this zero trust thing isn’t just about protecting yourself from the bad people, the ones who did bad things; it’s actually about providing your people with the support they need so they can focus on the business – and not trusting them to be good policy followers or good governance followers, because they’ve inevitably got more important things to worry about.

Do you find that organisations are able to put a dollar value on their data? Some observers have said this is possible by evaluating the risk or reward associated with the data.

Frankly speaking, we have not yet found an organisation that is confident about putting an asset value on its data. And that’s back to the earlier point; we have this belief that all organisations should manage their data to protect and empower the people they serve, and it’s that empowerment piece which is starting to get traction. Organisations are moving beyond the pure data protection initial box-ticking phase. They’re now moving to: “How do we help our people to apply these policies without thinking about it?” That’s one step.

The next step is more organisations now saying: “We’ve got visibility of all that data, what is the value of it?” We have examples of companies we’re working with who have asked us to develop tools which give them a simple view of ‘X’, for example if they have many, many datasets across many different acquisitions. That’s moving into the M&A space, and having the ability to explore all of those datasets and look for patterns, whether those patterns are natural language patterns, whether they are names and addresses, whatever it might be. We’re developing tools to help people to see that single view.

That means they can look across a business that’s got, for example, eight acquisitions and therefore eight different systems running. They can look across that and say: “What patterns are matching within that data? And therefore, what can we learn from that which potentially gives us strategic value across those acquisitions, and lets us start to develop some asset value from the multiple datasets across multiple acquisitions?”

That means they’ve started to look for opportunities to realise asset value, but we do ask the question, “what value is that to you?” And they can usually give a worded articulation of what the asset value is. They’re seeing asset value in their ability to be much more effective at working with their supply chain or their customer base, much more efficient around both the technologies and also what they learn from that data. But to answer your question specifically, I haven’t yet found an organisation that can say: “This is the number value I can put on my data.”

There are some organisations like, for example, [UK car breakdown service] the AA. They were one of the early organisations to demonstrate data asset value; I don’t know what the exact numbers are, but a lot of their revenue was being realised on an annual basis from the data they were selling back into the motor manufacturers. They were one of the earliest to realise data is an asset, and actually the message from them is they sold the information back to the automotive industry and became revenue positive; the majority of their revenue came from that activity rather than going out fixing cars. So, there are good solid examples of where we see organisations realising the asset value, but I think across other industries and other sectors, we’re still going through a lot of learning about how that data can be realised as an asset. I think the bottom line is that virtually all the organisations we’ve talked to appreciate that data can be an asset. Very few of them are able to put a dollar sign on it right now, but most of them are thinking about how they get to it.

What exactly does Exonar do for clients in this field? And what type of people do you work with in businesses – is it data scientists, data governance professionals, legal teams, chief data officers, or some combination of all of them?

The company was born through discovering what data had been lost in a breach – this is many, many years ago – but an organisation that had a data breach couldn’t work out what data they had lost, and therefore had no idea of the risk it posed. The founders of the business developed some software that helped organisations understand what data had been lost, and they were able to come up with very clear evidence on what data had gone missing, and therefore what was in that data and therefore what the risks associated with that were. That’s how Exonar was born.

What that developed into is the core of what we do; we’ve been described by a customer as turbocharged Google for the enterprise. What was meant by that is if you had the internet and you didn’t have Google, or you didn’t have a search engine, how would you ever know what was there, what to look for, how to search for it, how to get to it? You wouldn’t – it’s just too vast. We believe that that’s exactly where most organisations are in terms of all their datasets. It’s not the whole of what we do but it is the core of it. We go in, we discover the data, we then index it right down into the body of text: that means right down into the words that are within whatever that dataset is, and then we build an index of all of that. Then if you ever want to search for something in that data, you don’t have to go out and search the data; you search the index. It’s very, very fast and it indexes at scale.
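To make the "search the index, not the data" idea concrete, here is a minimal sketch of an inverted index in Python. It is a generic illustration of the technique, not Exonar's implementation; the file paths, sample text and the function names `tokenize`, `build_index` and `search` are all hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_index(documents):
    """Map each word to the set of document ids whose text contains it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every word in the query.

    Only the index is consulted; the original documents are never re-read.
    """
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical sample data standing in for an organisation's files.
documents = {
    "hr/contracts.docx": "Employment contract for Jane Smith, home address on file",
    "crm/export.csv": "Customer name and customer address list for loyalty scheme",
    "finance/q1.txt": "Quarterly revenue summary",
}

index = build_index(documents)
matches = search(index, "customer address")
```

The index is built once, up front; each later search is a handful of set lookups and intersections, which is why querying stays fast as the number of documents grows.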

So, the two things that we do at core are: we are able to index a billion items, for example, within a dataset, and we index it down to the detail of the body of text. Once you have that index, you can hand it to the data scientists and say: “Fill your boots, what is it you’re looking for? What is it you’re trying to achieve? What pattern matching do you want to apply to this?” The fundamental requirement of any good data policy, governance or assessment, which we fulfil, is that indexing of what data you’ve got, where it is, and what’s in it. So, it is about protecting an asset, but you can also say to the index: “Find me every file that’s got 100 names and addresses in it”, and it’ll come back and give you all the items that have 100 names and addresses in them. You will instantly know that’s likely to be sensitive data.
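The "100 names and addresses" query can be sketched as counting pattern matches per file and flagging any file that reaches a threshold. The example below is an assumption-heavy illustration: the regular expression `RECORD` is a deliberately crude stand-in for real name-and-address detection, and the file names, sample contents and the functions `count_records` and `flag_sensitive` are hypothetical.

```python
import re

# Crude, hypothetical pattern: "Firstname Lastname, <number> <street...>" on one line.
# Real personal-data detection would need far more robust matching.
RECORD = re.compile(r"^[A-Z][a-z]+ [A-Z][a-z]+,[ \t]*\d+ .+$", re.MULTILINE)


def count_records(text):
    """Count lines that look like a name followed by a street address."""
    return len(RECORD.findall(text))


def flag_sensitive(files, threshold=100):
    """Return the paths of files with at least `threshold` matching records."""
    return sorted(
        path for path, text in files.items() if count_records(text) >= threshold
    )


# Hypothetical sample files.
files = {
    "crm/mailing_list.csv": "Jane Smith, 12 High Street\n" * 120,
    "notes/meeting.txt": "Agenda\nJohn Brown, 4 Mill Lane\nAny other business\n",
    "finance/q1.txt": "Quarterly revenue summary\n",
}

flagged = flag_sensitive(files)
```

In this sketch only the mailing list crosses the threshold, so it is the one file surfaced for review as likely sensitive data.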

In terms of who we work with, as you’d imagine, the range is right across the organisation. We work with the data protection officers and we work with the people that are tasked with policy, procedure and governance – that’s because it’s their job to design and apply those policies and then look at ways to implement them, and obviously our technology can help there. We also work with a lot of the technology and the operations people, but more and more the engagement is now right from the chief technology officer through to the chief information officer and the chief executive.

At the c-suite level, they’re trying to think beyond the operational day-to-day, the policies and governance. They’re thinking more about what is the purpose of their business, what is their value to the customers and to the market and how can they best serve that. They’re the ones that are now saying: “Okay, well now we’ve got all this data we can turn it into intelligence; now we’ve got this intelligence, how can we apply that to ensure that it has asset value, but we’re also serving our purpose to the people we serve more effectively?” And we’re seeing more and more of that level of people within an organisation wanting to understand what we do because they can think about it from a more strategic standpoint rather than from a governance and policy standpoint.

Do you meet chief data officers?

We do, yes. We’re actually going through a process right now of trying to be clear about what the key points for each of these personas are. Each of these roles will have a slightly different take on what the value of the data asset is and therefore what they want to be thinking about and what they want to be doing with it. The point for us as an organisation is – because we’re seeing more and more people with those sorts of roles and titles in business start to ask questions about what they can do with data discovery – that’s something that’s evolving all the time. I mean, even the current climate is forcing people to re-evaluate again what value they can get from data and how that can help deal with the situation and how they can come out stronger.