It’s a paradox: when companies use outside data to power their products and decisions, they are not actually basing them on data, but on conclusions that others have drawn from that data. Let’s open up the black box of location data.
Originally published at LSA Insider
Why does the black box of data exist at all? Probably for two reasons. First, data is a scarce and valuable resource in the internet economy and is therefore safeguarded (for example, the House of Facebook and the House of Google are both built on a strong walled-garden data foundation). Second, data is often of varying quality, and it is therefore beneficial for the provider of the data to mask that. The paradox, of course, is that clients then have a hard time using the data to its full potential. This is true for data in general and for location data specifically.
At Unacast we call this the black box of location data.
Unacast is a location data company that has deliberately placed (pun intended) itself squarely in the middle of the real-world chaos of signals. The reasoning is tactical and Darwinian: to build an engine and platform that can evolve with the real world (we call this engine the Real World Graph®), we cannot as a company tie ourselves to one data type, one industry or one use case. Rather, we have to be able to deal with them all, because no picture of the real world will ever be 100% accurate at any given time, or understandable from a handful of sources or technologies.
Although Unacast is headquartered in NYC, our origins are in Norway, and all our technical development is in Oslo. All the Scandinavian countries are fundamentally open, and Norway especially so, to the extent that everyone can see everyone else’s salary. Most people, especially in the US, see that as completely bonkers, but it does lay the foundation for a trust-based and open society.
For this reason, and perhaps not surprisingly, one of the main company values at Unacast is “Trust through transparency”. Internally we openly share everything from meeting notes, podcasts and video recordings of management and product meetings to all financials on a monthly basis. Twice a year we also gather everyone for a week of strategy discussions and realignment, and we even take the whole company on vacations, to work. By empowering people, we believe we will build better products and have more fun. So it was only natural to infuse our transparency value into our products too.
The typical output from a location vendor is the visit conclusion, and this is the paradox I mentioned above. Specifically, this means some kind of ID, a venue name (and perhaps an address) and a timestamp. While users of location data are still immature, this is acceptable in most cases, but that is quickly changing. With maturity, as in all hype cycles, comes an understanding that the data is not a magic wand and can’t do everything. Only by understanding the data itself, and not just the conclusion, can one build better products and make smarter decisions.
At Unacast we think differently. We have built our data products from the start on the principle of trust and transparency, and we therefore include the underlying data traits for all rows of data. We are adding more every month, based on client feedback and our own growing internal learnings. One of these transparency fields is the number and names of nearby venues, and I’ll demonstrate the significance of that later in the text.
This lets the advanced client understand both the underlying skews and limitations in the data sets and their strengths. For less advanced clients, we will compile all of this into a Unascore, giving them a notion of how much to trust the accuracy of the data and a better understanding of which data supports which use cases. As an example, running a McDonald’s attribution campaign is vastly different from running an NYC city-wide retargeting campaign.
As mentioned above, one good example of our focus on transparency, out of many, is how we show nearby venues with their distances. Our clients can see our conclusion on the likely visited venue, but also the nearby venues that could have been visited and that our machine learning algorithms did not conclude were the real visit. This gives the client the opportunity to use only the conclusion, or to zoom out and include more of the variables in their decision-making, which supports use cases focused on accuracy as well as use cases focused on scale.
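To make the accuracy-versus-scale choice concrete, here is a minimal Python sketch of how a client might work with such a row. The field names (`device_id`, `concluded_venue`, `nearby_venues`, `distance_m`), the helper `candidate_venues` and the sample values are all hypothetical illustrations, not Unacast's actual schema or API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class NearbyVenue:
    name: str
    distance_m: float  # hypothetical field: distance from the signal, in meters


@dataclass
class VisitRow:
    device_id: str
    timestamp: str
    concluded_venue: str  # the vendor's visit conclusion
    nearby_venues: List[NearbyVenue] = field(default_factory=list)


def candidate_venues(row: VisitRow, max_distance_m: Optional[float] = None) -> List[str]:
    """Return the concluded venue, optionally followed by nearby venues.

    With max_distance_m=None only the conclusion is used (accuracy).
    With a radius, nearby venues within that distance are added (scale).
    """
    names = [row.concluded_venue]
    if max_distance_m is None:
        return names
    for venue in sorted(row.nearby_venues, key=lambda v: v.distance_m):
        if venue.distance_m <= max_distance_m and venue.name not in names:
            names.append(venue.name)
    return names


# Invented sample row, for illustration only.
row = VisitRow(
    device_id="device-123",
    timestamp="2018-05-01T12:30:00Z",
    concluded_venue="Coffee Shop",
    nearby_venues=[NearbyVenue("Pharmacy", 40.0), NearbyVenue("Bookstore", 15.0)],
)

accuracy_first = candidate_venues(row)    # conclusion only
scale_first = candidate_venues(row, 50.0)  # conclusion plus everything within 50 m
```

An attribution use case would likely stick to `accuracy_first`, while a broad retargeting use case could accept the wider `scale_first` list, knowing exactly how much uncertainty it is buying.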
Why is showing the nearby venues significant? Well, most other location data would attribute the visit to ALL of these nearby locations, hidden behind the black box, thereby inflating the data set sold without providing the underlying knowledge of how and why.
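The inflation effect is easy to see with a few lines of Python. This is a toy sketch on invented rows, assuming a hypothetical layout where each row carries one concluded venue and a list of nearby venue names; it is not a description of how any particular vendor counts visits.

```python
from collections import Counter

# Invented sample rows: one concluded venue plus the venues near the signal.
rows = [
    {"concluded": "Coffee Shop", "nearby": ["Bookstore", "Pharmacy"]},
    {"concluded": "Bookstore", "nearby": ["Coffee Shop"]},
    {"concluded": "Coffee Shop", "nearby": []},
]


def count_concluded(rows):
    """One visit per row, attributed to the concluded venue only."""
    return Counter(r["concluded"] for r in rows)


def count_all_nearby(rows):
    """Black-box style: every venue near the signal gets credited with a visit."""
    counts = Counter()
    for r in rows:
        # A set ensures each venue is credited at most once per row.
        counts.update({r["concluded"], *r["nearby"]})
    return counts


transparent_total = sum(count_concluded(rows).values())
inflated_total = sum(count_all_nearby(rows).values())
print(transparent_total, inflated_total)
```

On these three sample rows the second approach reports twice as many visits as there were signals, and the buyer has no way to tell which of them were conclusions and which were merely nearby.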
Inflating the data set this way wastes client dollars and, most importantly, hinders successful campaigns. Both are, of course, immensely negative for the location industry as a whole in the long run.
We don’t believe that’s the best way. We believe in trust through transparency.