Data Enrichment for E-Commerce webinar

950 ways to sell more with Machine Labs and data enrichment.

Learn what it is and tips on how to use it with our CEO, Andrew Veitch.

You can download the slides used in the webinar here:

Data Enrichment Webinar

Webinar Transcript

Hello, I’m Andrew Veitch, founder of Machine Labs. Welcome to our first ever webinar. This is on the subject of data enrichment. Let me just quickly share my screen. This will all be much more professional on our second webinar.

Data enrichment.

So what we’ll be covering: what is data enrichment, what you can use to do your data enrichment, and how it helps your business.

So let me start by telling you a short story. I’ve got a friend who runs a beer and wine shop called the Cork and Cask in Marchmont in Edinburgh. He knows all of his customers individually; he knows a little bit about their stories and the sort of things that they might like. The challenge we have in E-commerce or direct to consumer is that we don’t have that same level of understanding of our customers. Sometimes we just have a name and an address in a database, maybe even just an email address. Sometimes, if we’re lucky, we have some purchase history, but we just don’t really know who the customer is. And that’s where data enrichment comes in. Data enrichment takes that name and address and gives you a really rich, detailed picture of who the customer actually is, so that we can be just like my friend who runs his bricks and mortar store.

So how does it work? Well, we start by trying to match the customer in your Shopify or Magento, or whatever it is, shop database with other data sources. Most of the time we do this matching by postal address. It can sometimes be done by email address, but postal address is much better. We make two attempts. The first attempt is at the actual household, so that’s a postcode and a street number, and we can usually get that right more than 90% of the time. And that’s incredibly accurate, because you’ve got exactly the person. There will be times when you can’t do that; quite often that’s when someone has just very recently moved house and the databases haven’t caught up.

If we can’t do that, we can then move to a postcode-level match, which is roughly half as accurate, because there are roughly 15 households for every postcode in the UK. I should say as well, just on that subject, I’m going to be concentrating on the UK in this talk. We do also offer data enrichment for pretty much every developed country in the world. Some of the details will vary slightly, because we use slightly different data sources, but the broad concepts are exactly the same. Certainly the US, Canada and Australia all work in much the same way.

And then after we’ve actually got this matched data, we add it to your customer record. Sometimes this is called data appending. So data appending and data enrichment are exactly the same thing. I personally prefer the term enrichment, but if you come across appending, it’s just the same.
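As an illustration of the two-attempt matching and appending described above, here is a minimal Python sketch. It is not Machine Labs' implementation: the lookup tables, the field names (`postcode`, `house_number`, `match_level`) and the sample values are all invented for the example.

```python
# Hypothetical reference data. A real provider would hold millions of rows.
HOUSEHOLD_DATA = {
    ("EH9 1AB", "12"): {"age_group": "65+", "income_band": "low"},
}
POSTCODE_DATA = {
    "EH9 1AB": {"age_group": "45-64", "income_band": "medium"},
}


def normalise_postcode(postcode: str) -> str:
    """Upper-case and re-space a UK postcode so 'eh91ab' becomes 'EH9 1AB'.

    The inward part of a UK postcode is always the last three characters.
    """
    compact = "".join(postcode.upper().split())
    return f"{compact[:-3]} {compact[-3:]}"


def enrich(customer: dict) -> dict:
    """Try a household match first, then fall back to a postcode match."""
    postcode = normalise_postcode(customer["postcode"])
    house_number = customer["house_number"].strip()
    enriched = dict(customer)  # leave the original record untouched

    household = HOUSEHOLD_DATA.get((postcode, house_number))
    if household is not None:
        enriched.update(household)  # append the matched data
        enriched["match_level"] = "household"
    elif postcode in POSTCODE_DATA:
        enriched.update(POSTCODE_DATA[postcode])
        enriched["match_level"] = "postcode"
    else:
        enriched["match_level"] = "none"
    return enriched


exact = enrich({"name": "A", "postcode": "eh9 1ab", "house_number": "12"})
fallback = enrich({"name": "B", "postcode": "EH91AB", "house_number": "7"})
```

Recording the match level alongside the appended fields lets later segmentation treat household matches as more reliable than postcode matches.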

So what I’ve done here is I’ve taken a screenshot of the contact page. This was a real live customer, so I’ve cropped the screenshot carefully to show just the demographics bit, because obviously I didn’t want to publish the order history, name, address and picture of the house of an actual real client. But we can see here, this is obviously an older person living in a council house. We have five pictures that we put up for every contact in your database. There’s a sort of sketch on the left of what they might look like, a photo of roughly what we think they might look like, and a photo of their sort of house. Obviously, if you want to see an actual real photo of their house, that would be available if we just scrolled up a little bit, because we would then use Google Street View to show a picture of the actual house. Then we have a picture of the sort of thing that they like to be doing, and on the right we have a picture of what their car might look like, or a picture of a bus stop or a train or something if we think that they mainly use public transport. Then we have the age group, income and household technology level. An additional one on here is the best hour of the day for contacting them by email. We would also have their gender; that was just up at the top.

I should also say, just stepping back on this, that data enrichment is obviously not a 100% accurate science. We’re not saying this is necessarily absolutely the case; we’re saying that these numbers are probably somewhere in the region of 80% accurate. For database segmentation, 80% accuracy is pretty good, because if you think about it, 80% accuracy is certainly going to be enough to dramatically improve your conversion rates.

So where does the data come from that we use to enrich your customer database? Again, this will vary slightly by country, and I’m not going to go through every source, but let’s kick off.

Jörg Michael, and I’m afraid I’m not entirely sure I’m pronouncing the Jörg bit right, Jörg, Jörg? Anyway, he puts together a huge database. Originally he called it the sex machine, somewhat unfortunately, although he’s now changed it to a slightly more politically correct name, I’m pleased to say. This maps 40,000 first names to likely gender. Again, you can’t honestly be 100% sure, but a study by some academics who looked at it in the UK came to 92% accuracy on their test samples, so it really is pretty good. It also takes into account the country people are from. So Lillian in the UK would definitely always be a female name, but Lilian, and I hope you’ll like my French translation there, would be male in France a lot of the time. The database is pretty big as well: it covers all of Europe, China and India, and it also covers Korean, which I forgot to put on that list.
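A country-aware name lookup of the kind described above can be sketched in a few lines of Python. The table below is a tiny invented sample and does not reflect the format of Jörg Michael's database; the function name `likely_gender` and the country codes are assumptions for illustration.

```python
# Hypothetical sample: first name -> country code -> most likely gender.
NAME_GENDER = {
    "lilian": {"GB": "female", "FR": "male"},
    "andrew": {"GB": "male"},
}


def likely_gender(first_name: str, country: str) -> str:
    """Return the most likely gender for a first name in a given country.

    Returns 'unknown' when the name or the country is not in the table,
    rather than guessing.
    """
    by_country = NAME_GENDER.get(first_name.strip().lower())
    if by_country is None:
        return "unknown"
    return by_country.get(country.strip().upper(), "unknown")


uk_result = likely_gender("Lilian", "GB")
french_result = likely_gender("Lilian", "fr")
```

Keying on both name and country is what lets the same spelling resolve differently in the UK and in France.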

So that gives us gender. Just before moving off that, I’d say that if you are a Machine Labs customer, you will get the gender data enrichment free, because that is free to us. Everything from this point on we have to pay for, so unfortunately, I’m afraid you have to pay for it too. So there is an extra charge for all of the enrichment from this point on.

Experian ConsumerView has 49 million adults in the UK, which frankly is virtually all of them. You probably do know Experian; they’re famous for doing credit rating, so if you’ve been declined that credit card, it may well have been Experian that was behind that. They give us Mosaic groups and types, which are becoming a bit of a currency. They’re just a way of grouping people together and are handy for seeing, roughly, who your customers are. And you can also advertise specifically to these groups and types. ConsumerView itself has 500 demographic variables, and you’ll be pleased to know I’m not going to go through all 500. But age, income, number of adults, age of the adults and children in the household, the type of house it is, whether there’s a garden, and whether it’s rural or urban are some of the key ones.

Next up, we have Kantar TGI, which is a giant research project of 85,000 consumers over 35 countries. Now, at this point, we’re moving away from data about an actual customer to more panel-based research. How panel-based research works is you do a survey of a portion of the population and then you extrapolate that out. So what we’re basically doing here is we look at your database and ask, for each individual customer, which Kantar TGI respondents this customer looks like, and then we infer the answers from them. So again, it’s definitely not totally accurate, but it’s massively more accurate than chance and it’ll be right much more often than not. From Kantar TGI we can work out what supermarket they shop in, membership of a very wide range of organisations, the number of books they’ve bought, the number of holidays they’ve gone on, whether they’re willing to pay more for green products, the newspapers they read, although that’s probably more online now, and how internet savvy they are.
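The talk does not say how the "which respondents does this customer look like" step is done. One common way to do that kind of inference is a nearest-neighbour vote, sketched below in Python as an illustration only. The panel rows, the two features (age and income) and the scaling are invented, and this should not be read as Kantar TGI's or Machine Labs' method.

```python
from collections import Counter

# Hypothetical panel: known demographics plus one survey answer each.
PANEL = [
    {"age": 72, "income": 18, "supermarket": "Shop A"},
    {"age": 68, "income": 20, "supermarket": "Shop A"},
    {"age": 75, "income": 16, "supermarket": "Shop B"},
    {"age": 24, "income": 30, "supermarket": "Shop C"},
    {"age": 27, "income": 34, "supermarket": "Shop C"},
    {"age": 31, "income": 45, "supermarket": "Shop D"},
]


def infer(customer: dict, attribute: str, k: int = 3):
    """Infer an attribute from the k panel respondents most like the customer.

    Returns the most common answer among those respondents and the share
    of them who gave it, as a rough confidence figure.
    """
    def distance(respondent: dict) -> float:
        # Age in decades and income in units of 10k, so both count similarly.
        return (abs(respondent["age"] - customer["age"]) / 10
                + abs(respondent["income"] - customer["income"]) / 10)

    nearest = sorted(PANEL, key=distance)[:k]
    votes = Counter(respondent[attribute] for respondent in nearest)
    answer, count = votes.most_common(1)[0]
    return answer, count / len(nearest)


older = infer({"age": 70, "income": 19}, "supermarket")
younger = infer({"age": 25, "income": 32}, "supermarket")
```

The returned share makes the "right more often than not" caveat explicit: a two-out-of-three vote is a likelihood, not a fact about the customer.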

Another big research organisation is YouGov. This works slightly differently: rather than interviews, this is done by an online survey, which, as it’s online, is probably less good than an interview. But on the other hand, it allows much bigger reach. So you’ve got a 70 million panel from 50 countries, which is absolutely enormous. This gives you a huge range of interests: beauty, books, cars. We have sex and relationships, which probably leads to weddings, and family size and divorce are also both demographic variables that we can give you a likelihood of, as are owning different types of pets and what sort of smart speaker they have.

So if we go through these and the previous data points, you’re now at about 950 demographic variables, which likely covers whatever it is that you’re interested in. What I will say, though, is that if you know 950 things about someone, in practice you really know 951. If you have a variable that isn’t in our data set of 950, your variable will almost certainly be correlated with some of these 950. So in other words, whatever it is that you’re looking for, chances are we can work with other things that are related to it.

The benefits. Okay, so we’ve got all this data in the database and you can look at a customer; how does it actually help? The first and most obvious one is different messages for different demographics. The main marketing job I spent most of my time in was at Diet Chef. At Diet Chef, I would maybe have a customer who was a 70 year old man who had been told to diet by his doctor, or I would maybe have a 21 year old woman who was dieting in order to look good in her bikini on her holiday. Now, would I want to communicate using the same images and the same messages with the 70 year old man and the 21 year old woman? I think the answer is no. So again, if you look here at two of the Mosaic types I’ve put up, E20, Classic Grandparents, mainly over 70s, and J41, which would be about 18 to 25 in urban settings, clearly you would not want to message them using the same types of language.

Next up, product recommendation, because it’s not just how you message people, it’s also what you sell them. We use a deep learning AI. So if you’re using the Machine Labs smart products feature, you drag the product into the email and it gets replaced with relevant products for that person. Competing products will just use purchase patterns and maybe a bit of behaviour. What we’ll also feed into the AI is gender and the location within the country; even here in the UK, purchase patterns in Glasgow and purchase patterns in London are very, very different. Then there’s their behaviour, the 500 variables from Experian ConsumerView, the market research from Kantar TGI and YouGov, and census data. I didn’t actually mention census data, but that is another one of the data sources. And there are a few others as well, but I didn’t want to bore you rigid going through absolutely every external source. Certainly, if you put in 950 variables rather than the three or four variables that the competitors do, I can assure you 950 gives you a much better product recommendation.

The next thing is it helps you understand your database. There are three things that we do on our report there. First off, we tell you who your best customers are. Once you know the demographics of your best customers, it’s quite easy to then go out and get more of them. Just to be clear, by best I mean customers with the highest lifetime value, i.e. the customers who spend the most with you. In a recent Joy of Marketing podcast, I spoke to the director of Sky AdSmart, and you can literally give him one of the Mosaic demographic codes and he’ll advertise to them. Media buyers will find you offline media, or on Facebook or Google you could choose demographics that are similar to your best customers. Conversely, you can look at the least valuable customers and cut back advertising in that area. We’ll also give you a report of the customers recruited over the last 30 days. What you’re wanting to do here is hopefully see that it’s your best customers that have been recruited and not your least valuable ones. This is also very important because it’s fairly easy to get focused on CAC, customer acquisition cost, but sometimes paying more to recruit a good customer is a much better thing to do. And this is just a quick look at the demographics report, which is coming up real soon. You can see on the top line what your database actually looks like, divided by group, then who your best customers are, then your lowest value customers, which is essentially just that report in the other order, and then who you’re recruiting at the moment.
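The two reports described above, best customers by demographic group and recruits over the last 30 days, can be approximated with a short Python sketch. The record fields (`group`, `lifetime_value`, `first_order`), the single-letter group labels and the sample figures are all invented for illustration and are not the layout of the Machine Labs report.

```python
from collections import Counter, defaultdict
from datetime import date, timedelta


def value_by_group(customers):
    """Return (group, average lifetime value) pairs, most valuable first."""
    totals = defaultdict(float)
    counts = Counter()
    for customer in customers:
        totals[customer["group"]] += customer["lifetime_value"]
        counts[customer["group"]] += 1
    averages = {group: totals[group] / counts[group] for group in totals}
    return sorted(averages.items(), key=lambda pair: pair[1], reverse=True)


def recent_recruits(customers, today, days=30):
    """Count customers per group whose first order falls in the last `days` days."""
    cutoff = today - timedelta(days=days)
    return Counter(c["group"] for c in customers if c["first_order"] >= cutoff)


# Hypothetical customers, already enriched with a demographic group.
CUSTOMERS = [
    {"group": "E", "lifetime_value": 420.0, "first_order": date(2021, 1, 10)},
    {"group": "E", "lifetime_value": 380.0, "first_order": date(2021, 5, 20)},
    {"group": "J", "lifetime_value": 60.0, "first_order": date(2021, 5, 25)},
    {"group": "J", "lifetime_value": 40.0, "first_order": date(2021, 5, 28)},
]

ranking = value_by_group(CUSTOMERS)  # reverse it for the lowest-value report
recruits = recent_recruits(CUSTOMERS, today=date(2021, 6, 1))
```

In this invented sample the most valuable group is not the one being recruited most, which is exactly the mismatch the recruitment report is meant to surface.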

So thank you very much. I hope you found some of that useful. We’re going to be doing these webinars every month. So I hope to see you next month for our next webinar.

Thank you very much.

