by Mike Cassidy
Noam Naveh learned the anti-online-fraud game from the ground up, starting as a guy who manually reviewed suspicious online orders before becoming one of the foremost experts in deploying automated systems to foil digital fraudsters.
Along the way, Naveh was the lead analyst for Fraud Sciences, a Tel Aviv-based firm that was acquired by PayPal in 2008. Naveh spent four years with PayPal before forming his own consultancy, Fraud Strategy.
Naveh stopped by Signifyd’s San Jose headquarters, where he presented “Machine Learning and Human Teaching” as part of a Payment Fraud Meetup for e-commerce fraud protection professionals. His presentation was one in a series of meetups hosted by Signifyd. The next gathering is scheduled for Sept. 14, when Google’s Swami Vaithianathasamy will explore the evolution of online fraud protection and the role of machine learning in the field.
Meanwhile, we took the opportunity during Naveh’s visit to sit down with him and talk about the roles of humans and machines in fighting e-commerce fraud. This transcript has been edited for length and clarity.
Q1: You’re a leading thinker in the area of combining humans and machines to achieve optimal results. Can you talk a little bit about what machines are good at and what humans are good at?
A: So, machines are very good at doing repetitive tasks very, very quickly. And they have infinite, very good memory. It’s huge and it’s also very reliable. They don’t forget and that makes them very adept at dealing with repetitive, mechanical tasks that need to be done very, very quickly — and forever. They don’t get tired. They don’t care about repetitive, mechanical, menial jobs. That makes them perfect for dealing with the huge stream of transactions, for example, or anything else that is at scale.
We have now connected in the world all these billions of people with billions of devices, transacting in many ways. So, we need solutions that can actually deal with this kind of scale.
Humans, on the other hand, have all these disadvantages when we compare them to machines. They forget. Then they get bored. And they have biases. They’re definitely very, very slow. But they are very adaptive. And they understand context in a way that is very hard to teach machines. They can look at new problems and they can figure out new ways to solve them without prior knowledge, sometimes without prior experience, without prior examples, of the same problem. And that’s a whole realm where the machines just cannot do very well.
Q2: So, what is the role of machines in fraud prevention?
A: Fraud presents, basically, a competition between intelligent humans, the fraudsters, and other intelligent humans — the humans that have developed and devised algorithms or are using machine learning. Then they see if they can outwit each other.
There are no set rules. The rules are changing all the time and that is specifically where machines fall short. If we can agree on a set of rules and that set of rules is going to stay in effect for a while, then over time, we can devise statistical modeling that can solve problems very, very well. But the rules are constantly changing and the data that is used in order to solve the problems is constantly changing. And the way that the fraudsters approach the problem changes. And there are other changes, in context, for example.
It used to be the case that in order to buy something online, you had to provide a lot more information than you do now. Sometimes today, you purchase with one click. So in a world where we’re trying to create a fantastic user experience for the good guys, we’re also creating a situation where it’s very hard for us to prevent the bad guys from enjoying that customer experience as well.
Q3: Where do humans come into the equation?
A: The discussion starts with, OK, we are all in agreement already about the set of things that we are looking at when we look at the transaction in order to decide whether it’s fraudulent or not. But now the question is, when we are looking at those details, are we missing something? Is there data there that is confusing us, that’s misleading us? Are we misinterpreting the data? Is there data that is not there that we need to make a better decision?
Questions of that sort are critical when you’re doing machine learning for fraud prevention.
You have in your memory, in your database, all the transactions for which you know what happened. This database, or this history, becomes what is called the training set. This is what you feed into the model when you’re asking the model to learn how to tell fraudulent transactions from legitimate transactions.
And then the question is, are those transactions that you have in this database, in this memory, the outcomes of those transactions, are they correct? In other words, those transactions in your memory that are categorized as legit transactions, did they really end up being legit? And those transactions that are categorized as fraudulent transactions, are they really fraud?
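As an illustration of the label check Naveh describes, here is a minimal sketch in Python. It is not taken from the interview: the `Transaction` fields (`label`, `chargeback_received`, `customer_confirmed`) and the function `find_suspect_labels` are hypothetical names, and the two conflict rules are assumed examples of evidence that can contradict a stored outcome.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    order_id: str
    label: str                 # "legit" or "fraud", as stored in the history
    chargeback_received: bool  # hypothetical: a fraud chargeback arrived later
    customer_confirmed: bool   # hypothetical: the buyer later confirmed the order


def find_suspect_labels(history):
    """Return (order_id, reason) pairs where later evidence contradicts the label."""
    suspects = []
    for t in history:
        if t.label == "legit" and t.chargeback_received:
            suspects.append((t.order_id, "labeled legit but charged back"))
        elif t.label == "fraud" and t.customer_confirmed:
            suspects.append((t.order_id, "labeled fraud but confirmed by customer"))
    return suspects


history = [
    Transaction("A1", "legit", chargeback_received=False, customer_confirmed=True),
    Transaction("A2", "legit", chargeback_received=True, customer_confirmed=False),
    Transaction("A3", "fraud", chargeback_received=False, customer_confirmed=True),
    Transaction("A4", "fraud", chargeback_received=True, customer_confirmed=False),
]

suspects = find_suspect_labels(history)
for order_id, reason in suspects:
    print(order_id, reason)
```

In this toy history, orders A2 and A3 are flagged for a human to review before the data is used as a training set.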
Q4: How do machines learn?
A: So, the way that machines learn is that they start fresh every time. When you create a model, you actually start from a point of zero. You use the data set to train the model to get to a certain accuracy and then you launch the model and let it make decisions in real life. When you feel that the model is out of touch, or that its performance has deteriorated, you take a new set of data and you train the model from scratch. The model doesn’t have a memory anymore of what it used to think. You train it from scratch.
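The retrain-from-scratch cycle can be sketched with a deliberately tiny model. This is an illustrative assumption, not the approach of any system mentioned in the interview: the "model" is just a fraud rate per email domain, and `train_from_scratch`, `is_risky`, and the sample data are hypothetical.

```python
from collections import defaultdict


def train_from_scratch(rows):
    """Build a brand-new model from zero: observed fraud rate per email domain."""
    counts = defaultdict(lambda: [0, 0])  # domain -> [fraud_count, total_count]
    for domain, is_fraud in rows:
        counts[domain][0] += int(is_fraud)
        counts[domain][1] += 1
    return {domain: fraud / total for domain, (fraud, total) in counts.items()}


def is_risky(model, domain, threshold=0.5):
    # Domains the model never saw get a rate of 0.0: it knows nothing about them.
    return model.get(domain, 0.0) >= threshold


# Older window of labeled transactions: (email domain, was it fraud?)
old_data = [("mail.example", True), ("mail.example", True), ("shop.example", False)]
# Newer window, collected after the first model started to look out of touch.
new_data = [("mail.example", False), ("new.example", True), ("new.example", True)]

model_v1 = train_from_scratch(old_data)
model_v2 = train_from_scratch(new_data)  # nothing is carried over from model_v1

print(is_risky(model_v1, "mail.example"))  # based only on old_data
print(is_risky(model_v2, "mail.example"))  # based only on new_data
print("shop.example" in model_v2)          # the old window is forgotten
```

The second model is built only from the newer window, so whatever the first model concluded about `mail.example` or `shop.example` is gone unless the new data happens to say the same thing.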
Q5: Are those who run e-commerce businesses or those who work for them at all suspicious of machine learning? Do they feel like they know better than a machine? Do they fear that machine learning could replace them? Or are we past all that?
A: I don’t know that I’ve heard that being discussed in the industry. I don’t think that anybody is hampering the advancement of machine learning in fraud prevention just to make sure that people keep their jobs. I don’t think that’s a good idea. And I think all those people who do manual review are so, so expensive for an organization to recruit and retain and manage and run. Organizations would love to be able to replace them with machines. But the reality is that it’s very hard to do. And so even very large, very advanced, very smart organizations still employ hundreds of people who do manual review.
It’s a very hard problem to solve because we are dealing with intelligent, human fraudsters on the other end of this.
Q6: It sounds like there are some particular considerations for using machine learning for fraud prevention. For instance, with a machine-learning based e-commerce search engine, you rev it up and the engine just gets better and better as it learns. In fraud, it doesn’t seem you can set it and forget it.
A: That is exactly the point. Nobody is trying to outsmart (a search engine), so the machine can get better just by collecting more data. In fraud, unfortunately, we don’t have that luxury. We do expect the machines to get smarter over time, because we develop new ways, new algorithms for machine learning. And we can deal with more data. We have stronger computing power. We can refresh models more quickly as a result.
But these are only solutions that sort of lie inside the envelope of the existing problem set, the existing data. Once you start dealing with changes in what the fraudsters are doing and changes in product and changes in the data, these are new things and the model cannot just incorporate them on its own. It needs humans to distill this and sort of make it palatable for models to consume.
Robot photo by Mike Cassidy. Photo of Noam Naveh, courtesy of Noam Naveh.