Josh Frank and Dusan Bosnjakovic of Intuit will be presenting at the Fraud Force Summit in October. In preparation for their debut, they are sharing a little bit about their fraud machine learning process.
Intuit has made the coveted Fortune list of World’s Most Admired software companies for 12 years in a row now, and is best known for offerings like TurboTax, QuickBooks, and Mint. However, Intuit also has a significant presence in payments processing, as attendees of the Fraud Force Summit in prior years may be aware. Because we process payments for over a million small and medium-sized businesses, maintaining strong fraud detection capabilities is vital. Device information is an important part of our defenses against fraud, but device ID is only one way that device information can be used to detect fraud. Like most users of iovation information, we use device ID to trigger rules on velocity, reputation, etc. Another possible use of device attributes is as a component of link analysis; attendees may remember that we talked about some of our work in this area at the 2014 Fraud Force Summit. Our most recent improvement in how we incorporate device information in our fraud defenses is a process we call the “Flash Fraud Model”.
Our fraud defenses include fraud models that in turn use device ID rule violations as inputs. As we continued to develop and refine our fraud models, we realized that we were trying to detect two different kinds of signal when uncovering fraud: trends and patterns.
- Trends are longer-term tendencies that are true in general across specific fraud rings. They are roughly additive across variables and usually directionally consistent within a single variable.
- Patterns are collections of short-term features that characterize a specific fraud ring during a particular period of time. Pattern risk factors are typically not additive across variables, and risk often concentrates in a “sweet spot” of values rather than moving in one consistent direction across a variable's full range.
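To make the distinction concrete, here is a minimal sketch of the two kinds of signal. It is only an illustration, not our production logic: the variable names, weights, and ranges are all hypothetical. The trend score adds up independent contributions, while the pattern check fires only when several variables sit inside their sweet spots at the same time.

```python
def trend_score(features, weights):
    """Additive trend score: each variable contributes independently,
    and a larger value always pushes the score in the same direction."""
    return sum(weights[name] * features[name] for name in weights)


def matches_pattern(features, sweet_spots):
    """Pattern match: fires only when every variable falls inside its
    own (low, high) sweet spot at the same time, so it is not additive."""
    return all(low <= features[name] <= high
               for name, (low, high) in sweet_spots.items())


# Hypothetical weights and ranges, chosen only for illustration.
weights = {"chargeback_rate": 5.0, "txn_per_day": 0.5}
sweet_spots = {"account_age_days": (2, 5), "avg_ticket_usd": (90, 110)}

merchant = {"chargeback_rate": 0.5, "txn_per_day": 4,
            "account_age_days": 3, "avg_ticket_usd": 100}

print(trend_score(merchant, weights))          # sum of per-variable contributions
print(matches_pattern(merchant, sweet_spots))  # requires all sweet spots at once
```

Note that moving a single variable out of its range switches the pattern off entirely, whereas the trend score would only shift by that variable's own contribution.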
Because we are looking for two different things that are best discovered using different types of algorithms and processes, we decided to use distinct models, processes, and time frames for each, rather than settle for a single compromise solution that is not optimal for either job. This resulted in the creation of the Flash Fraud Model, a model specifically designed to find short-term patterns. Unlike our traditional fraud models, the Flash Fraud Model stresses freshness and flexibility over optimizing for long-term accuracy. The Flash Fraud Modeling process currently relies on an ensemble of models combined in a customized way to identify high-risk merchants. Machine learning algorithms retrain each of the models daily without human intervention, so every day we create a unique set of models based on the most recent history.
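The daily retraining loop can be sketched roughly as follows. This is a toy illustration under stated assumptions, not a description of the actual Flash Fraud Model: the one-feature threshold rules, the seven-day window, the simple voting scheme, and the `device_age_days` field are all hypothetical stand-ins for the real models, time frame, customized combination, and variables.

```python
from datetime import date, timedelta


def train_stump(rows, feature):
    """Fit a one-feature threshold rule that best separates fraud from non-fraud."""
    best = None  # (correct_count, threshold, direction)
    for threshold in sorted({r[feature] for r in rows}):
        # direction 1 flags values >= threshold, -1 flags values <= threshold
        for direction in (1, -1):
            correct = sum(
                (direction * (r[feature] - threshold) >= 0) == r["fraud"]
                for r in rows
            )
            if best is None or correct > best[0]:
                best = (correct, threshold, direction)
    _, threshold, direction = best
    return lambda r: direction * (r[feature] - threshold) >= 0


def retrain_daily(history, today, window_days, features):
    """Train a fresh set of models using only the most recent labeled history."""
    start = today - timedelta(days=window_days)
    recent = [r for r in history if start <= r["day"] < today]
    if not recent:
        raise ValueError("no labeled history in the training window")
    return [train_stump(recent, f) for f in features]


def risk_score(models, row):
    """Combine the ensemble by simple voting: fraction of models that flag the row."""
    return sum(model(row) for model in models) / len(models)


today = date(2015, 10, 1)
history = [
    # An older pattern that falls outside the training window.
    {"day": today - timedelta(days=30), "device_age_days": 10, "fraud": True},
    {"day": today - timedelta(days=30), "device_age_days": 1, "fraud": False},
    # The most recent pattern: very new devices are the risky ones.
    {"day": today - timedelta(days=1), "device_age_days": 1, "fraud": True},
    {"day": today - timedelta(days=2), "device_age_days": 2, "fraud": True},
    {"day": today - timedelta(days=1), "device_age_days": 10, "fraud": False},
    {"day": today - timedelta(days=3), "device_age_days": 12, "fraud": False},
]

models = retrain_daily(history, today, window_days=7, features=["device_age_days"])
print(risk_score(models, {"device_age_days": 1}))
print(risk_score(models, {"device_age_days": 11}))
```

Because only the recent window is used for training, the older and contradictory pattern has no influence on today's models, which is the freshness trade-off described above.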
We have given the system a wide range of variables to work with, and every day the machine learning process trains itself to come up with the best set of models. One recurring theme in these daily models is that detailed device information has proved to be a very powerful source of information for finding fraud patterns. Some pieces of device information come up more often than others, but it truly is a broad range of device features that pop up as useful fraud identifiers, with the specific features that are most useful changing from day to day.
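One simple way to see which features recur is to count how often each one shows up among the top features of the daily models. The sketch below assumes a hypothetical log mapping each day to that day's most useful features; the feature names and dates are invented for illustration and are not our actual variables or results.

```python
from collections import Counter


def feature_frequency(daily_top_features):
    """Return the share of days on which each feature appeared among
    that day's top features, most frequent first."""
    counts = Counter()
    for features in daily_top_features.values():
        counts.update(set(features))  # count each feature at most once per day
    total_days = len(daily_top_features)
    return {name: n / total_days for name, n in counts.most_common()}


# Hypothetical log of each day's most useful device features.
daily_top_features = {
    "2015-09-28": ["time_zone_offset", "screen_resolution"],
    "2015-09-29": ["time_zone_offset", "browser_language"],
    "2015-09-30": ["time_zone_offset", "screen_resolution"],
}

for name, share in feature_frequency(daily_top_features).items():
    print(f"{name}: {share:.0%} of days")
```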
The results have been very powerful. We estimate that adding the Flash Fraud Model saves us millions of dollars in annual losses. It has also been a key contributor to more than doubling the speed at which we catch new account fraud. At a session during the upcoming 2015 Fraud Force Summit, we will give more details of what our Flash Fraud Model is and how it has helped us to catch more fraud faster.