Editor's Note: This article originally appeared on the DataVisor blog on February 12, 2017.

According to a report issued by PricewaterhouseCoopers (PwC), 90 to 95 percent of all alerts generated by transaction monitoring systems (TMS) are false positives. Not only does this translate into operational overhead, it can also cause real alerts to be missed under the mountain of false positives. This is not news to those in the anti-money laundering (AML) space. I often hear the same complaint from colleagues who implement TMS at other financial institutions. It’s not uncommon to hear:

“Our TMS was generating a few hundred alerts every month, but after we went through the upgrade, it’s generating thousands!”

The problem with false positive alerts is that they create huge operational overhead that translates into absolutely zero substantive suspicious activity report (SAR) filings. At a certain point, each additional alert yields diminishing returns. A bank can only investigate so many alerts and still conduct effective investigations.

The same weaknesses that generate massive amounts of false positive alerts also make it easy for criminals to get around existing TMS. (A topic we will explore in a future post.)

Why are there so many false positive alerts?

A fundamental technical limitation of traditional TMS leads to the flood of false positive alerts. These systems rely on rules or simple models, which have a myopic view of global trade, human behavior, the complexity of transactional networks and the hidden links between nefarious actors. They take a simplistic view of the activity being monitored, distilling it down to only a few dimensions for each rule to interrogate.

Here are the two fundamental problems of existing TMS:

Coarse-grained rules, which match many scenarios, most of which are not actually suspicious
Reliance on only a subset of the event types and data available, which limits the number of signals that can be used for detection

For example, by looking at all the information available in the following diagram, it’s clear that in these ten transactions, only one is suspicious:

However, existing TMS look at only a subset of the data available to them. One rule in the TMS may be to flag all transactions within a specific timeframe as suspicious if they’re between $9,500 and $9,999. Under that rule, all ten transactions look the same, so all ten are flagged. Nine of the ten alerts are false positives: a 90% false positive rate.
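To make the arithmetic concrete, here is a minimal Python sketch of such an amount-band rule. The `Transaction` class, the `is_suspicious` ground-truth label and the sample amounts are hypothetical, invented only for illustration; a real TMS rule would also apply the timeframe condition.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    txn_id: int
    amount: float
    is_suspicious: bool  # hypothetical ground truth, known only after investigation


def flag_by_amount(txns, low=9_500, high=9_999):
    """Coarse-grained rule: flag every transaction whose amount falls in [low, high]."""
    return [t for t in txns if low <= t.amount <= high]


def false_positive_share(alerts):
    """Fraction of generated alerts that turn out not to be suspicious."""
    if not alerts:
        return 0.0
    return sum(not t.is_suspicious for t in alerts) / len(alerts)


# Ten made-up transactions inside the band; only one is truly suspicious.
transactions = [
    Transaction(i, 9_500 + 50 * i, is_suspicious=(i == 3)) for i in range(10)
]
alerts = flag_by_amount(transactions)
share = false_positive_share(alerts)
print(f"{len(alerts)} alerts, {share:.0%} false positives")
```

Because the rule sees nothing but the amount, it has no way to tell the one suspicious transaction from the other nine.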

Is it possible for existing TMS to make their rules less coarse-grained? No, because TMS only look at a subset of event types, so if the rules are designed to be more specific, then they will miss real suspicious activity. Casting a wide net means that they will be able to detect some suspicious accounts, but will also result in alerts on a lot more good accounts. It’s not enough to simply tweak the existing rules or simple models. Rather, it’s necessary to look to a new technical solution to address the false positive alerts plague.

The promise of unsupervised machine learning

Unsupervised machine learning (UML), if implemented properly, can solve these problems for AML teams. UML can be leveraged to reduce false positives by looking at all activity within your financial institution from a global view and linking common bad actors together. This drastically reduces false positive alerts without compromising on compliance with regulatory guidelines.

To see how this is possible, it’s important to understand the technology. UML is a category of machine learning that can detect hidden patterns in large data sets, such as fraudulent user accounts, without prior knowledge of what a fraudulent account looks like. This is different from supervised machine learning, which requires knowledge of previous patterns to catch similar ones in the future. In the context of AML, UML automatically finds these hidden patterns to link seemingly unrelated accounts and customers. The links can come from any of the thousands of data fields that the UML model ingests. The image below depicts customers detected by UML because they are linked by shared attributes such as an email address, physical address, phone number, internet protocol (IP) address and a common beneficiary.
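The linking idea can be illustrated with a small Python sketch that groups accounts sharing any attribute value, using a union-find structure to build connected components. This is a deliberately simplified stand-in, not how any particular UML product works; the account IDs, field names and values are hypothetical.

```python
from collections import defaultdict


def link_accounts(accounts):
    """Group accounts that share any attribute value (connected components)."""
    parent = {acct_id: acct_id for acct_id in accounts}

    def find(x):
        # Follow parent pointers to the group's root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_seen = {}  # (field, value) -> first account seen with that value
    for acct_id, attrs in accounts.items():
        for field, value in attrs.items():
            key = (field, value)
            if key in first_seen:
                union(first_seen[key], acct_id)
            else:
                first_seen[key] = acct_id

    groups = defaultdict(set)
    for acct_id in accounts:
        groups[find(acct_id)].add(acct_id)
    # Only groups of two or more accounts represent a link.
    return [members for members in groups.values() if len(members) > 1]


# Hypothetical accounts: A and B share a phone number, B and C share an IP address.
accounts = {
    "A": {"email": "x@example.com", "phone": "555-0100", "ip": "203.0.113.7"},
    "B": {"email": "y@example.com", "phone": "555-0100", "ip": "198.51.100.2"},
    "C": {"email": "z@example.com", "phone": "555-0199", "ip": "198.51.100.2"},
    "D": {"email": "w@example.com", "phone": "555-0142", "ip": "192.0.2.55"},
}
linked = link_accounts(accounts)
print(linked)
```

Note that A and C share no attribute directly; they end up in the same group only through B, which is the kind of indirect link a single-transaction rule cannot see.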

So, in contrast to coarse-grained rules, UML considers thousands of data fields to detect complex networks. This allows UML to look at a vast array of attributes and sift the real signal (suspicious activity) from the noise. Furthermore, UML can ingest all event data, which enables it to determine whether accounts show similar suspicious activity. For example, UML can link accounts that have similarly high transaction volumes with low dollar amounts in the same time window, without being programmed to look for this specific case.
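As a rough illustration of grouping by similar behavior, the Python sketch below summarizes each account's activity per time window and then groups accounts whose summaries are close. The greedy, threshold-based grouping is an assumption standing in for whatever clustering a real UML system uses, and the tuple layout, tolerance and sample data are all hypothetical. It shows only the grouping step; deciding which groups are actually unusual is a separate problem.

```python
from collections import defaultdict
from statistics import mean


def window_profiles(txns, window_seconds=3600):
    """Summarise each account's activity per time window as (count, mean amount)."""
    buckets = defaultdict(list)
    for account, timestamp, amount in txns:
        buckets[(timestamp // window_seconds, account)].append(amount)
    profiles = defaultdict(dict)
    for (window, account), amounts in buckets.items():
        profiles[window][account] = (len(amounts), mean(amounts))
    return profiles


def close(a, b, rel_tol):
    """True when a and b differ by at most rel_tol relative to the larger value."""
    return abs(a - b) <= rel_tol * max(abs(a), abs(b))


def similar_groups(profiles, rel_tol=0.2, min_size=3):
    """Within each window, greedily group accounts with similar profiles."""
    found = []
    for window, by_account in sorted(profiles.items()):
        groups = []  # each entry is (seed_profile, [member accounts])
        for account, (count, avg) in sorted(by_account.items()):
            for seed, members in groups:
                if close(count, seed[0], rel_tol) and close(avg, seed[1], rel_tol):
                    members.append(account)
                    break
            else:
                # No existing group is similar enough: start a new one.
                groups.append(((count, avg), [account]))
        found.extend(
            (window, members) for _, members in groups if len(members) >= min_size
        )
    return found


# Hypothetical data: three accounts each make 40 small payments in the same hour.
txns = []
for acct, amount in (("acct_1", 5.0), ("acct_2", 5.25), ("acct_3", 4.75)):
    txns.extend((acct, 100 + 60 * i, amount) for i in range(40))
# A few unrelated accounts with ordinary activity.
txns += [("acct_4", 500, 1200.0), ("acct_5", 900, 60.0),
         ("acct_5", 1500, 80.0), ("acct_6", 2000, 15.0)]

groups = similar_groups(window_profiles(txns))
print(groups)
```

Nothing in the code names the "high volume, low amount" pattern; the three accounts are grouped only because their behavior is similar to each other's.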

UML also decreases the prevalence of false positive alerts because it catches groups of related accounts, which gives it more confidence that these accounts are bad. Think about it: if you saw one account do something weird, you might be unsure whether it’s bad, but if you saw fifty linked accounts all doing similar suspicious activity, you would be extremely confident that they’re all bad. UML is better at differentiating between good and bad activity, so when an alert is generated, you can be much more confident that it’s a real alert.

What this means for compliance departments

With rapidly growing compliance department costs and no decrease in regulatory fines in sight, it’s becoming increasingly clear that we need a new approach to TMS. UML can reduce compliance costs by lowering the number of false positive alerts and reprioritizing the time spent on investigations. At the same time, it can increase the quality of SAR filings. As the number of alerts to investigate decreases, existing compliance resources can be reallocated to other important activities such as quality control, analyst training and risk assessments.

From a practical perspective, the transition from a traditional TMS to one that uses UML does not have to happen overnight. Instead, running UML alongside an existing TMS can be a great place to start. This is an easier and more gradual path, and one I’d recommend when implementing any new TMS, unsupervised or not.

Ultimately, financial institutions have embraced UML in other areas of banking such as fraud, credit risk and trading, so it is only a matter of time before compliance departments do the same. It’s now a question of when and which institutions will lead the pack out of traditional TMS to raise the stakes in the fight against money laundering.

