Data reusability: The next step in the evolution of analytics
Editor's Note: This article originally appeared on The Asian Banker on July 20, 2017.
Data reusability will lessen the response time to emerging opportunities and risks, allowing organisations to remain competitive in the digital economies of the future.
- If data's meaning can be defined across an enterprise, the insights that can be derived from it expand exponentially
- When financial institutions work together to identify useful data analytics solutions, they can produce powerful results and add significant value for their customers
- The analytic systems of tomorrow should be able to take the same data set and process it without modifying it
If data is the new oil, then many of the analytical tools used to extract value from data require their own specific grade of gasoline, akin to a car that can only run on one grade of fuel sold at a single gas station within a 500-mile radius. It sounds completely ridiculous and unsustainable, but that is how many analytical tools are set up today.
Many organisations have data sets that can be used with a myriad of analytical tools. Financial institutions, for example, can use their customer data as an input to determine client profitability, credit risk, anti-money laundering compliance, or fraud risk. However, the current paradigm for many analytical tools requires that the data conform to a specific model in order to work. That is often like trying to fit a square peg into a round hole, and there are operational costs associated with maintaining each custom-built information pipeline.
The advent of big data has opened up a whole host of possibilities in the analytics space. By distributing workloads across a network of computers, complex computations can be performed on vast volumes of data at high speed. For information-rich and regulatory-burdened organisations such as financial institutions, this has value, but it doesn't address the wasteful costs associated with inflexible analytic systems.
What are data lakes?
The "data lake" can provide a wide array of benefits for organisations, but the data that flows into the lake should ideally go through a rigorous data integrity process to ensure that the insights produced can be trusted. The data lake is where the conversation about data analytics can shift to what it really ought to be.
Data lakes are supposed to centralise all the core data of the enterprise, but if each data set is replicated and slightly modified for each risk system that consumes it, then the data lake's overall value to the organisation is diminished. The analytic systems of tomorrow should be able to take the same data set and process it without modifying it. If any data modifications are required, they could be handled within the risk system itself.
That would require robust new computing standards, but at the end of the day, a date is still a date. Ultimately, it doesn't matter which date format convention is being followed, because each represents the same concept. While a global organisation may need to convert data to a local date-time, some risk systems enforce date format standards that may not align with the original data set. This is just one example of pushing the responsibility for data maintenance onto the client instead of handling a robust array of data formats seamlessly in-house.
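The point about handling formats in-house can be sketched in a few lines: a risk system that accepts dates in whatever convention the source data uses and normalises them internally, rather than forcing each client to reformat its data. This is a minimal illustration, not any particular vendor's implementation; the format list and function name are hypothetical.

```python
from datetime import datetime

# Hypothetical set of date conventions the risk system accepts as-is.
# In practice, ambiguous orderings such as %d/%m/%Y vs %m/%d/%Y would
# need source metadata to disambiguate; the ordering here is illustrative.
KNOWN_FORMATS = [
    "%Y-%m-%d",   # 2017-07-20 (ISO 8601)
    "%d-%b-%Y",   # 20-Jul-2017
    "%d/%m/%Y",   # 20/07/2017
    "%m/%d/%Y",   # 07/20/2017
]

def normalize_date(raw: str) -> datetime:
    """Accept a date in any known format and return one canonical value,
    so the source data set never has to be rewritten to suit the tool."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(raw, fmt)
        except ValueError:
            continue
    raise ValueError(f"Unrecognised date format: {raw!r}")

# The same underlying date in three surface formats, one canonical result.
assert normalize_date("2017-07-20") == normalize_date("20-Jul-2017")
assert normalize_date("20/07/2017") == normalize_date("2017-07-20")
```

The design choice is the one the article argues for: the conversion logic lives once, inside the consuming system, instead of being duplicated in every data pipeline that feeds it.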
The conversation needs to shift to what data actually means, and how it can be valued. If data’s meaning can be defined across the enterprise, the insights that can be derived from it expand exponentially. The current paradigm of the data model in the analytics space pushes the maintenance costs onto the organisations which use the tools, often impeding new product deployment. With each proposed change to an operational system, risk management systems end up adjusting their own data plumbing in order to ensure that they don’t have any gaps in coverage.
Data analytics solutions add value
When financial institutions work together to identify useful data analytics solutions, they can produce powerful results and add significant value for their customers. The launch of Zelle is a perfect example: customers of different banks can now send near real-time payments directly to one another using a mobile app.
A similar strategy should be used to nudge the software analytics industry in the right direction. If major financial institutions banded together with economies of scale to create a big data consortium where one of the key objectives was to make data reusable, then the software industry would undoubtedly create new products to accommodate it, and data maintenance costs would eventually go down. Ongoing maintenance costs would eventually migrate from financial institutions to the software industry, which has the operational and cost advantages.
There are naturally costs associated with managing risk effectively, but wasteful spending on inflexible data models diverts money from other priorities and stymies innovation. US regulators are notoriously aggressive when it comes to non-compliance, so reducing costs in one area could encourage investment in other areas and ultimately strengthen the risk management ecosphere. Making data reusable and keeping it in its natural format would also increase data integrity and quality, and improve risk quantification based on a given model's approximation of reality.
Reusable data will allow institutions to have a "first mover advantage"
Is the concept of reusable data too far ahead of its time? Not for those who use it, need it, and pay for it. Clearly, the institution(s) that embrace the concept will have the first mover advantage, and given the speed with which disruptive innovations are proceeding, it would appear that this is an idea whose time has come. As the world moves more towards automation and digitisation it is becoming increasingly clear that the sheer diversity and sophistication of risks makes streamlining processes and costs a daunting organisational task.
The window in which organisations must react to risks in order to remain competitive, cost-efficient and compliant is shrinking, while actual response times are increasing, right along with a plethora of inefficiencies. Being in a position to recycle data across risk and analytics systems would decrease response times and enhance overall competitiveness. Both will no doubt prove to be essential components of successful organisations in the digital economies of the future.
Keith Furst is the founder of Data Derivatives and a financial crimes technology expert; Daniel Wagner is managing director of Risk Cooperative.