Country Extraction Algorithm

Many risk management functions use country risk as a key input for their models.  In anti-money laundering (AML) programs, country risk is a vital component in calculating customer and transaction risk.  Even in sanctions screening, country risk can play a role in how sensitively the matching algorithm is calibrated.

There are instances where an institution has control over the process, and country data can be captured, stored and analyzed for a variety of purposes.  However, there are situations where an institution may not have much control over the data but is still obligated to meet stringent regulatory demands.

One classic example where a financial institution doesn't have control over the data is correspondent banking.  In correspondent banking, institutions transfer messages to one another through a variety of networks such as the Society for Worldwide Interbank Financial Telecommunication (SWIFT), which has been called the 'backbone of the global financial system.'

Country risk is a critical component in assessing the AML risk of cross-border transfers.  If financial institutions monitor SWIFT messages to meet their transaction monitoring requirements, many will extract the country code from global identifiers such as the Bank Identifier Code (BIC).  The BIC, or SWIFT code, follows a standard format: a four-character institution code, a two-letter country code, a two-character location code and an optional three-character branch code.
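
As a minimal sketch, the country code can be read from a BIC by position, assuming the ISO 9362 layout in which characters five and six hold the ISO 3166-1 alpha-2 country code.  The function name and the validation pattern below are illustrative assumptions, not part of any SWIFT library.

```python
import re
from typing import Optional

# Assumed ISO 9362 layout: 4-character institution code, 2-letter country
# code, 2-character location code, optional 3-character branch code.
BIC_PATTERN = re.compile(r"^[A-Z0-9]{4}([A-Z]{2})[A-Z0-9]{2}([A-Z0-9]{3})?$")


def country_from_bic(bic: str) -> Optional[str]:
    """Return the two-letter country code embedded in a BIC, or None if malformed."""
    match = BIC_PATTERN.match(bic.strip().upper())
    return match.group(1) if match else None


print(country_from_bic("DEUTDEFF"))     # DE
print(country_from_bic("deutdeff500"))  # DE (branch code present, lower case input)
print(country_from_bic("not a bic"))    # None
```

The pattern only checks the shape of the code.  It does not confirm that the BIC is registered or that the two letters are a valid country code, which would require reference data.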




Extracting the country code from the BIC / SWIFT code is fairly straightforward because it is a structured data element.  However, a financial institution that wants a more robust approach to transaction monitoring could also extract the country code from the free text address fields that a SWIFT MT 103 carries for the originating and beneficiary customers.

Extracting the country code from a free text address in a SWIFT message can become fairly complex for the following reasons:

  • Comprehensive and accurate geographic reference data is needed
  • Common misspellings of address data
  • Handling various conversions of accented characters
  • Handling various conversions of languages from their native orthographic form (NOF) to their romanized version
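
To make these challenges concrete, the following is a minimal sketch of one possible approach, not the Data Derivatives algorithm itself.  It assumes a tiny, hypothetical gazetteer (`GAZETTEER`), strips accents with Unicode normalization, and uses fuzzy matching from Python's standard `difflib` module to tolerate misspellings.  The function names and the 0.85 similarity cutoff are illustrative assumptions.

```python
import difflib
import re
import unicodedata
from typing import Optional

# Tiny illustrative gazetteer (hypothetical).  A real deployment would need
# comprehensive reference data: country names, aliases, cities and
# romanized variants of native spellings.
GAZETTEER = {
    "germany": "DE", "deutschland": "DE", "munich": "DE", "munchen": "DE",
    "france": "FR", "paris": "FR",
    "turkey": "TR", "turkiye": "TR", "istanbul": "TR",
    "russia": "RU", "rossiya": "RU", "moscow": "RU", "moskva": "RU",
}


def normalize(text: str) -> str:
    """Strip accents, lower-case, and replace punctuation with spaces."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return re.sub(r"[^a-z0-9 ]+", " ", stripped.lower())


def extract_country(address: str, cutoff: float = 0.85) -> Optional[str]:
    """Return a country code for a free text address, or None if nothing matches."""
    tokens = normalize(address).split()
    # Pass 1: exact matches, scanning from the end of the address, where the
    # city or country usually appears.
    for token in reversed(tokens):
        if token in GAZETTEER:
            return GAZETTEER[token]
    # Pass 2: fuzzy matches to tolerate common misspellings.
    for token in reversed(tokens):
        close = difflib.get_close_matches(token, list(GAZETTEER), n=1, cutoff=cutoff)
        if close:
            return GAZETTEER[close[0]]
    return None


print(extract_country("Marienplatz 1, 80331 München"))  # DE (accent stripped)
print(extract_country("10 Main Street, Germny"))        # DE (misspelling)
print(extract_country("Tverskaya 7, Moskva"))           # RU (romanized form)
print(extract_country("Unknown Place 5"))               # None
```

This sketch matches single tokens only, so multi-word names such as "United Kingdom" and addresses written in a non-Latin script would need additional handling, for example phrase matching and transliteration.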


Data Derivatives has developed an algorithm to extract the country code from free text address fields, which can be applied to SWIFT messages and other scenarios.  The algorithm could be deployed within an institution's extract, transform and load (ETL) process, before the data is sent to the transaction monitoring system for various detection scenarios.

One key benefit of deploying the country extraction algorithm is the ability to identify customers banking outside their jurisdiction.  Most customers would be expected to use banks in their main country of residence.  There are legitimate reasons for companies and individuals to bank outside their country of residence or operations, but this is an important red flag to identify when combating financial crime.

This can only be done if institutions extract the country codes from the free text address fields of the SWIFT message for the originating and beneficiary customers.
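
As a minimal sketch of that check, the country extracted from the customer's address can be compared with the country code embedded in the BIC of the customer's bank.  The function name is hypothetical, and the comparison assumes the address country has already been extracted upstream, for example during ETL.

```python
from typing import Optional


def banks_outside_jurisdiction(customer_country: Optional[str], bank_bic: str) -> Optional[bool]:
    """Return True if the customer's country differs from the bank's country.

    Returns None when either country is unavailable, so that missing data is
    not silently treated as a match or a mismatch.
    """
    bic = bank_bic.strip().upper()
    # Characters 5-6 of an 8 or 11 character BIC hold the country code.
    bank_country = bic[4:6] if len(bic) in (8, 11) else None
    if customer_country is None or bank_country is None:
        return None
    return customer_country.upper() != bank_country


print(banks_outside_jurisdiction("DE", "DEUTDEFF"))  # False: same country
print(banks_outside_jurisdiction("FR", "DEUTDEFF"))  # True: potential red flag
print(banks_outside_jurisdiction(None, "DEUTDEFF"))  # None: address country unknown
```

A True result is only an indicator for further review, since there are legitimate reasons to bank abroad.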