AML risk scoring quantifies the financial-crime risk of each customer (and where appropriate, each transaction). It is the operational expression of the risk-based approach (RBA) at the heart of FATF Recommendation 1. A well-built risk scoring model focuses the team's effort on high-risk relationships, applies lighter monitoring to low-risk volume, and produces a defensible answer to a supervisor's "why was this level of due diligence applied?" question. This HowTo walks through the seven steps of building one from scratch.

Step 1: Identify Risk Factors

The first job is naming the factors that drive customer risk. Industry practice covers three dimensions:

Customer factors:

Customer type (individual, SME, corporate, financial institution, NPO)
Industry (high risk: cash-intensive businesses, crypto, gambling, arms trade, precious metals, real estate)
PEP status (foreign PEP, domestic PEP, ex-PEP, RCA)
Ownership transparency (UBO identifiable, layered ownership present)
Customer age and relationship history (new customer vs long-established)

Geographic factors:

Customer nationality / country of residence
Trading countries
FATF grey/black list status (the lists are revised at FATF plenaries, so always check the current publication; during 2025 the grey list has included, among others, Burkina Faso, Cameroon, Democratic Republic of Congo, Haiti, Mali, Mozambique, Nigeria, Philippines, South Africa, South Sudan, Syria, Vietnam, Yemen; black list: DPRK, Iran, Myanmar)
High-risk jurisdiction exposure

Product/transaction factors:

Product or service used (cash-intensive, SWIFT cross-border, crypto, mobile wallet, correspondent banking, prepaid card)
Expected transaction volume
Behavioural history (anomaly count, prior STR-eligible activity, previously closed accounts)

Jurisdiction-Specific Factors

EU AMLD5 Annex III lists high-risk factors that any EU institution's model must capture: certain customer types, business relationships in unusual circumstances, customers in high-risk jurisdictions, etc. UK MLR 2017 mirrors this. JMLSG Part I, Chapter 4 provides further detail. Your model should map these regulatory anchors to specific factors.
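One way to keep the mapping between factors and regulatory anchors auditable is a small factor register. The sketch below is illustrative only: the `RiskFactor` class, the factor names, and the anchor assigned to each factor are assumptions, not a prescribed schema.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class RiskFactor:
    name: str
    dimension: str                 # "customer", "geographic" or "product"
    anchors: Tuple[str, ...] = ()  # regulatory sources cited for the factor


# Hypothetical register: which anchor supports which factor is illustrative.
FACTOR_REGISTER = [
    RiskFactor("customer_type", "customer", ("EU Annex III", "JMLSG Part I Ch. 4")),
    RiskFactor("pep_status", "customer", ("FATF Rec. 12",)),
    RiskFactor("country_of_residence", "geographic", ("EU Annex III",)),
    RiskFactor("product_used", "product", ("EU Annex III",)),
    RiskFactor("relationship_age", "customer"),  # no anchor recorded yet
]


def unanchored(factors):
    """Return the names of factors that cite no regulatory source."""
    return [f.name for f in factors if not f.anchors]


def by_dimension(factors):
    """Group factor names under their risk dimension."""
    grouped = {}
    for f in factors:
        grouped.setdefault(f.dimension, []).append(f.name)
    return grouped
```

Running `unanchored(FACTOR_REGISTER)` before each model review surfaces factors that still need a documented regulatory basis.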

Step 2: Design the Factor Scoring Scale

Define a scale for each factor. The common approach is 1-5:

Score	Meaning	Example (Geography)
1	Very low risk	UK, Germany, France, Sweden
2	Low risk	Norway, Switzerland, Australia
3	Medium risk	Latin America, parts of Eastern Europe
4	High risk	FATF grey-list countries
5	Very high risk	FATF black-list countries, sanctioned regimes

Some models use 1-10 or 1-100; the choice is stylistic. What matters is consistent use and documentation.

Practical recommendation: tabulate the scoring matrix and submit to compliance review. Every factor value must have a documented rationale for the score it receives.
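A scoring matrix with a rationale per value could be held as a simple lookup. This is a minimal sketch: the key names and rationale wording are placeholders, the geography scores follow the table above, and the PEP rows are assumptions for illustration.

```python
# factor -> factor value -> (score on the 1-5 scale, documented rationale)
SCORING_MATRIX = {
    "geography": {
        "very_low_risk_country": (1, "Placeholder rationale: strong AML regime"),
        "low_risk_country": (2, "Placeholder rationale: sound AML regime"),
        "medium_risk_region": (3, "Placeholder rationale: regional assessment"),
        "fatf_grey_list": (4, "Jurisdiction under FATF increased monitoring"),
        "fatf_black_list": (5, "FATF call-for-action jurisdiction"),
    },
    "pep_status": {  # illustrative rows only
        "none": (1, "No PEP or RCA link identified"),
        "domestic_pep": (3, "Domestic PEP, assessed on a risk basis"),
    },
}


def score_factor(factor, value):
    """Look up the score and rationale; refuse values that are not documented."""
    try:
        return SCORING_MATRIX[factor][value]
    except KeyError:
        raise ValueError(f"No documented score for {factor}={value!r}") from None


def undocumented(matrix):
    """List (factor, value) pairs whose rationale is empty."""
    return [
        (factor, value)
        for factor, rows in matrix.items()
        for value, (_score, rationale) in rows.items()
        if not rationale.strip()
    ]
```

Failing loudly on an undocumented value keeps the "every score has a rationale" rule enforceable rather than aspirational.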

Step 3: Set Factor Weights

Factors are not equal. PEP status carries more weight than relationship age. Geography carries more than customer type.

Typical distribution (totalling 100%):

Customer type: 15%
Industry: 15%
PEP/sanctions status: 20%
UBO transparency: 10%
Geographic risk: 20%
Product/service risk: 15%
Account history/behaviour: 5%

Two approaches to setting these:

Expert-driven. The compliance team agrees weights through structured discussion. Fast, comprehensible, but not empirically validated.

Data-driven. Statistical analysis on historical SAR/STR cases — which factors actually predicted real risk? Logistic regression or gradient boosting. Stronger but requires data (at least several hundred SAR cases).

A new institution starts expert-driven; 12-18 months in, data-driven recalibration becomes feasible.
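Whichever approach sets the weights, a basic consistency check is cheap. The sketch below assumes weights are stored as percentages under hypothetical factor keys; `normalise` shows one simple way to rescale data-driven importances (for example, coefficient magnitudes from a fitted model) so they total 100%, which is an assumption about how an institution might convert them rather than a standard method.

```python
import math

# Expert-driven weights from the typical distribution above (percent).
WEIGHTS_PCT = {
    "customer_type": 15,
    "industry": 15,
    "pep_sanctions": 20,
    "ubo_transparency": 10,
    "geography": 20,
    "product_service": 15,
    "account_history": 5,
}


def validate_weights(weights_pct):
    """Raise ValueError unless all weights are positive and total 100%."""
    if any(w <= 0 for w in weights_pct.values()):
        raise ValueError("every weight must be positive")
    total = sum(weights_pct.values())
    if not math.isclose(total, 100.0, abs_tol=1e-9):
        raise ValueError(f"weights sum to {total}%, expected 100%")


def normalise(raw_importance):
    """Rescale non-negative importances so they sum to 100%."""
    total = sum(raw_importance.values())
    if total <= 0:
        raise ValueError("importances must sum to a positive number")
    return {factor: 100 * value / total for factor, value in raw_importance.items()}
```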

Step 4: Combine Scores into a Customer Risk Level

Each customer's total weighted score:

Total Risk Score = Σ (Factor_i × Weight_i)

Map to risk levels:

Total Score	Risk Level	Applied Treatment
1.0 - 2.0	Low	Simplified Due Diligence (where eligible); standard monitoring
2.1 - 3.5	Medium	Standard Customer Due Diligence; normal monitoring
3.6 - 4.5	High	Enhanced Due Diligence (EDD); tighter monitoring
4.6 - 5.0	Very high	EDD + senior management approval; close monitoring; more frequent review
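The weighted sum and the band lookup can be sketched in a few lines. The factor keys are hypothetical, weights are held as whole percentages to avoid floating-point drift, and each band's upper bound is treated as inclusive, which is an assumption made here so that a score such as 2.05 does not fall between bands.

```python
WEIGHTS_PCT = {
    "customer_type": 15, "industry": 15, "pep_sanctions": 20,
    "ubo_transparency": 10, "geography": 20, "product_service": 15,
    "account_history": 5,
}

# (inclusive upper bound, risk level), in ascending order
BANDS = [(2.0, "low"), (3.5, "medium"), (4.5, "high"), (5.0, "very high")]


def total_risk_score(scores, weights_pct=WEIGHTS_PCT):
    """Weighted sum of 1-5 factor scores; weights are percentages totalling 100."""
    if scores.keys() != weights_pct.keys():
        raise ValueError("scores and weights must cover the same factors")
    return sum(scores[f] * weights_pct[f] for f in scores) / 100


def risk_level(total):
    """Map a total score in [1.0, 5.0] to its risk level."""
    if not 1.0 <= total <= 5.0:
        raise ValueError(f"total score {total} is outside the 1-5 scale")
    for upper, level in BANDS:
        if total <= upper:
            return level


example = {
    "customer_type": 2, "industry": 3, "pep_sanctions": 1,
    "ubo_transparency": 1, "geography": 3, "product_service": 2,
    "account_history": 1,
}
score = total_risk_score(example)
print(score, risk_level(score))
```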

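The band boundaries (2.0, 3.5, 4.5) are calibration choices, not fixed constants. Define them so that no score falls between bands (for example, treat anything above 2.0 and up to 3.5 as medium). The distribution they produce (what proportion of customers ends up in each level) must be known. Typical target: 70-80% low, 15-20% medium, 3-8% high, 0.5-2% very high.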

If the distribution falls outside that envelope (e.g. 30% of customers flagged as high risk), either the thresholds or the weights are mis-calibrated.
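A distribution check against the target envelope might look like the sketch below. The envelope values come from the text above; the function names are hypothetical.

```python
from collections import Counter

# Target share of the portfolio per level, in percent (min, max).
TARGET_PCT = {
    "low": (70, 80),
    "medium": (15, 20),
    "high": (3, 8),
    "very high": (0.5, 2),
}


def level_distribution(levels):
    """Percentage of customers at each risk level."""
    if not levels:
        raise ValueError("portfolio is empty")
    counts = Counter(levels)
    return {level: 100 * counts[level] / len(levels) for level in TARGET_PCT}


def out_of_envelope(levels):
    """Return the levels whose share falls outside the target range."""
    return {
        level: pct
        for level, pct in level_distribution(levels).items()
        if not TARGET_PCT[level][0] <= pct <= TARGET_PCT[level][1]
    }
```

An empty result means the portfolio sits inside the envelope; anything else is a prompt to revisit thresholds or weights, not an automatic verdict.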

Step 5: Apply Segmentation

A single model is not optimal across customer types. Risk factors for an individual customer differ from those for a corporate. Typical segments:

Individual — retail: PEP, geography, income/account inconsistency weighted higher
Individual — affluent/HNW: + source of wealth, multiple accounts, multi-country footprint
SME: Industry, ownership transparency, expected volume
Corporate: UBO structure, financial statement traceability, multi-jurisdiction operations
Financial institution (correspondent): Regulator quality, supervisory rating, AML programme quality
NPO: Donation sources, operating countries, beneficiary populations

Running a fully separate scoring matrix per segment is operationally heavy; the practical compromise is the same factor set with segment-specific weights.
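The "same factors, segment-specific weights" compromise can be sketched as one weight table per segment. All weight values below are invented for illustration, and the segment and factor keys are hypothetical.

```python
FACTORS = (
    "customer_type", "industry", "pep_sanctions", "ubo_transparency",
    "geography", "product_service", "account_history",
)

# Illustrative weights (percent), listed in FACTORS order.
SEGMENT_WEIGHTS_PCT = {
    "retail": dict(zip(FACTORS, (10, 5, 25, 5, 25, 15, 15))),
    "sme": dict(zip(FACTORS, (10, 25, 15, 20, 15, 10, 5))),
    "corporate": dict(zip(FACTORS, (10, 15, 15, 25, 20, 10, 5))),
}


def check_segments(segment_weights):
    """Every segment must use the shared factor set and total 100%."""
    for segment, weights in segment_weights.items():
        if set(weights) != set(FACTORS):
            raise ValueError(f"{segment}: factor set differs from the shared set")
        if sum(weights.values()) != 100:
            raise ValueError(f"{segment}: weights do not total 100%")


def segment_score(segment, scores):
    """Weighted total for one customer under the weights of their segment."""
    weights = SEGMENT_WEIGHTS_PCT[segment]
    return sum(scores[f] * weights[f] for f in FACTORS) / 100
```

The same factor scores produce different totals per segment, which is the point: an opaque ownership structure should move a corporate's score more than a retail customer's.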

Step 6: Validate the Model

Before going live, the model must be tested on retrospective data:

Historical backtesting. Run the model over 12-24 months of customer portfolio. Confirm that those flagged high risk are in fact the population that produced SARs.
False positive analysis. How many customers flagged high risk did not produce a real case? A 0% false positive rate is unattainable; 70-90% is normal (lower than sanctions screening FP because risk scoring is broader).
False negative analysis. Of the SAR cases you actually filed, how many had been flagged low risk? These are the model's misses. Should be close to zero.
Sensitivity analysis. How much does the result move when a single factor changes? Over-sensitive models (a 1-point factor change flips the level) are unstable; under-sensitive models (a factor swings across its full range and the level stays the same) lack discrimination.
Compliance review. Model documentation (factor definitions, weights, thresholds, validation results) approved by the MLRO and, depending on the institution, internal audit.
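The false positive, false negative and sensitivity checks above can be sketched as follows. The data shape (pairs of risk level and a SAR flag), the factor keys and the function names are assumptions; a real backtest would read from the customer and case-management systems.

```python
WEIGHTS_PCT = {
    "customer_type": 15, "industry": 15, "pep_sanctions": 20,
    "ubo_transparency": 10, "geography": 20, "product_service": 15,
    "account_history": 5,
}


def level_for(scores):
    """Risk level for a set of 1-5 factor scores (inclusive upper bounds)."""
    total = sum(scores[f] * WEIGHTS_PCT[f] for f in WEIGHTS_PCT) / 100
    if total <= 2.0:
        return "low"
    if total <= 3.5:
        return "medium"
    return "high" if total <= 4.5 else "very high"


def backtest(history):
    """history: list of (risk_level, produced_sar) pairs for a past window."""
    flagged = [sar for level, sar in history if level in ("high", "very high")]
    sar_levels = [level for level, sar in history if sar]
    return {
        # share of high-risk flags that never led to a SAR
        "false_positive_rate": flagged.count(False) / len(flagged) if flagged else None,
        # share of SAR cases the model had rated low risk
        "false_negative_rate": sar_levels.count("low") / len(sar_levels) if sar_levels else None,
    }


def one_point_flips(scores):
    """Factors whose one-point increase (capped at 5) changes the risk level."""
    base = level_for(scores)
    return [
        factor
        for factor, value in scores.items()
        if value < 5 and level_for({**scores, factor: value + 1}) != base
    ]
```

A customer sitting exactly on a band boundary will flip on any one-point change, so sensitivity results are most useful when summarised across the whole portfolio rather than read customer by customer.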

EU AMLD5 Article 8 requires risk assessment methodology to be documented and updated. FCA's SYSC chapters expect the same. The documentation is not optional — it is what a supervisor reviews.

Step 7: Operational Integration and Continuous Monitoring

Once live:

Onboarding integration. Score calculated at application time; risk level branches the KYC flow (standard vs EDD). High risk triggers additional documentation, senior management approval.

Periodic re-rating. Customer risk is not static. Annually (six-monthly for high risk), score is recalculated. Factor values may have changed (customer changed country, gained PEP status, increased transaction profile).

Event-driven re-rating. Specific events trigger immediate recalculation — SAR filed on the customer, large anomaly, adverse media hit, new sanctions match.
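Periodic and event-driven re-rating can be combined in one check. In this sketch the review intervals follow the cadence described in this article (annual, six-monthly for high risk, quarterly for very high); the event names and function names are hypothetical.

```python
import calendar
from datetime import date

REVIEW_MONTHS = {"low": 12, "medium": 12, "high": 6, "very high": 3}

# Hypothetical event codes that force an immediate recalculation.
EVENT_TRIGGERS = {"sar_filed", "large_anomaly", "adverse_media_hit", "sanctions_match"}


def add_months(d, months):
    """Shift a date by whole months, clamping to the last day of the month."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def next_review(last_review, level):
    """Date of the next scheduled re-rating for a given risk level."""
    return add_months(last_review, REVIEW_MONTHS[level])


def needs_rerating(today, last_review, level, events=()):
    """True if a trigger event occurred or the periodic review is due."""
    if EVENT_TRIGGERS.intersection(events):
        return True
    return today >= next_review(last_review, level)
```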

Monitoring threshold binding. Risk level drives transaction monitoring thresholds. A high-risk customer's £100K single transfer triggers an alert; for a low-risk customer the threshold may be £500K. This is one of the strongest false positive reduction techniques.
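To illustrate the effect, the sketch below compares a flat alert threshold with risk-bound thresholds on the same transactions. The £100K (high) and £500K (low) figures come from the paragraph above; the medium and very-high values, and the sample transactions, are invented for illustration.

```python
# Single-transfer alert thresholds in GBP per risk level.
RISK_BOUND_GBP = {
    "low": 500_000,
    "medium": 250_000,    # assumed value
    "high": 100_000,
    "very high": 50_000,  # assumed value
}
FLAT_GBP = {level: 100_000 for level in RISK_BOUND_GBP}


def triggers_alert(level, amount_gbp, thresholds=RISK_BOUND_GBP):
    """True if a transfer meets or exceeds the threshold for the customer's level."""
    return amount_gbp >= thresholds[level]


def alert_count(transactions, thresholds):
    """Number of (risk_level, amount) transactions that would alert."""
    return sum(triggers_alert(level, amount, thresholds) for level, amount in transactions)


transactions = [("low", 150_000), ("low", 600_000), ("high", 120_000), ("medium", 90_000)]
print(alert_count(transactions, FLAT_GBP), alert_count(transactions, RISK_BOUND_GBP))
```

In this toy sample the risk-bound thresholds drop the alert on the low-risk customer's £150K transfer while keeping the high-risk one.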

Case management. Higher risk levels mean closer transaction monitoring; an assigned analyst maintains a portfolio of high-risk customers; periodic review calendars auto-generate.

Model performance tracking. Monthly dashboard: risk-level distribution, SAR-to-score correlation, false positive trend, model drift. Reported quarterly to the AML governance committee.
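One common way to put a number on drift in the risk-level distribution is the population stability index (PSI). It is offered here as one option, not as the method this article prescribes; the baseline and current figures are invented.

```python
import math


def psi(baseline, current, floor=1e-4):
    """Population stability index between two distributions given as fractions.

    Each share is floored to avoid log(0) when a level is empty.
    """
    total = 0.0
    for level, expected in baseline.items():
        e = max(expected, floor)
        a = max(current.get(level, 0.0), floor)
        total += (a - e) * math.log(a / e)
    return total


baseline = {"low": 0.75, "medium": 0.18, "high": 0.06, "very high": 0.01}
current = {"low": 0.60, "medium": 0.25, "high": 0.13, "very high": 0.02}
print(round(psi(baseline, current), 3))
```

Rule-of-thumb cut-offs such as 0.1 and 0.25 are often quoted for PSI, but the level that triggers a governance review is the institution's own choice and should be documented.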

Worked Example: Scoring a Customer

To make the method concrete, work through a typical customer:

Profile. James K., 47, UK national, resident in London, owner of a construction-materials import SME, expected monthly transaction volume £150K-400K, five-year relationship with the bank, no SAR history, business partners in the UAE and Pakistan.

Factor scores:

Factor	Value	Score	Weight	Contribution
Customer type	SME-owning individual	2	15%	0.30
Industry	Construction materials import	3	15%	0.45
PEP status	None	1	20%	0.20
UBO transparency	Clear (self-owned)	1	10%	0.10
Geographic risk	UK resident + UAE/Pakistan ties	3	20%	0.60
Product/service	Standard current account + SWIFT	2	15%	0.30
Account history	5 years clean	1	5%	0.05

Total score: 2.00 → just below medium → low risk segment.

Treatment: standard CDD. Annual review. High-value transactions (£500K+) trigger additional scrutiny because of the geographic partnership profile.

Now vary the scenario: the UAE partner is receiving payment via an intermediary incorporated in a FATF grey-list jurisdiction. Geographic risk score 3 → 4. New total: 2.20 → now in the medium band. Treatment shifts: standard CDD + six-monthly review.

Second scenario (on top of the first): James K. is appointed to a local council seat, which this example treats as a domestic PEP (whether a local office qualifies depends on the institution's PEP policy). PEP score 1 → 3 (domestic PEP, risk-based per FATF Recommendation 12 guidance). New total: 2.60 → well inside the medium band. The six-monthly review stays in place; transaction monitoring sensitivity is raised.

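Third scenario (on top of the first two): James K. signs an export contract with a Syrian counterparty. Geographic risk jumps to 5 (sanctions-affected jurisdiction). New total: 2.80, which the weighted sum alone still places in the medium band. Because the weights dilute a single severe factor, sanctions exposure is better handled as an override that forces high-risk treatment regardless of the total: EDD trigger; senior management approval for continued relationship.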

This worked example shows how the model responds to real customer behaviour — it is dynamic, not a static snapshot.
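The base profile and the first two scenarios can be reproduced in a few lines, which is a useful sanity check on any published worked example. The factor keys are hypothetical, and the scenarios are applied cumulatively, as the totals in the text imply.

```python
WEIGHTS_PCT = {
    "customer_type": 15, "industry": 15, "pep_sanctions": 20,
    "ubo_transparency": 10, "geography": 20, "product_service": 15,
    "account_history": 5,
}

BASE = {
    "customer_type": 2, "industry": 3, "pep_sanctions": 1,
    "ubo_transparency": 1, "geography": 3, "product_service": 2,
    "account_history": 1,
}


def contributions(scores):
    """Per-factor contribution to the total (score x weight)."""
    return {f: scores[f] * WEIGHTS_PCT[f] / 100 for f in WEIGHTS_PCT}


def total(scores):
    """Total weighted score, computed on whole percentages to avoid rounding drift."""
    return sum(scores[f] * WEIGHTS_PCT[f] for f in WEIGHTS_PCT) / 100


scenario_1 = {**BASE, "geography": 4}           # grey-list intermediary
scenario_2 = {**scenario_1, "pep_sanctions": 3}  # domestic PEP on top

for name, scores in [("base", BASE), ("scenario 1", scenario_1), ("scenario 2", scenario_2)]:
    print(name, total(scores))
```

The per-factor `contributions` output doubles as an explanation record: it shows which factors drove a given total.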

Model Governance: Who Owns Which Decision?

The risk scoring model is not a technical tool; it is a regulator-exposed decision structure. Governance lines:

MLRO. Owner of the model documentation; every material change (new factor, weight change, threshold shift) requires MLRO sign-off. In a supervisory review the MLRO is accountable for explaining the model.

Compliance. Tracks model application day-to-day; monthly reporting; pattern detection (model drift, anomalous analyst output, segment performance).

Risk function. Engages with the model from the institution-wide risk perspective, particularly the proportion of customers landing in high-risk segments. Risk committee (typically monthly) reviews model performance.

Internal audit. Annual model validation; independent test results reported to the board.

IT / data engineering. Operates the technical implementation; data-quality monitoring, model integration, dashboard production. Not a decision-maker on model parameters — operational maintenance only.

Board / governance committee. Annual report on model changes; approval threshold defined. Board adjusts risk appetite which feeds back into the model.

Decision-rights matrix (RACI-style):

Decision	MLRO	Compliance	Risk	IT	Board
New factor added	Approve	Recommend	Consult	Implement	Inform
Factor weight change	Approve	Recommend	Consult	Implement	Inform
Threshold change	Approve	Recommend	Consult	Implement	(if material) Inform
Annual model validation	Approve	Participate	Participate	Data supply	Outcome inform
Model retirement / redesign	Approve	Participate	Participate	Implement	Approve

This structure produces a clean answer to "who owns the model and who took the decisions" at supervisory review.

How Risk Scoring Connects to Other AML Processes

Risk scoring is not isolated — it wires into:

Onboarding flow: the risk score branches the KYC path (standard vs EDD)
Screening intensity: higher risk → lower match threshold; lower risk → higher
Transaction monitoring thresholds: risk level sets the trigger amounts for monitoring rules
Review cycles: high risk annual → six-monthly, very high → quarterly
SAR assessment: when a transaction looks suspicious, a high-risk rating moves the case to SAR review faster
EDD depth: source of funds documentation depth, monitoring cadence calibrated to risk level

For the combined approach see the segment-based threshold table in our how to reduce AML false positives article — it maps risk level + list type combinations to threshold matrices.

Common Mistakes

Geography + PEP only. Industry and account behaviour are critical; models without them underdiscriminate.

Too few segments. A single model across all customers is either over-sensitive (alert flood) or under-sensitive (missed risk).

Frozen weights. Without annual recalibration the model goes stale — it does not adapt to changing risk patterns.

"Nothing" for low risk. Low risk means standard monitoring, not absence of monitoring. In a supervisory review, "why was no monitoring applied?" has no defensible answer.

Documentation gap. If how the model works is not written down, decisions look unreasoned at inspection.

Frequently Asked Questions

Is risk scoring automated or manual?

Hybrid. Calculation is automated. Some factor assessments (e.g. UBO transparency, behaviour anomaly) combine an algorithm with analyst judgement. High risk levels require senior management approval, and that step is manual. The hybrid approach scales while keeping a human accountable for the decision.

Is the risk score updated daily?

No; daily recalculation is expensive and unnecessary. Standard practice: a regular annual review (six-monthly for high risk), plus event-driven recalculation (new PEP status, sanctions match, anomaly, SAR). Some institutions run a nightly batch to refresh changing factors (geography, expected volume), but this is optional.

Are AI/ML models required?

No. Rule-based weighted-sum models are entirely accepted and most institutions use them. ML models (gradient boosting, neural networks) can offer better predictive power but must be explainable — at supervisory review you must articulate why this customer scored this way. Black-box ML models are problematic for compliance use.

What if a low-risk customer triggers a SAR requirement?

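File the SAR. Risk level does not change the SAR obligation. Risk level determines proactive monitoring intensity; once a specific transaction is suspicious it is reported regardless of risk level. The reporting obligation (in the UK under the Proceeds of Crime Act 2002 and the Terrorism Act 2000, in the EU under the AML Directive as transposed nationally, and equivalents elsewhere) does not depend on the customer's risk rating.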

Should we share the risk score with the customer?

No. The risk score is an internal assessment and is not shared. When additional documentation is requested for a high-risk customer, "as part of our internal AML assessment" is a sufficient general rationale. Sharing the specific score or formula is operationally risky and goes beyond what regulators expect.

How Legichain Helps

Legichain's AML platform includes a configurable risk scoring engine. Factor templates for customer type, geography (with auto-updating FATF grey/black list status), PEP/sanctions integration, product and behaviour are built in; weights and thresholds are configurable from the admin panel. Segment templates align with EU AMLD5 Annex III and UK JMLSG Chapter 4 high-risk indicators.

Periodic recalculation runs on cron; event-driven triggers (sanctions match, anomaly score) launch re-rating. Model documentation auto-generates as PDF — the inspection-ready model report is a single click.

Legichain Team · Compliance editorial

Written by Legichain's compliance editorial team — regulated-financial-services veterans who built and integrated AML platforms for banks and crypto exchanges across EMEA.
