Scoring Methodology

How VIGILIX calculates risk scores and determines matches across sanctions lists, PEP databases, and adverse news sources.

Overview

VIGILIX uses a multi-layered screening approach. Each person is checked against three independent types of data source: sanctions lists, PEP data, and adverse news. The final risk assessment combines name matching, date of birth verification, PEP status, and adverse media findings.

1. Name Normalization

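Names are stripped of accents, diacritics, and punctuation, then converted to lowercase before any comparison. This ensures that "Müller" and "MULLER" are treated identically. A transliterated spelling such as "Mueller" still differs by one letter after accent stripping and is left to the fuzzy matching step.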
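
A minimal sketch of this normalization step, using only the Python standard library. The function name `normalize_name` and the choice to replace punctuation with a space (rather than delete it) are assumptions made for illustration, not details confirmed for VIGILIX.

```python
import unicodedata


def normalize_name(name: str) -> str:
    """Strip accents, diacritics and punctuation, then lowercase."""
    # NFKD splits "ü" into "u" plus a combining mark (category "Mn").
    decomposed = unicodedata.normalize("NFKD", name)
    without_marks = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    # Assumption: punctuation becomes a space so "Al-Qadhafi" keeps two tokens.
    cleaned = "".join(
        ch if ch.isalnum() or ch.isspace() else " " for ch in without_marks
    )
    # Lowercase and collapse repeated whitespace.
    return " ".join(cleaned.lower().split())
```

With this sketch, "Müller" and "MULLER" both normalize to "muller".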

2. Fuzzy Name Matching

Multiple name combinations are tested (First Last, Last First, Last only) using sequence matching and token-sort algorithms. The best score across all combinations is retained.

3. Date of Birth Adjustment

If a date of birth is provided, the name score is adjusted up or down based on whether the DOB matches, partially matches, or conflicts with the list entry.

4. Verdict Assignment

A final verdict (ALERTE / ATTENTION / OK) is assigned based on the adjusted score and the configured similarity threshold.

Name Matching Score

The name score (0–100) measures how similar the queried name is to an entry in the sanctions list. Two algorithms are used and the higher score is retained:

    from rapidfuzz import fuzz  # C++ implementation, 10-20x faster than difflib

    # Algorithm 1: RapidFuzz ratio (already on a 0-100 scale)
    score_seq = fuzz.ratio(query_name, list_name)

    # Algorithm 2: token sort (handles word order variations)
    score_tok = fuzz.token_sort_ratio(sorted_query, sorted_list)

    # Best score across all name combinations
    name_score = max(score_seq, score_tok)

    # First name penalty (reduces false positives)
    # Applied only if both first names are provided and their similarity is below 80%.
    # first_name_similarity is a fraction between 0 and 1.
    if first_name_similarity < 0.80:
        penalty = (0.80 - first_name_similarity) * 50  # about 12 pts for Bashar/Maher
        name_score = name_score - penalty  # e.g. Assad Bashar vs Assad Maher: about -12 pts

    # Tested combinations
    #   "First Last" vs "First Last"
    #   "Last First" vs "Last First"
    #   "Last" vs "Last"
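
The scoring steps above can be sketched end to end as follows. To keep the example dependency-free, it uses `difflib.SequenceMatcher` from the standard library as a stand-in for RapidFuzz, so individual scores may differ slightly from what VIGILIX reports. The function names and the assumption that inputs are already normalized are illustrative only.

```python
from difflib import SequenceMatcher


def ratio(a: str, b: str) -> float:
    """Similarity on a 0-100 scale (difflib stand-in for fuzz.ratio)."""
    return SequenceMatcher(None, a, b).ratio() * 100


def token_sort_ratio(a: str, b: str) -> float:
    """Compare the names after sorting their words alphabetically."""
    return ratio(" ".join(sorted(a.split())), " ".join(sorted(b.split())))


def name_score(q_first: str, q_last: str, e_first: str, e_last: str) -> float:
    """Best score over the tested combinations, minus the first name penalty."""
    combos = [
        (f"{q_first} {q_last}".strip(), f"{e_first} {e_last}".strip()),
        (f"{q_last} {q_first}".strip(), f"{e_last} {e_first}".strip()),
        (q_last, e_last),
    ]
    best = max(max(ratio(q, e), token_sort_ratio(q, e)) for q, e in combos)
    # The penalty applies only when both first names are provided.
    if q_first and e_first:
        similarity = ratio(q_first, e_first) / 100
        if similarity < 0.80:
            best -= (0.80 - similarity) * 50
    # Assumption: the score never drops below zero.
    return max(best, 0.0)
```

Under this stand-in, "bashar" and "maher" are about 55% similar, which gives a penalty of roughly 12.7 points. A query with no first name receives no penalty at all.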

Examples

| Query | List Entry | Score | Why |
|---|---|---|---|
| Assad, Bashar | Assad, Bashar Al- | 100% | Exact match after normalization |
| Assad, Bashar | Assad, Maher | 88% | Same family name, different first name: first name penalty applied (−12 pts). Returned as ATTENTION, not ALERTE. |
| Putin, Vladimir | Poutine, Vladimir | 93% | Transliteration variant: same first name, no penalty |
| Al-Qadhafi, Muammar | Kadhafi, Mouammar | 82% | Arabic transliteration variant |
| Assad (no first name, no DOB) | Assad, Maher | 100% | If no first name is provided, no penalty is applied. Provide a DOB to disambiguate. |
| Smith, John | Poutine, Vladimir | 38% | No match, well below threshold |

Date of Birth Adjustment

Providing a date of birth significantly improves accuracy by confirming or invalidating potential matches. The system accepts multiple date formats (YYYY-MM-DD, DD.MM.YYYY, DD/MM/YYYY).

✓ MATCH: +15 pts. Exact date match. Significantly increases confidence. A name score of 80% becomes 95%.
~ YEAR ONLY: +5 pts. Same birth year, different day/month. Slight confidence boost.
✗ MISMATCH: −25 pts. Different date. Strong indicator of a false positive. A score of 88% drops to 63%, which is below the threshold.

💡 Tips: Always provide a date of birth when available. It is the single most effective way to reduce false positives. Always provide a first name too: without it, the first name penalty is not applied (e.g. "Assad" with no first name will match both Bashar and Maher at 100%).
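
The adjustment rules above can be sketched as a small function. The names `parse_dob` and `adjust_for_dob`, the behaviour when either date is missing (score left unchanged), and the clamping to 0–100 are assumptions for illustration. The source does not state how VIGILIX handles those cases.

```python
from datetime import date, datetime
from typing import Optional

# The three accepted input formats: YYYY-MM-DD, DD.MM.YYYY, DD/MM/YYYY.
DATE_FORMATS = ("%Y-%m-%d", "%d.%m.%Y", "%d/%m/%Y")


def parse_dob(text: str) -> date:
    """Parse a date written in any of the accepted formats."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    raise ValueError(f"Unsupported date format: {text!r}")


def adjust_for_dob(name_score: float,
                   query_dob: Optional[str],
                   entry_dob: Optional[str]) -> float:
    """Apply the +15 / +5 / -25 adjustment to a name score."""
    # Assumption: without both dates, the score is left unchanged.
    if not query_dob or not entry_dob:
        return name_score
    q, e = parse_dob(query_dob), parse_dob(entry_dob)
    if q == e:
        delta = 15      # exact match
    elif q.year == e.year:
        delta = 5       # same year only
    else:
        delta = -25     # mismatch
    # Assumption: the adjusted score is clamped to the 0-100 range.
    return max(0.0, min(100.0, name_score + delta))
```

For example, a score of 80 with matching dates becomes 95, and a score of 88 with conflicting dates becomes 63. The dates used to exercise the sketch are hypothetical.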

Verdict & Alert Levels

After DOB adjustment, the final score determines the verdict. The threshold is configurable (default: 80%).

≥ 95: 🚨 ALERTE. Near-certain match. Immediate manual review required. Do not proceed without compliance officer sign-off.
80–94: ⚠ ATTENTION. Probable match. Manual verification required. Could be a transliteration variant or common surname.
< 80: ✓ OK. No significant match found. Person is not identified in the screened lists above the threshold.

⚠️ Important: An OK verdict does not guarantee the person is not sanctioned. It means no match was found above the configured threshold in the currently loaded lists. Lists must be kept up to date.
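
The verdict bands can be expressed as a short function. This is a sketch: the function name and the idea that only the lower threshold is configurable while the ALERTE level stays at 95 are assumptions, since the source only says that "the threshold" is configurable.

```python
def assign_verdict(score: float,
                   threshold: float = 80.0,
                   alert_level: float = 95.0) -> str:
    """Map an adjusted score to ALERTE / ATTENTION / OK."""
    if score >= alert_level:
        return "ALERTE"
    if score >= threshold:
        return "ATTENTION"
    return "OK"
```

With the default settings, a score of 88 yields ATTENTION, and the same entry after a DOB mismatch (63) yields OK.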

Sanctions Lists

VIGILIX queries four official sanctions lists, automatically updated every night at 2:00 AM (server time).

| List | Source | Legal Basis | Update Frequency |
|---|---|---|---|
| 🇨🇭 SECO | sesam.search.admin.ch | Embargo Act (LEmb), SR 946.231 | Weekly or on change |
| 🇪🇺 European Union | webgate.ec.europa.eu | EU Council Regulations | Daily |
| 🇺🇳 United Nations | scsanctions.un.org | UN Security Council Resolutions | On change |
| 🇺🇸 OFAC / SDN | treasury.gov | IEEPA, TWEA, UN Participation Act | Multiple times per week |

PEP Screening

Politically Exposed Persons (PEPs) are individuals who hold or have held prominent public functions. Under Swiss AML law (LBA Art. 2a), financial intermediaries must identify PEPs and apply enhanced due diligence.

VIGILIX queries Wikidata (the structured database behind Wikipedia) in real time to identify whether a person has held or currently holds a political position.

⚠️ Wikidata coverage is excellent for prominent figures but may be incomplete for lower-profile local politicians. For high-risk clients, complement with commercial PEP databases.
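
To illustrate what a Wikidata lookup of this kind can look like, the sketch below builds a SPARQL query for the public Wikidata Query Service. It relies on the Wikidata properties P31 (instance of), Q5 (human) and P39 (position held). The exact-label match, the function names and the result limit are assumptions. The source does not describe the query VIGILIX actually sends.

```python
from urllib.parse import urlencode

WIKIDATA_SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"


def build_pep_query(full_name: str, lang: str = "en") -> str:
    """Build a SPARQL query listing positions held by a person with this label."""
    # Escape backslashes and double quotes so the name is a valid SPARQL literal.
    literal = full_name.replace("\\", "\\\\").replace('"', '\\"')
    return f"""
SELECT ?person ?positionLabel WHERE {{
  ?person rdfs:label "{literal}"@{lang} ;
          wdt:P31 wd:Q5 ;        # instance of: human
          wdt:P39 ?position .    # position held
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "{lang}". }}
}}
LIMIT 20
""".strip()


def build_pep_url(full_name: str) -> str:
    """Return a GET URL for the query; no request is sent here."""
    params = {"query": build_pep_query(full_name), "format": "json"}
    return WIKIDATA_SPARQL_ENDPOINT + "?" + urlencode(params)
```

An exact-label query like this one misses spelling variants, so a production lookup would probably need alias or fuzzy handling on top.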

Adverse News Screening

VIGILIX searches for negative media coverage from two sources simultaneously.

Results from both sources are merged, deduplicated by domain, and scored. The search engine used is displayed alongside each article. Six risk categories are screened:

| Category | Keywords searched |
|---|---|
| ⛔ Sanctions | sanction, embargo, asset freeze, blacklist, OFAC, SECO, designated, restricted |
| 🚔 Crime | fraud, criminal, convicted, trial, sentenced, arrested, charged, guilty, prison |
| 💸 Corruption | corruption, bribery, bribe, embezzlement, kickback, misappropriation, kleptocracy |
| 🏦 Laundering | money laundering, illicit finance, financial crime, shell company, offshore |
| 💣 Terrorism | terrorism, terrorist, terror financing, extremist, jihadist, ISIS, Al-Qaeda |
| 🔎 Investigation | investigation, indicted, probe, inquiry, accused, suspect, warrant, Interpol |
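
A simple way to tag an article with these categories is a keyword lookup over its title and snippet. The sketch below is an assumption about how such tagging could work: the keyword lists come from the table, while the function name and the word-prefix matching rule are illustrative.

```python
import re

RISK_KEYWORDS = {
    "Sanctions": ["sanction", "embargo", "asset freeze", "blacklist", "OFAC",
                  "SECO", "designated", "restricted"],
    "Crime": ["fraud", "criminal", "convicted", "trial", "sentenced",
              "arrested", "charged", "guilty", "prison"],
    "Corruption": ["corruption", "bribery", "bribe", "embezzlement",
                   "kickback", "misappropriation", "kleptocracy"],
    "Laundering": ["money laundering", "illicit finance", "financial crime",
                   "shell company", "offshore"],
    "Terrorism": ["terrorism", "terrorist", "terror financing", "extremist",
                  "jihadist", "ISIS", "Al-Qaeda"],
    "Investigation": ["investigation", "indicted", "probe", "inquiry",
                      "accused", "suspect", "warrant", "Interpol"],
}


def detect_categories(text: str) -> list[str]:
    """Return the categories whose keywords start a word in the text."""
    found = []
    for category, keywords in RISK_KEYWORDS.items():
        # \b anchors the keyword at a word start, so "sanction" also
        # matches "sanctions" but "ISIS" does not match "crisis".
        if any(re.search(r"\b" + re.escape(kw), text, re.IGNORECASE)
               for kw in keywords):
            found.append(category)
    return found
```

Short acronyms remain a weak point of prefix matching: "SECO" would still match "second", so whole-word matching may be preferable for those entries.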

Scoring algorithm

Unlike the sanctions score, the adverse news score is based on source reliability and article freshness — not the number of categories found. Categories are used only to filter out irrelevant results.

| Factor | Weight | How it works |
|---|---|---|
| Source reliability | 80% | Reuters, BBC, OCCRP → 88-92 pts. Government sources → 95-100 pts. Unknown blogs → 45 pts. |
| Article freshness | 20% | <1 month → 95 pts. <1 year → 75 pts. <5 years → 45 pts. Older → 15-30 pts. |

The final score is: score = source_reliability × 0.80 + date_freshness × 0.20

A HIGH risk article from Reuters published within the last month scores about 90 (for example, 90 × 0.80 + 95 × 0.20 = 91). The same article from an unknown blog scores 55 (45 × 0.80 + 95 × 0.20).
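
The formula can be checked with a few lines of Python. The day counts used for the freshness buckets (30, 365 and 5 × 365) and the value returned for older articles are assumptions. The source gives only a 15-30 range for older articles without stating how the value is chosen.

```python
def freshness_points(age_days: int, older_points: float = 30.0) -> float:
    """Map article age to freshness points using the documented buckets."""
    if age_days < 30:
        return 95.0
    if age_days < 365:
        return 75.0
    if age_days < 5 * 365:
        return 45.0
    # Assumption: the caller picks a value in the documented 15-30 range.
    return older_points


def adverse_news_score(source_reliability: float, date_freshness: float) -> float:
    """score = source_reliability x 0.80 + date_freshness x 0.20"""
    return source_reliability * 0.80 + date_freshness * 0.20
```

For a source rated 90 and an article less than a month old, this gives 91. For an unknown blog rated 45 with the same freshness, it gives 55.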

Name detection

To include an article, the person's name must appear in the search result. The system checks in this order:

  1. Exact match in title — last name appears verbatim in the article title
  2. Exact match in snippet — last name appears in the short extract returned by the search engine
  3. First name + last name in title — both provided names appear in the title
  4. Fuzzy match on title — handles transliterations (e.g. "Poutine" → "Putin"). Similarity threshold: 85%
  5. Compound names — all parts of a compound last name appear anywhere in title + snippet (e.g. "Salinas" + "Pliego")
⚠️ Limitation: If the name only appears in the body of the article (not in the title or snippet returned by the search engine), the article will be missed. This is a constraint of search engine APIs which only return short excerpts.
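
The five checks above can be sketched as follows. This version returns every rule that matches, in the documented order, and an article would be included when the list is non-empty. `difflib.SequenceMatcher` stands in for the fuzzy matcher, and comparing the last name against each word of the title is an assumption. The source does not say which strings VIGILIX compares in the fuzzy step.

```python
import re
from difflib import SequenceMatcher


def matched_rules(first: str, last: str, title: str, snippet: str,
                  fuzzy_threshold: float = 0.85) -> list[str]:
    """Return the names of the detection rules satisfied by a search result."""
    first, last = first.lower().strip(), last.lower().strip()
    if not last:
        return []
    title, snippet = title.lower(), snippet.lower()
    title_words = re.findall(r"\w+", title)
    parts = re.findall(r"\w+", last)  # components of a compound last name

    checks = [
        ("exact_title", last in title),
        ("exact_snippet", last in snippet),
        ("first_last_title", bool(first) and first in title and last in title),
        # Assumption: fuzzy comparison of the last name with each title word.
        ("fuzzy_title", any(
            SequenceMatcher(None, last, word).ratio() >= fuzzy_threshold
            for word in title_words)),
        # Compound names: every part appears somewhere in title + snippet.
        ("compound", len(parts) > 1 and all(
            part in f"{title} {snippet}" for part in parts)),
    ]
    return [rule for rule, ok in checks if ok]
```

The names used to exercise the sketch are fictitious. A compound last name split across title and snippet is caught only by the last rule, while a one-letter spelling variant in the title is caught only by the fuzzy rule.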