Overview#
A single money-laundering investigation might span hundreds of documents referencing the same individual as "J. Smith", "John Smith", "Mr J. Smith", and a company directorship record listing "Jonathan A. Smith". Each reference is a data point. Together, they form a profile. The AI Entity Extraction platform identifies all of those mentions, resolves them to a single canonical entity, and maps that entity's relationships to other entities, without requiring an analyst to manually cross-reference each source.
Purpose-built for compliance teams, intelligence analysts, and data enrichment applications, this system recognises and resolves entities across 17 entity types in 94 languages, transforming unstructured text into structured, queryable entity databases aligned to the POLE model (Person, Organisation, Location, Object, Event).
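To make the resolution step concrete, here is a minimal sketch of how name variants like the ones above could be grouped as candidates for the same person. It is not the platform's method: the `blocking_key` and `group_mentions` functions, the honorifics list, and the "surname plus first initial" heuristic are all assumptions chosen for illustration.

```python
import re
from collections import defaultdict

# Hypothetical honorifics list for illustration only; a real system would
# need a far larger, locale-aware set.
HONORIFICS = {"mr", "mrs", "ms", "dr"}


def blocking_key(mention: str) -> tuple[str, str]:
    """Return (surname, first initial) for a personal-name mention."""
    tokens = [t.lower() for t in re.findall(r"[A-Za-z]+", mention)]
    tokens = [t for t in tokens if t not in HONORIFICS]
    if not tokens:
        raise ValueError(f"no name tokens in {mention!r}")
    # Assumption: the last token is the surname, the first is a given name
    # or its initial. This does not hold for all naming conventions.
    return tokens[-1], tokens[0][0]


def group_mentions(mentions: list[str]) -> dict[tuple[str, str], list[str]]:
    """Group mentions that share a blocking key, preserving input order."""
    groups: dict[tuple[str, str], list[str]] = defaultdict(list)
    for mention in mentions:
        groups[blocking_key(mention)].append(mention)
    return dict(groups)


mentions = ["J. Smith", "John Smith", "Mr J. Smith", "Jonathan A. Smith"]
candidates = group_mentions(mentions)
# All four mentions share the key ("smith", "j").
```

A key this coarse only generates candidates: "Jane Smith" would land in the same group, so real disambiguation needs further evidence such as dates of birth, addresses, or co-occurring entities.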
Diagram
flowchart TD
A[Unstructured Document] --> B[Named Entity Recognition]
B --> C[17 Entity Types Identified]
C --> D[Entity Resolution & Disambiguation]
D --> E[Canonical Entity Profiles]
E --> F[Relationship Extraction]
F --> G[Knowledge Graph Construction]
G --> H[Knowledge Base Linking]
H --> I[Structured Entity Database]
I --> J[Cross-Document Tracking]
Key Features#
- Advanced Named Entity Recognition (NER): Identifies and classifies entity mentions across 17 standard and domain-specific types including persons, organisations, locations, dates, monetary amounts, and specialised types like IBAN, SWIFT codes, cryptocurrency addresses, case numbers, and statute references. Handles nested entities, abbreviated forms, and multilingual text across 94 languages.
- Entity Resolution and Disambiguation: Maps entity mentions to unique real-world entities, merging different references to the same entity and linking to external knowledge bases. Handles name variations, homonyms, abbreviations, pronouns, and cross-document entity tracking to create unified entity profiles across document collections.
- Entity Relationship Extraction: Identifies and classifies connections between entities including corporate structures, employment, financial transactions, legal relationships, personal connections, and geographic associations. Builds knowledge graphs representing how entities interact, relate, or transact, with temporal tracking of when relationships began or ended.
- Domain-Specific Entity Types: Specialised recognition for financial services (IBAN, SWIFT, cryptocurrency addresses, ticker symbols), legal (case numbers, statutes, citations), healthcare (patient IDs, diagnosis codes, medications), and identity documents (passports, national IDs, tax IDs).
- Cross-Document Entity Tracking: Tracks entities across entire document collections and identifies when entities mentioned differently across documents refer to the same real-world entity.
- Knowledge Base Linking: Links extracted entities to authoritative external knowledge bases for enrichment, providing additional context and structured properties for identified entities.
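Recognising a structured identifier such as an IBAN usually involves more than pattern matching, because the format carries a checksum. The sketch below shows the standard mod-97 check defined for IBANs (ISO 13616). How the platform itself validates candidates is not documented here, and the function name `is_valid_iban` is an assumption. The check covers only general shape and checksum, not country-specific lengths.

```python
def is_valid_iban(candidate: str) -> bool:
    """Check an IBAN candidate's general shape and mod-97 checksum."""
    iban = candidate.replace(" ", "").upper()
    # IBANs are 15 to 34 ASCII alphanumeric characters.
    if not (15 <= len(iban) <= 34) or not iban.isascii() or not iban.isalnum():
        return False
    # Two-letter country code followed by two check digits.
    if not (iban[:2].isalpha() and iban[2:4].isdigit()):
        return False
    # Move the first four characters to the end, then map letters to
    # numbers (A=10 ... Z=35); int(ch, 36) performs exactly that mapping.
    rearranged = iban[4:] + iban[:4]
    digits = "".join(str(int(ch, 36)) for ch in rearranged)
    # A valid IBAN leaves a remainder of 1 when divided by 97.
    return int(digits) % 97 == 1


print(is_valid_iban("GB82 WEST 1234 5698 7654 32"))  # widely published example IBAN
print(is_valid_iban("GB82 WEST 1234 5698 7654 33"))  # last digit altered
```

A checksum pass rules out most transcription errors and random digit strings, which helps keep false positives down when scanning free text.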
Use Cases#
Financial Services Compliance#
Automatically extract entities from transaction records, compliance documents, and correspondence to identify parties, amounts, dates, and financial identifiers. Entity resolution links mentions across documents while relationship extraction reveals hidden connections for AML and KYC investigation.
Law Enforcement Intelligence Analysis#
Extract and link entities across intelligence reports to build network maps of persons, organisations, locations, and transactions following the POLE model. Cross-document entity tracking and relationship extraction reveal patterns and connections across disparate information sources, supporting link analysis and prosecution file preparation.
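As an illustration of the link analysis described above, extracted relationships can be treated as edges in a graph and searched for the shortest chain connecting two entities. This is a sketch under assumptions: the triple format, the entity names, the relationship labels, and the `build_graph` and `shortest_link` functions are all hypothetical and do not reflect the platform's data model.

```python
from collections import defaultdict, deque

# Hypothetical extracted relationships: (entity, relationship, entity).
RELATIONSHIPS = [
    ("John Smith", "director_of", "Acme Ltd"),
    ("Acme Ltd", "paid", "Borealis Holdings"),
    ("Borealis Holdings", "registered_at", "12 Harbour Road"),
    ("Jane Doe", "director_of", "Borealis Holdings"),
]


def build_graph(relationships):
    """Build an undirected adjacency map from relationship triples."""
    graph = defaultdict(set)
    for source, _relation, target in relationships:
        graph[source].add(target)
        graph[target].add(source)
    return graph


def shortest_link(graph, start, goal):
    """Breadth-first search for the shortest chain of entities, or None."""
    if start == goal:
        return [start]
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        # Sort neighbours so the result is deterministic.
        for neighbour in sorted(graph.get(path[-1], ())):
            if neighbour in seen:
                continue
            if neighbour == goal:
                return path + [neighbour]
            seen.add(neighbour)
            queue.append(path + [neighbour])
    return None


graph = build_graph(RELATIONSHIPS)
chain = shortest_link(graph, "John Smith", "Jane Doe")
# chain: ["John Smith", "Acme Ltd", "Borealis Holdings", "Jane Doe"]
```

The edges here are undirected and unlabelled for brevity; keeping the relationship type and time span on each edge would let an analyst filter chains by kind or period.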
Due Diligence Operations#
Accelerate due diligence by automatically extracting key parties, amounts, dates, and relationships from contracts and corporate filings. Entity resolution merges information about the same entity from multiple sources into comprehensive profiles.
Healthcare Fraud Investigation#
Extract patient identifiers, provider details, billing codes, and transaction amounts from claims data to identify anomalies, duplicate billing, and relationships between fraudulent provider networks.
Integration#
Programmatic API access is available for real-time and batch entity extraction, entity resolution, relationship extraction, knowledge graph construction, and entity search across document collections. SDK libraries are available for Python, Node.js, Java, and Go. Pre-built integrations cover document management systems, case management platforms, and business intelligence tools.
Security & Compliance#
All document and entity operations are protected in transit with TLS 1.3, and stored entity data and relationships are encrypted at rest. Entity-level permissions control access to sensitive data. Automatic PII anonymisation and pseudonymisation options are available. All extractions and queries are recorded in a complete audit log. The platform is GDPR compliant, with data residency controls and an on-premise deployment option.
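For readers unfamiliar with pseudonymisation, the sketch below shows one common approach: replacing an identifier with a keyed hash (HMAC-SHA256), so the same input always maps to the same token but cannot be recomputed without the key. This is an assumption for illustration, not a description of the platform's mechanism. The `pseudonymise` function, the `PSN` prefix, the 16-character truncation, and the lowercase normalisation are all hypothetical choices.

```python
import hashlib
import hmac


def pseudonymise(value: str, key: bytes, prefix: str = "PSN") -> str:
    """Return a stable pseudonym for a PII value using a secret key."""
    # Assumption: trimming and lowercasing is enough normalisation for the
    # same person to map to the same token across documents.
    normalised = value.strip().lower().encode("utf-8")
    digest = hmac.new(key, normalised, hashlib.sha256).hexdigest()
    # Truncated for readability; shorter tokens raise the collision risk.
    return f"{prefix}-{digest[:16]}"


# Illustrative key only; a real key belongs in a secrets manager.
secret_key = b"example-key-do-not-use"
token_a = pseudonymise("John Smith", secret_key)
token_b = pseudonymise("  john smith ", secret_key)
# token_a == token_b, so records can still be joined on the pseudonym.
```

Because the mapping depends on the key, rotating or destroying the key breaks the link between tokens and original values.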
Last Reviewed: 2026-02-23 Last Updated: 2026-04-14