Overview
Two cryptocurrency wallets appear completely unrelated. Different addresses, different transaction partners, different geographic indicators. But when a forensics analyst runs structural similarity analysis, the behavioural fingerprints are nearly identical: the same distribution of transaction sizes, the same timing patterns, the same fan-out ratios when moving funds to exchange addresses. Both wallets are almost certainly controlled by a single entity that operates them as if they were independent. Without graph similarity matching, that connection never surfaces.
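As a rough illustration of the fingerprint comparison in this scenario, the sketch below reduces each wallet to a small feature vector (a normalised histogram of transaction sizes plus a fan-out ratio) and compares the vectors with cosine similarity. The function names, bucket edges, and sample values are hypothetical, and the module's actual feature set and scoring method may differ.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def fingerprint(tx_sizes, bucket_edges, fan_out_ratio):
    """Build a behavioural vector for one wallet (hypothetical feature set).

    bucket_edges must be sorted ascending; a size lands in the bucket
    whose index equals the number of edges it meets or exceeds.
    """
    counts = [0] * (len(bucket_edges) + 1)
    for size in tx_sizes:
        idx = sum(1 for edge in bucket_edges if size >= edge)
        counts[idx] += 1
    total = len(tx_sizes) or 1
    return [c / total for c in counts] + [fan_out_ratio]


# Illustrative values only: two wallets with similar behaviour, one without.
edges = [1.0, 10.0]
wallet_a = fingerprint([0.1, 0.2, 5.0, 5.5], edges, fan_out_ratio=0.8)
wallet_b = fingerprint([0.3, 0.15, 4.0, 6.0], edges, fan_out_ratio=0.75)
wallet_c = fingerprint([50, 60, 70, 80], edges, fan_out_ratio=0.1)

print(cosine_similarity(wallet_a, wallet_b))  # close to 1 for similar behaviour
print(cosine_similarity(wallet_a, wallet_c))  # much lower for dissimilar behaviour
```

A real fingerprint would also need timing features and far more transactions per wallet before a high score could be treated as evidence of common control.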
The Graph Similarity and Matching module detects structurally similar subgraphs across graphs with millions of nodes. Five integrated matching algorithms identify duplicate entities, detect known schemes in new data, discover isomorphic structures, and compare organisational network patterns. Entity resolution for deduplication across data sources is a core capability, and it is particularly valuable because the platform ingests from 153 third-party integrations whose entity records overlap and are often inconsistent.
Key Features
- Five matching algorithms: node similarity scoring, graph isomorphism detection, graph edit distance, SimRank iterative propagation, and role-based structural similarity
- High matching accuracy: structurally similar patterns are identified with low false-positive rates
- Composite node similarity scoring combining attribute, structural, neighbourhood, and connection metrics with configurable weights
- Graph isomorphism detection for exact whole-graph and subgraph pattern matching
- Graph edit distance computation quantifying the minimum operations needed to transform one graph into another for fuzzy matching
- SimRank algorithm discovering hidden relationships through iterative similarity propagation across network neighbourhoods
- Role-based structural similarity identifying nodes playing similar functional roles regardless of graph position
- Automatic role discovery using recursive feature extraction and non-negative matrix factorisation
- GPU-accelerated isomorphism detection for large-scale pattern matching operations
- Approximate matching that returns the best-scoring candidates, with similarity scores, when no exact match exists
- Multi-pattern matching executing multiple pattern queries in a single graph traversal for efficient batch analysis
- Cross-graph role comparison enabling structural equivalence analysis across different networks
- Embedding-based similarity using graph neural network representations for deep structural comparison
- Entity resolution capabilities linking duplicate entities across different data sources through composite scoring
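To make the SimRank item in the list above more concrete, here is a minimal pure-Python sketch of the textbook SimRank iteration: two nodes are similar if their in-neighbours are similar, with a decay factor applied at each step. This is a naive illustration, not the module's implementation; the function name, the `decay` and `iterations` defaults, and the sample graph are assumptions.

```python
from itertools import product


def simrank(in_neighbours, decay=0.8, iterations=10):
    """Naive SimRank over a directed graph.

    in_neighbours maps every node to the list of nodes that point to it;
    every node that appears in a list must also appear as a key.
    Returns a dict keyed by (node_a, node_b) pairs.
    """
    nodes = list(in_neighbours)
    # Start with identity: a node is fully similar to itself only.
    sim = {(a, b): 1.0 if a == b else 0.0 for a in nodes for b in nodes}
    for _ in range(iterations):
        new_sim = {}
        for a in nodes:
            for b in nodes:
                if a == b:
                    new_sim[(a, b)] = 1.0
                    continue
                ia, ib = in_neighbours[a], in_neighbours[b]
                if not ia or not ib:
                    # Nodes without in-neighbours have no evidence of similarity.
                    new_sim[(a, b)] = 0.0
                    continue
                total = sum(sim[(i, j)] for i, j in product(ia, ib))
                new_sim[(a, b)] = decay * total / (len(ia) * len(ib))
        sim = new_sim
    return sim


# Hypothetical graph: one exchange address funds two wallets.
graph = {
    "exchange": [],
    "wallet_a": ["exchange"],
    "wallet_b": ["exchange"],
}
scores = simrank(graph)
print(scores[("wallet_a", "wallet_b")])
```

The nested loop is quadratic in the number of nodes per iteration, so a sketch like this is only suitable for small graphs; production-scale SimRank typically relies on pruning or approximation.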
Use Cases
- Duplicate Entity Detection: Cryptocurrency forensics teams identify same-entity control across multiple addresses through behavioural and structural similarity analysis
- Money Laundering Pattern Matching: Financial institutions detect known laundering schemes in new transaction data through subgraph isomorphism matching against pattern libraries
- Criminal Organisation Structure Comparison: Law enforcement compares suspected organisational structures against known criminal archetypes to identify network hierarchies and key actors
- Attack Pattern Recognition: Cybersecurity teams match known adversary techniques against network activity graphs to detect lateral movement and exploitation patterns
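The pattern-library matching mentioned in the use cases above can be pictured with a small backtracking search. The sketch below finds every injective mapping of pattern nodes onto graph nodes such that each pattern edge exists in the graph (non-induced subgraph matching). It is a deliberately naive illustration with hypothetical names and data, not the module's algorithm, and it ignores node and edge attributes.

```python
def find_subgraph_matches(pattern_edges, graph_edges):
    """Return all injective node mappings that preserve every pattern edge.

    Both arguments are lists of directed (source, target) tuples.
    """
    pattern_nodes = sorted({n for edge in pattern_edges for n in edge})
    graph_nodes = sorted({n for edge in graph_edges for n in edge})
    graph_edge_set = set(graph_edges)
    matches = []

    def consistent(mapping):
        # Every pattern edge with both endpoints mapped must exist in the graph.
        return all(
            (mapping[u], mapping[v]) in graph_edge_set
            for u, v in pattern_edges
            if u in mapping and v in mapping
        )

    def extend(mapping):
        if len(mapping) == len(pattern_nodes):
            matches.append(dict(mapping))
            return
        next_node = pattern_nodes[len(mapping)]
        used = set(mapping.values())
        for candidate in graph_nodes:
            if candidate in used:
                continue  # keep the mapping injective
            mapping[next_node] = candidate
            if consistent(mapping):
                extend(mapping)
            del mapping[next_node]

    extend({})
    return matches


# Hypothetical "fan-in then forward" pattern: two sources pay a hub, the hub pays out.
pattern = [("s1", "hub"), ("s2", "hub"), ("hub", "out")]
transactions = [("a", "m"), ("b", "m"), ("m", "x"), ("x", "y")]

for match in find_subgraph_matches(pattern, transactions):
    print(match)
```

Because the two source roles are interchangeable, the same subgraph is reported once per role assignment; a real matcher would usually deduplicate such symmetric results and prune candidates by degree or attributes before searching.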
Integration
- Connects with the Neo4j graph analysis layer for similarity computation across investigation and operational data
- Compatible with case management platforms for automated entity resolution and case enrichment
- Supports batch similarity queries for bulk analysis across large datasets
- Role-based access controls with pattern library encryption and query audit logging
- Privacy protection through minimum similarity thresholds preventing overly broad matching
- Compliance with GDPR, SOC 2, and ISO/IEC 27001 data protection requirements
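One way to picture how a minimum similarity threshold and query audit logging might work together is sketched below. Every query is logged, and a query whose threshold falls below a policy floor is rejected before any results are returned. The floor value, function name, error type, and log fields are all hypothetical and are not taken from the module's API.

```python
from datetime import datetime, timezone

MIN_SIMILARITY_FLOOR = 0.7  # hypothetical policy value, not a documented default


class ThresholdTooLowError(ValueError):
    """Raised when a query asks for matches below the policy floor."""


def run_similarity_query(candidates, threshold, user, audit_log,
                         floor=MIN_SIMILARITY_FLOOR):
    """Filter precomputed scores by threshold, logging every attempt.

    candidates maps entity id -> similarity score in [0, 1].
    audit_log is any list-like sink; a real system would use durable storage.
    """
    allowed = threshold >= floor
    audit_log.append({
        "user": user,
        "threshold": threshold,
        "allowed": allowed,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    })
    if not allowed:
        raise ThresholdTooLowError(
            f"threshold {threshold} is below the minimum of {floor}"
        )
    return {entity: score for entity, score in candidates.items()
            if score >= threshold}


# Illustrative usage with made-up scores.
log = []
scores = {"entity-1": 0.95, "entity-2": 0.72, "entity-3": 0.40}
print(run_similarity_query(scores, threshold=0.8, user="analyst", audit_log=log))
```

Logging the attempt before the check means rejected queries also leave an audit trail.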
Open Standards
- GraphQL (June 2018 specification): All similarity queries, graph traversal, and entity resolution operations are exposed as GraphQL queries, mutations, and subscriptions via the Strawberry schema layer.
- GEXF 1.3 (Graph Exchange XML Format): Graph data is exported in GEXF 1.3 XML format, enabling interoperability with external graph analysis tools such as Gephi.
- W3C PROV-DM (Provenance Data Model): Entity merge operations performed during deduplication record full provenance in accordance with the W3C Provenance Data Model, capturing agent, activity, and source entity relationships.
- STIX 2.1 / OASIS TAXII 2.1: Threat intelligence imported via TAXII 2.1 collections is converted from STIX 2.1 Domain Objects (SDOs) into internal entities, which form the pattern library against which subgraph isomorphism matching operates.
- MITRE ATT&CK: Attack pattern nodes in the graph store MITRE ATT&CK technique identifiers (e.g., T1003, OS Credential Dumping), enabling structural matching of observed network activity graphs against known adversary technique graphs.
- ISO/IEC 27001: Data protection controls for similarity queries, including minimum similarity thresholds and query audit logging, are implemented in accordance with ISO/IEC 27001 information security requirements.
- GDPR (Regulation (EU) 2016/679): Privacy protections built into the matching engine, such as minimum similarity thresholds that prevent overly broad entity matching, align with GDPR data minimisation and purpose limitation obligations.
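As an illustration of the kind of provenance a merge could record under W3C PROV-DM, the sketch below builds a plain dictionary whose layout is loosely modelled on the PROV-JSON serialisation: the merged record and its sources are entities, the merge is an activity, and the resolver is an agent. The `ex:` identifiers, the function name, and the exact document layout are assumptions for illustration, not the platform's stored format.

```python
import json


def merge_provenance(merged_id, source_ids, agent_id, activity_id):
    """Describe one entity merge as a PROV-style document (illustrative only)."""
    doc = {
        "prefix": {"ex": "https://example.org/"},  # hypothetical namespace
        "entity": {eid: {} for eid in [merged_id, *source_ids]},
        "activity": {activity_id: {}},
        "agent": {agent_id: {}},
        # The merged entity was produced by the merge activity.
        "wasGeneratedBy": {
            "_:gen1": {"prov:entity": merged_id, "prov:activity": activity_id}
        },
        # The merge activity was carried out by the resolver agent.
        "wasAssociatedWith": {
            "_:assoc1": {"prov:activity": activity_id, "prov:agent": agent_id}
        },
        "used": {},
        "wasDerivedFrom": {},
    }
    for i, src in enumerate(source_ids, start=1):
        # Each source record was an input to the merge and an ancestor of the result.
        doc["used"][f"_:use{i}"] = {
            "prov:activity": activity_id, "prov:entity": src
        }
        doc["wasDerivedFrom"][f"_:der{i}"] = {
            "prov:generatedEntity": merged_id, "prov:usedEntity": src
        }
    return doc


record = merge_provenance(
    merged_id="ex:entity-merged-1",
    source_ids=["ex:crm-record-17", "ex:feed-record-42"],
    agent_id="ex:entity-resolver",
    activity_id="ex:merge-0001",
)
print(json.dumps(record, indent=2))
```

Keeping a derivation link per source record is what lets a later reviewer trace a merged entity back to its inputs, or undo the merge.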
Last Reviewed: 2026-02-05
Last Updated: 2026-04-14