
Profile Merge and Split

Category: Investigation
Last Updated: Feb 5, 2026
Tags: investigation, compliance

Overview

During a complex fraud investigation, an analyst discovers that two profiles in the system both belong to the same individual: one created during onboarding, the other ingested from an external data feed under a slightly different name. Merging them gives investigators a complete picture. Six months later, a review reveals the opposite problem: two distinct individuals, a father and son sharing a surname and postal address, were incorrectly consolidated into a single profile. The split operation needs to unpick that merger precisely, distributing attributes, relationships, and timeline events to the right profile without losing anything.

The Profile Merge and Split module supports both operations with confidence-scored recommendations, conflict detection, full preview before execution, and instant rollback if something goes wrong.

Open Standards

  • W3C PROV-DM (Provenance Data Model): Every merge and split operation is recorded as a first-class provenance event using the W3C PROV-DM Entity/Activity/Agent model, with wasGeneratedBy, wasDerivedFrom, and wasAttributedTo relationships persisted to both PostgreSQL and a graph store.
  • W3C PROV-JSON: Provenance records for merged or split profiles can be exported in the W3C PROV-JSON serialisation format (conforming to https://www.w3.org/TR/prov-json/) for external audit and interoperability with compliant tooling.
  • Privacy-Preserving Record Linkage (PPRL), Schnell et al. 2009 (DOI:10.1186/1472-6947-9-41): The merge recommendation engine encodes candidate profile attributes as Bloom filters using bigram decomposition and SHA-256 hashing, then computes Dice-coefficient similarity scores without retaining plaintext matching criteria across the comparison boundary.
  • FIPS 180-4 / SHA-256: SHA-256 digests are used both within PPRL Bloom-filter encoding and to construct a tamper-evident Merkle hash chain over the audit trail of merge and split operations, enabling offline integrity verification.
  • FIPS 140-2 / AES-256-GCM (NIST SP 800-38D): Sensitive PII fields stored in before-and-after profile snapshots are protected with AES-256-GCM field-level encryption, with versioned key rotation and encrypted key wrapping.
  • GraphQL: All merge and split operations, including similarity queries, merge execution, split execution, and rollback, are exposed through a strongly-typed GraphQL API using Strawberry, ensuring a consistent contract for client integrations.
  • RFC 4122 (UUID): Every profile, merge operation, split operation, and provenance record is identified by a version-4 (randomly generated) UUID, so identifiers can be issued across distributed tenants without central coordination and with a negligible probability of collision.
  • ISO 8601: All timestamps attached to merge history, split events, provenance records, and rollback snapshots are serialised in ISO 8601 format, ensuring unambiguous temporal ordering across systems and audit consumers.
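
The PPRL item above describes Bloom-filter encoding of bigrams followed by a Dice-coefficient comparison. The following is a minimal sketch of that general technique using only the Python standard library, not the module's actual implementation. The function names, the filter size of 1000 bits, the 10 hash functions, the padding character, and the use of HMAC with a shared `secret` are all assumptions made for illustration.

```python
import hashlib
import hmac


def bigrams(value: str) -> set[str]:
    """Split a normalised, padded string into its set of character bigrams."""
    padded = f"_{value.strip().lower()}_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def bloom_encode(value: str, size: int = 1000, num_hashes: int = 10,
                 secret: bytes = b"demo-key") -> set[int]:
    """Return the bit positions set in a Bloom filter of `size` bits.

    Each bigram is hashed `num_hashes` times with keyed SHA-256 (HMAC);
    the key and the parameters are illustrative assumptions.
    """
    bits: set[int] = set()
    for gram in bigrams(value):
        for k in range(num_hashes):
            message = bytes([k]) + gram.encode("utf-8")
            digest = hmac.new(secret, message, hashlib.sha256).digest()
            bits.add(int.from_bytes(digest[:8], "big") % size)
    return bits


def dice(a: set[int], b: set[int]) -> float:
    """Dice coefficient of two bit-position sets: 2|A ∩ B| / (|A| + |B|)."""
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


if __name__ == "__main__":
    onboarding = bloom_encode("Jonathan Smith")
    feed = bloom_encode("Jonathon Smith")
    unrelated = bloom_encode("Maria Garcia")
    print(f"near-duplicate: {dice(onboarding, feed):.2f}")
    print(f"unrelated:      {dice(onboarding, unrelated):.2f}")
```

Only the bit positions cross the comparison boundary in this sketch, so the two sides can be scored without exchanging the plaintext names. A production encoder would also need to settle parameter choices and key management, which are outside the scope of this example.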

Last Reviewed: 2026-02-05 Last Updated: 2026-04-14

Key Features

  • Merge Recommendation Engine: Multi-factor analysis evaluates identifier matches, name similarity, demographic overlap, behavioural patterns, relationship connections, and document similarity to recommend profile consolidation with quantified confidence scores and detailed risk assessments.
  • Conflict Detection and Resolution: Contradictory data between merge candidates is automatically identified with severity classification. Configurable resolution options, including use-latest, use-most-reliable, keep-both, and manual entry, ensure data integrity during consolidation.
  • Merge Preview: Before execution, investigators preview the complete merged profile showing field additions, modifications, removals, relationship changes, and impact on connected investigations and alerts.
  • Profile Splitting: Precise separation of incorrectly merged profiles distributes attributes, relationships, documents, and timeline events to resulting profiles based on temporal, spatial, contextual, or evidence-based criteria.
  • Split Analysis: Automated detection of split indicators, including temporal gaps, location inconsistencies, and identity mismatches, flags profiles that may combine distinct individuals who were merged in error.
  • Complete History Preservation: Full lineage of all merge and split operations is maintained with before-and-after profile snapshots, field-level change tracking, and compressed, encrypted storage for regulatory compliance.
  • Instant Rollback: Any merge or split operation can be reversed immediately, with checksum verification and data integrity validation, providing a safety net for error correction.
  • Batch Merge Operations: Large-scale data cleansing projects process thousands of merge operations per hour with configurable auto-merge thresholds, progress tracking, and comprehensive result statistics.
  • Impact Assessment: Before executing merges or splits, the system analyses effects on connected investigations, active alerts, existing relationships, and downstream notifications to inform decision-making.
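
The resolution options named under Conflict Detection and Resolution (use-latest, use-most-reliable, keep-both, and manual entry) can be pictured with a small sketch. This is an illustration only: the `FieldValue` type, its `reliability` score, and the `resolve` function are hypothetical names, not part of the module's API.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any


@dataclass(frozen=True)
class FieldValue:
    """One candidate value for a profile field (hypothetical structure)."""
    value: Any
    source: str
    observed_at: datetime
    reliability: float  # assumed 0.0-1.0 score for the contributing source


def resolve(a: FieldValue, b: FieldValue, strategy: str,
            manual_value: Any = None) -> Any:
    """Pick the merged value for one field under the given strategy."""
    if a.value == b.value:
        return a.value  # no conflict to resolve
    if strategy == "use-latest":
        return max(a, b, key=lambda v: v.observed_at).value
    if strategy == "use-most-reliable":
        return max(a, b, key=lambda v: v.reliability).value
    if strategy == "keep-both":
        return [a.value, b.value]
    if strategy == "manual":
        if manual_value is None:
            raise ValueError("manual strategy requires manual_value")
        return manual_value
    raise ValueError(f"unknown strategy: {strategy}")


if __name__ == "__main__":
    onboarding = FieldValue("12 High St", "onboarding", datetime(2024, 1, 1), 0.9)
    feed = FieldValue("7 Mill Lane", "external_feed", datetime(2025, 6, 1), 0.6)
    for name in ("use-latest", "use-most-reliable", "keep-both"):
        print(name, "->", resolve(onboarding, feed, name))
```

Keeping the strategy as an explicit argument per field means a preview can show exactly which rule produced each merged value before anything is executed.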

Use Cases

  • Duplicate Profile Consolidation: Data stewards review merge recommendations to consolidate duplicate entity profiles, creating clean golden records with complete attribute coverage from all contributing sources.
  • Data Quality Remediation: Batch merge operations systematically reduce duplicate rates across entity populations, improving search accuracy, screening effectiveness, and investigation efficiency.
  • Incorrect Merge Correction: When entities are incorrectly merged, split operations separate the profiles, assigning each attribute on the basis of temporal, geographic, and contextual evidence while preserving all data.
  • Investigation Entity Management: Investigators merge or link entity profiles discovered during case work, consolidating intelligence about subjects while maintaining audit trails of all identity resolution decisions.
  • Compliance Audit Support: Complete merge and split history with before-and-after snapshots, decision rationale, and rollback records provides documentation for regulatory examinations and internal quality reviews.
  • Onboarding Deduplication: New entity profiles created during customer onboarding are checked against existing records, with high-confidence duplicates auto-merged and ambiguous matches routed for manual review.
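
The onboarding case above, where high-confidence duplicates are auto-merged and ambiguous matches go to manual review, amounts to routing on a confidence score. A minimal sketch follows. The function name, the outcome labels, and both threshold values (0.95 and 0.70) are assumptions chosen for illustration; the source only states that auto-merge thresholds are configurable.

```python
def route_match(confidence: float,
                auto_merge_threshold: float = 0.95,
                review_threshold: float = 0.70) -> str:
    """Decide what to do with a candidate duplicate based on its score.

    Thresholds are hypothetical defaults; real values would be tuned per
    deployment.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    if confidence >= auto_merge_threshold:
        return "auto_merge"
    if confidence >= review_threshold:
        return "manual_review"
    return "no_match"


if __name__ == "__main__":
    for score in (0.97, 0.80, 0.20):
        print(score, "->", route_match(score))
```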

Integration#

The Profile Merge and Split module integrates with the platform's profile management, entity resolution, investigation management, and alert management systems. Merge and split operations automatically update all connected systems, including investigation case records, active alerts, relationship graphs, and risk assessments. Complete operation history feeds into audit trail systems, and rollback restores the prior state across all downstream integrations.
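
The audit trail that operation history feeds into is described earlier as a tamper-evident SHA-256 hash chain that supports offline verification. The sketch below shows a simple linear hash chain under stated assumptions; it is not the module's storage format. The record fields, the all-zero genesis value, and the function names are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for the first entry


def entry_hash(prev_hash: str, record: dict) -> str:
    """SHA-256 over the previous hash plus a canonical JSON form of the record."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(chain: list[dict], record: dict) -> None:
    """Append a record, linking it to the hash of the previous entry."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"record": record, "prev_hash": prev,
                  "hash": entry_hash(prev, record)})


def verify(chain: list[dict]) -> bool:
    """Recompute every link; any edited or reordered entry breaks the chain."""
    prev = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev:
            return False
        if entry["hash"] != entry_hash(prev, entry["record"]):
            return False
        prev = entry["hash"]
    return True


if __name__ == "__main__":
    trail: list[dict] = []
    append(trail, {"op": "merge", "at": "2026-02-05T10:00:00Z"})
    append(trail, {"op": "split", "at": "2026-08-05T09:30:00Z"})
    print(verify(trail))           # chain is intact
    trail[0]["record"]["op"] = "rollback"
    print(verify(trail))           # tampering is detected
```

Because each entry commits to the hash of its predecessor, an auditor holding only the exported entries can check integrity without access to the live system.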
