[Developers]

PostgreSQL Graph Layer

Intelligence platforms live or die on their graph layer. Network analysis, provenance tracking, and entity relationship traversal all depend on the ability to ask "what is connected to what, and how deep does it go?" The

Category: InvestigationLast Updated: Apr 14, 2026
investigation

Overview#

Intelligence platforms live or die on their graph layer. Network analysis, provenance tracking, and entity relationship traversal all depend on the ability to ask "what is connected to what, and how deep does it go?" The traditional answer has been a dedicated graph database. The cost is a second system to operate, synchronise, monitor, and reason about in production.

The Argus PostgreSQL Graph Layer replaces Neo4j APOC path traversal with native PostgreSQL recursive CTEs executing directly against the graph_relationships table. Every traversal, every subgraph extraction, and every shortest-path query runs in the same transactional database that is already the source of truth for all relationship data. There is no synchronisation lag, no replica divergence, and no second infrastructure component to fail at 03:00. Vector similarity search for entity embeddings uses pgvector with an HNSW index for high-recall approximate nearest-neighbour queries at millisecond latency.

Last Reviewed: 2026-04-14 Last Updated: 2026-04-14

Key Features#

  • Recursive CTE Graph Traversal: All subgraph, neighbour, and shortest-path queries use SQL:1999 WITH RECURSIVE CTEs executing natively in PostgreSQL. No external graph engine, no APOC procedures, no additional runtime dependency. Traversals are depth-limited and cycle-safe via path array membership checks to prevent infinite loops on cyclic graphs.

  • Organisation-Scoped Isolation: Every traversal query includes organization_id in the JOIN condition against graph_relationships. Cross-tenant graph data is architecturally inaccessible regardless of node connectivity. This is a hard requirement under EDF/PESCO multi-tenancy rules and is enforced at the query level, not the application level.

  • Soft-Delete Aware: All traversal functions filter on is_deleted = FALSE, ensuring deleted relationships are excluded from subgraph results, neighbour lookups, and path searches without requiring physical row deletion. Audit trails and provenance chains remain intact.

  • pgvector HNSW Index for Entity Similarity: The entity_embeddings table carries a native vector(768) column indexed with an HNSW (Hierarchical Navigable Small World) graph index using cosine similarity. HNSW provides better recall-query-time tradeoff than IVFFlat for the entity corpus sizes encountered in operational deployments. The index coexists with the existing REAL[] column to support a gradual migration path.

  • PROV-DM Graph Relationships in PostgreSQL: W3C PROV-DM provenance relationships (wasGeneratedBy, wasAttributedTo, wasAssociatedWith, wasDerivedFrom) are written to graph_relationships using RELATED_TO with prov_type stored in the properties JSONB column. This avoids schema CHECK constraint changes while preserving full PROV-DM semantics and making provenance chains traversable by the same recursive CTE engine.

  • Neo4j Remains Optional: The GraphService constructor accepts a Neo4j client as an optional parameter. When Neo4j is not configured (self.db is None), all traversal operations route to pg_graph_traversal functions. When Neo4j is present, it is used as a replica for read queries with PostgreSQL as the mandatory write destination. This enables a zero-downtime migration path: remove Neo4j configuration to activate the PostgreSQL-only mode.

  • Reusable SQL Functions: pg_subgraph() and pg_shortest_path() are installed as server-side SQL functions, callable directly from any SQL client, migration script, or ad-hoc query for debugging and operational investigation without requiring application layer involvement.

Use Cases#

  • Investigation Graph Traversal: Operators open a node and expand to adjacent entities, following relationship chains up to a configured depth to build situational awareness. Recursive CTE traversal returns results at consistent sub-second latency for graphs of the operational scale encountered in EDF/PESCO deployments.
  • Shortest Path Between Entities: Analysts ask whether two entities are connected and by what route. The get_path_between function executes BFS-style recursive CTE path search and returns the shortest undirected path as a node ID list.
  • Provenance Chain Retrieval: W3C PROV-DM provenance for any entity is traversable via the same graph engine that handles investigative relationships, enabling operators to trace the origin and transformation history of any data entity from the investigation view.
  • Entity Similarity Search: Entity embeddings stored as pgvector columns support ANN cosine similarity queries for entity deduplication, link suggestion, and semantic search across the entity corpus.
  • Operational Graph Export: The pg_subgraph() SQL function allows analysts and data engineers to extract subgraphs directly via SQL for offline analysis, reporting, and export without going through the application API.

Integration#

  • graph_relationships Table: Single source of truth for all entity relationships. 60+ relationship types, 9 indexes, organisation-scoped, soft-delete support.
  • GraphService: Primary service class consuming pg_graph_traversal functions. All public method signatures are preserved; existing callers are unaffected.
  • ProvDmService: W3C PROV-DM provenance writes routed exclusively to graph_relationships via pg_graph_traversal. Neo4j dependency removed.
  • entity_embeddings Table: pgvector column and HNSW index coexist with the existing REAL[] column and cosine_similarity() SQL function for a non-breaking migration path.
  • Graph Algorithm Library: Community detection, centrality analysis, and motif detection modules consume GraphService public methods and benefit from the PostgreSQL path without code changes.

Open Standards#

StandardDescription
SQL:1999 Recursive CTEs (WITH RECURSIVE)ISO/IEC 9075-2:1999 standard recursive query syntax. Supported natively by PostgreSQL 8.4+. No extension required. https://www.iso.org/standard/27641.html
pgvector (Apache 2.0)Open-source PostgreSQL extension for vector similarity search. Implements ivfflat and HNSW indexes. https://github.com/pgvector/pgvector
HNSW AlgorithmMalkov, Y. & Yashunin, D. (2018). Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs. IEEE Transactions on Pattern Analysis and Machine Intelligence. arXiv:1603.09320.
W3C PROV-DMW3C Provenance Data Model, W3C Recommendation 30 April 2013. https://www.w3.org/TR/prov-dm/
W3C PROV-JSONW3C PROV-JSON Serialization, W3C Note 23 April 2013. https://www.w3.org/TR/prov-json/

All graph traversal and vector search operations use open standards and open-source components with no proprietary graph engine dependency.

Ready to Build?

Get started with our APIs or contact our integration team for support.