IPTC News Taxonomy and Automated Classification

Category: Modules
Last Updated: Mar 2, 2026
Tags: modules, real-time

Overview

An intelligence analyst monitoring extremist activity across forty news sources in five languages cannot manually categorise every article. When an article published in Romanian about far-right financing needs to reach the same analyst workflow as a French-language article on the same topic, a shared classification standard is the only practical solution. The IPTC News Taxonomy and Automated Classification module applies the International Press Telecommunications Council (IPTC) Media Topic taxonomy to solve exactly this problem, giving thousands of daily articles a consistent, language-neutral organisation.

The system uses natural language processing to analyse content and assign IPTC codes across the taxonomy's 1,400+ subject categories, enabling cross-source correlation and topic-based intelligence monitoring regardless of the originating language or publisher.
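The step from model output to assigned codes can be sketched as a simple thresholding pass. This is a minimal illustration, not the module's actual implementation: the names `TopicAssignment` and `select_topics`, the 0.5 threshold, and the cap of five labels are all assumptions made for the example. The qcodes shown are real IPTC Media Topic identifiers (politics; crime, law and justice; sport).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TopicAssignment:
    """One IPTC Media Topic code attached to an article (hypothetical type)."""
    qcode: str         # e.g. "medtop:11000000" (politics)
    confidence: float  # classifier score in [0, 1]


def select_topics(
    scores: dict[str, float],
    threshold: float = 0.5,   # assumed cut-off, not a documented default
    max_labels: int = 5,      # assumed cap on labels per article
) -> list[TopicAssignment]:
    """Keep every topic whose score clears the threshold (multi-label),
    ordered from most to least confident."""
    kept = [TopicAssignment(q, s) for q, s in scores.items() if s >= threshold]
    kept.sort(key=lambda a: a.confidence, reverse=True)
    return kept[:max_labels]


# Example: an article about a political corruption trial.
example = select_topics({
    "medtop:11000000": 0.91,  # politics
    "medtop:02000000": 0.64,  # crime, law and justice
    "medtop:15000000": 0.08,  # sport
})
```

Because every score above the threshold is kept rather than only the top one, an article spanning two subjects receives both codes, each with its own confidence value.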

Key Features

  • Automated IPTC Classification: Natural language processing models analyse article content and assign IPTC Media Topic codes with confidence scoring. Multi-label classification handles articles spanning multiple topics.
  • Full Taxonomy Coverage: Support for the complete IPTC Media Topic taxonomy including all top-level subjects: politics, economics, crime, disasters, education, health, science, sports, and more, with granular sub-topic classification.
  • Multi-Language Classification: Classify content in 40+ languages using cross-lingual models that apply consistent IPTC codes regardless of source language. This coverage is aligned with the platform's own 40+ language support, which is provided via next-intl.
  • Storyboard Integration: Classified articles feed into investigation storyboards where analysts organise intelligence by topic, create narrative threads, and link related articles across sources and time periods.
  • Topic Monitoring: Configure automated monitoring for specific IPTC topics relevant to active operations or investigations, with real-time alerts when new content matching monitored topics is published.
  • Source Correlation: Identify when multiple independent sources publish content on the same IPTC topic within a configurable time window, surfacing emerging stories and corroborating intelligence from multiple angles.
  • Classification Accuracy Feedback: Analysts correct automated classifications, and those corrections feed into model retraining for continuous accuracy improvement on organisation-specific content.
  • Taxonomy Browser: Interactive exploration of the IPTC taxonomy hierarchy with article counts, trend visualisation, and drill-down navigation from broad subjects to specific sub-topics.
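The Source Correlation feature listed above can be pictured as a sliding-window check over classified articles. The sketch below is one plausible way to do it, not the module's documented algorithm: `ClassifiedArticle`, `correlated_topics`, the six-hour window, and the three-source minimum are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class ClassifiedArticle:
    """Minimal stand-in for a classified article record (hypothetical)."""
    source_id: str
    qcode: str
    published_at: datetime


def correlated_topics(
    articles: list[ClassifiedArticle],
    window: timedelta = timedelta(hours=6),  # assumed default window
    min_sources: int = 3,                    # assumed corroboration threshold
) -> dict[str, set[str]]:
    """Return topics covered by at least `min_sources` distinct sources
    within any `window`-long span, mapped to the sources involved."""
    by_topic: dict[str, list[ClassifiedArticle]] = defaultdict(list)
    for article in articles:
        by_topic[article.qcode].append(article)

    hits: dict[str, set[str]] = {}
    for qcode, items in by_topic.items():
        items.sort(key=lambda a: a.published_at)
        start = 0
        for end in range(len(items)):
            # Shrink the window from the left until it spans at most `window`.
            while items[end].published_at - items[start].published_at > window:
                start += 1
            sources = {a.source_id for a in items[start:end + 1]}
            if len(sources) >= min_sources:
                hits.setdefault(qcode, set()).update(sources)
    return hits
```

Counting distinct `source_id` values rather than articles means that one publisher posting repeatedly on a topic does not register as corroboration.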

Use Cases

  • Intelligence Monitoring: Automatically classify incoming news feeds and intelligence reports by IPTC topic, enabling analysts to focus on subject areas relevant to their operations without manually triaging every article.
  • Threat Awareness: Monitor IPTC categories related to crime, conflict, disaster, and security to maintain situational awareness of emerging threats in operational areas of interest.
  • Investigation Research: Search for all news content classified under specific IPTC topics related to an investigation, discovering relevant articles that keyword searches would miss due to language variation.
  • Trend Analysis: Track topic volumes over time to identify emerging trends, seasonal patterns, and anomalous spikes in coverage of operationally relevant subjects.
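For the Trend Analysis use case above, an anomalous spike can be flagged by comparing today's article count for a topic against its recent history. The following is a simple z-score sketch under stated assumptions; the function name `is_spike` and the threshold of 3.0 are illustrative, and the module may use a different method entirely.

```python
from statistics import mean, stdev


def is_spike(history: list[int], today: int, z_threshold: float = 3.0) -> bool:
    """Flag `today` as anomalous if it sits at least `z_threshold` sample
    standard deviations above the mean of `history` (daily article counts)."""
    if len(history) < 2:
        return False  # not enough data to estimate spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # Perfectly flat history: any increase is treated as a spike.
        return today > mu
    return (today - mu) / sigma >= z_threshold


# A week of daily counts for one topic, then two candidate values for today.
last_week = [10, 12, 9, 11, 10, 13, 11]
```

A plain z-score ignores seasonality, so a production system tracking seasonal patterns would likely compare against the same weekday or period rather than a flat recent average.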

Integration

This module connects to the news intelligence platform for content ingestion, the OSINT investigation tools for classified content delivery, the storyboard system for narrative organisation, and the alert management platform for topic-based monitoring alerts. Classification metadata is stored alongside article records in the PostgreSQL primary data store for search and filtering across the intelligence ecosystem.

Open Standards

  • IPTC Media Topics taxonomy: The core classification scheme implemented throughout the module; each article is assigned one or more IPTC Media Topic codes expressed as medtop: qcodes with canonical URIs resolving to cv.iptc.org/newscodes/mediatopic/, enabling language-neutral, interoperable topic identification across all ingested content.
  • RSS 2.0 / Atom (RFC 4287): News sources are ingested via RSS and Atom feeds using the feedparser library; each configured source stores an rss_feed_url and a background crawl job fetches, parses, and normalises entries from both feed formats on a scheduled basis.
  • GraphQL (June 2018 specification): All classification queries, taxonomy browsing, topic-monitoring alerts, and investigation storyboard mutations are exposed through a typed GraphQL API built with Strawberry, enabling clients to request precisely the taxonomy fields they need.
  • ISO 8601: Datetime values for article publication timestamps, feed fetch records, trend windows, and alert notification times are stored and exchanged in ISO 8601 format, with explicit handling of the trailing Z (UTC) designator during dark-web ingestion.
  • ISO 639-1 / BCP 47: Article and source records carry a language_code field conforming to ISO 639-1 two-letter codes, used by the cross-lingual classifier to apply consistent IPTC codes regardless of source language across the 40+ supported languages.
  • JSON (ECMA-404 / RFC 8259): Taxonomy payloads, alert criteria, entity metadata, classification confidence scores, and storyboard linkage data are all serialised and stored as JSON, forming the interchange format between the classifier, the PostgreSQL store, and downstream consumers.
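Several of the standards above meet in a single classification record. The sketch below shows how a medtop: qcode might be expanded to its canonical URI, how a timestamp with a trailing Z could be normalised, and how the result could be serialised as JSON. The function names and the record's field layout are assumptions for illustration (only `language_code` is named in the text above), and the `http://` scheme on the base URI is assumed since the text gives only the host and path.

```python
import json
from datetime import datetime, timezone

# Host and path come from the text above; the http:// scheme is an assumption.
MEDTOP_BASE = "http://cv.iptc.org/newscodes/mediatopic/"


def qcode_to_uri(qcode: str) -> str:
    """Expand a 'medtop:NNNNNNNN' qcode into its concept URI."""
    scheme, _, code = qcode.partition(":")
    if scheme != "medtop" or not code.isdigit():
        raise ValueError(f"not a medtop qcode: {qcode!r}")
    return MEDTOP_BASE + code


def parse_iso8601(value: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC.

    datetime.fromisoformat only accepts 'Z' from Python 3.11 onward,
    so the suffix is rewritten to an explicit offset first.
    """
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        raise ValueError("timestamp must carry a UTC offset")
    return parsed.astimezone(timezone.utc)


def classification_record(
    qcode: str, confidence: float, language_code: str, published_at: str
) -> str:
    """Serialise one classification as JSON (hypothetical field layout)."""
    return json.dumps(
        {
            "qcode": qcode,
            "uri": qcode_to_uri(qcode),
            "confidence": confidence,
            "language_code": language_code,  # ISO 639-1, e.g. "ro"
            "published_at": parse_iso8601(published_at).isoformat(),
        },
        sort_keys=True,
    )


record = classification_record("medtop:11000000", 0.91, "ro", "2026-03-02T09:30:00Z")
```

Rejecting timestamps without an offset, rather than guessing a timezone, keeps trend windows and alert times comparable across sources.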

Availability

  • Enterprise Plan: Full IPTC classification suite with all taxonomy categories included.
  • Professional Plan: Core classification with top-level categories; granular sub-topic classification available as an add-on.

Last Reviewed: 2026-03-02
Last Updated: 2026-04-14
