An Intelligent Content Pipeline for GRC and SecOps tools
The Intelligent Content Pipeline
The Unified Compliance Framework’s mapping process[1] has always been a four-step process:
- Catalog the Authority Document;
- Analyze the contents of the Authority Document;
- Align the contents of the Authority Document; and
- Harmonize the contents of the Authority Document to Common Controls.
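To make the flow concrete, here is a minimal sketch of those four steps as an ordered pipeline, where each stage passes its result to the next. The stage functions are placeholders for illustration only and do not reflect the UCF's actual implementation.

def catalog(doc):
    # Record the Authority Document and its bibliographic metadata.
    return {**doc, "cataloged": True}

def analyze(doc):
    # Break the document's contents into individual mandates.
    return {**doc, "analyzed": True}

def align(doc):
    # Group equivalent mandates so they can be compared.
    return {**doc, "aligned": True}

def harmonize(doc):
    # Map the aligned mandates to Common Controls.
    return {**doc, "harmonized": True}

PIPELINE = [catalog, analyze, align, harmonize]

def run_pipeline(authority_document):
    result = authority_document
    for stage in PIPELINE:
        result = stage(result)  # each stage builds on the previous one
    return result

print(run_pipeline({"title": "Example Authority Document"}))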
This mapping process has served us well since our inception.
However, new technologies such as Generative AI (GAI), together with the usage rights taking shape around them, have given rise to more granular and more sophisticated methodologies for ingesting and sharing GRC and SecOps content. We now introduce the Intelligent Content Pipeline.
An ingestion process built for AI
The UCF has used Artificial Intelligence (AI) to analyze, align, and harmonize content since early 2014[2]; AI is nothing new to our process. That said, how we utilize AI in the mapping process has evolved considerably since then, and the process can now both acquire and provide content through APIs.
Below is an overarching diagram of the four-step mapping process with the underlying AI activity sets that support each step. Below the processes and activity sets are the various elements of the Unified Compliance Framework, showing the path that the Intelligent Content Pipeline follows from cataloging through harmonization.
Each step in the process leverages multiple AI-driven or AI-assisted activity sets that transform the original content into a Common Data Format JSON-LD structure, and each step builds upon the previous step’s learning and ruleset. This allows the UCF to be broken down not only into individual elements but also into a reusable vocabulary expressed as JSON-LD and shareable across a unified framework of APIs supported by our ISVs and enjoyed by our customers.
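As a rough illustration of that output, the snippet below builds one such Common Data Format record in Python. The property names, identifiers, and the GRCSchema.org context URL are assumptions made for this sketch; they are not the UCF's published schema.

import json

# An illustrative Common Data Format record for a single harmonized element.
# The @context URL, @id values, and property names are assumed for this
# example only.
record = {
    "@context": "https://grcschema.org/context.jsonld",
    "@type": "Citation",
    "@id": "https://grcschema.org/citation/example-001",
    "citation_reference": "Example § 1.2.3",
    "guidance_text": "Example mandate text taken from an Authority Document.",
    "maps_to_common_control": {
        "@type": "CommonControl",
        "@id": "https://grcschema.org/commoncontrol/example-001",
        "name": "Example Common Control name."
    }
}

# Because the structure is plain JSON-LD, it is both human-readable and
# machine-readable, and any consumer that understands the shared vocabulary
# can interpret it.
print(json.dumps(record, indent=2))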
An output system built for APIs
The Intelligent Content Pipeline supports data model separation from syntax and promotes the use of discoverable, consistently formatted identifiers. These practices ensure that API data is not only machine-readable (while simultaneously delivering human-readable content) but also universally understandable, making data from various APIs easy to integrate and manage. Here’s a breakdown of what this means:
- Data Model Separation from Syntax: This practice refers to the design principle where the structure of the data (the data model) is kept distinct from the syntax used to format it (such as JSON, XML, etc.). In the context of JSON-LD, this means that the underlying data can be described and manipulated without being tightly coupled to the way it is represented in JSON. This separation makes it easier to adapt the data for different purposes without altering the data model itself (see the sketch following this list).
- Use of Discoverable, Consistently Formatted Identifiers: In JSON-LD, identifiers are often URLs or URIs that not only uniquely identify a resource but are also retrievable or discoverable. For instance, if an identifier is a URL where metadata about the resource can be accessed, it aids in the self-descriptiveness of the API. Consistent formatting ensures that these identifiers are used uniformly across different datasets or APIs, which simplifies integration tasks and reduces errors.
- Machine-readable and Universally Understandable: JSON-LD is designed to be machine-readable, making it straightforward for programs to parse and utilize. By adhering to standardized practices like using GRCSchema.org vocabularies, the data becomes universally understandable, not just by the specific system it was intended for but across different systems. This universality is crucial for integrating data from various APIs, as it ensures that different systems can understand and process the data in a consistent manner.
- Easily Integrable and Manageable: By standardizing how data is structured and identified, it becomes much easier to integrate data from different sources (see the merge example at the end of this section). This integration is essential for creating more comprehensive and robust systems where data from disparate sources can be combined to provide greater insights or functionality. The manageable aspect refers to the ease of maintaining and updating data when it follows a consistent, standardized format.
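To ground the first three points, here is a simplified sketch of how a JSON-LD @context keeps the data model separate from the JSON syntax: the short term names used in a record map to full, discoverable IRIs defined by the context rather than being hard-coded into the data. The vocabulary IRIs are assumptions for the example, and a real JSON-LD processor (such as the pyld library) would perform full expansion; this function only illustrates the principle.

# Assumed, simplified context: each short term maps to a discoverable IRI.
CONTEXT = {
    "Citation": "https://grcschema.org/Citation",
    "name": "https://grcschema.org/name",
    "guidance_text": "https://grcschema.org/guidance_text",
}

record = {
    "@context": CONTEXT,
    "@type": "Citation",
    "@id": "https://grcschema.org/citation/example-001",
    "name": "Example citation",
    "guidance_text": "Example mandate text.",
}

def expand_terms(doc, context):
    # Replace short keys (and the @type value) with the IRIs the context
    # defines; the underlying data model stays the same, only the surface
    # syntax changes.
    expanded = {}
    for key, value in doc.items():
        if key == "@context":
            continue  # the context describes the model; it is not payload data
        if key == "@type":
            expanded[key] = context.get(value, value)
        else:
            expanded[context.get(key, key)] = value
    return expanded

print(expand_terms(record, CONTEXT))
# Every key and the type are now absolute IRIs that any consuming system can
# dereference to discover what the field means.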
In an Intelligent Content Pipeline, these practices ensure that data flowing through the system is optimized for automated processing and integration, which is crucial for systems that rely on AI and machine learning to process and analyze large datasets dynamically.
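As a closing sketch of the "easily integrable" point above, the example below merges records from two hypothetical API responses by their shared @id. The endpoints and field names are illustrative assumptions; the point is that a consistent identifier is the only join key needed.

from collections import defaultdict

# Two hypothetical API responses describing the same Common Control.
api_a_records = [
    {"@id": "https://grcschema.org/commoncontrol/example-001",
     "name": "Example Common Control name."},
]
api_b_records = [
    {"@id": "https://grcschema.org/commoncontrol/example-001",
     "mapped_citations": ["https://grcschema.org/citation/example-001"]},
]

merged = defaultdict(dict)
for record in api_a_records + api_b_records:
    merged[record["@id"]].update(record)  # the @id is the shared join key

for identifier, data in merged.items():
    print(identifier, "->", data)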