Tutorial: from databases to JSON-LD
Databases are great. However, no one is going to let you connect directly to their database to share data. We aren’t going to spend any time on all of the ways in which data have been shared between databases (EDI, XML, direct API queries); the answer today is JavaScript Object Notation.
JavaScript Object Notation, JSON from here on out, has a small, well-defined and logical set of rules, enabling you to encode, store and retrieve structured data in a format that is easily readable by both humans and machines. It has become the dominant data exchange format on the Web, and if you aren’t already working with it, you will be when communicating your compliance framework.
The structure of JSON
JSON has only two container structures: objects and arrays. Everything else in JSON is a simple value (a string, a number, a boolean, or null) held inside one of them. More importantly, objects can contain embedded objects as well as embedded arrays, and arrays can contain embedded objects and other arrays. Way cool.
To explain this, we’ll go back to a couple of the tables from above, a simple name table (top) and a complex name/address table (bottom):
Object syntax
The properties of every JSON object are derived from three elements:
- Key – think field name or table name here.
- Value – this is the content of the field, and is left blank for an object or array.
- Type – this describes one of three things: an object, an array, or the field type for the content.
JSON pairs each key with its value to form a property, written with a colon between the two inside a JSON object. This is also called a key:value pair, where the property name comes first and the property value second: “property name”: “property value”.
This is more easily understood once you start adding content to a JSON file. Let’s start with a blank one, below:
The root of a JSON document is usually an object, written as a simple pair of curly brackets “{ }”. At this point, there are no keys, no values, and no defined types.
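As a minimal sketch of the blank starting point, the Python snippet below parses an empty root object and then adds a single key:value pair to it. The key `Name` and its value are placeholders chosen for illustration, not data from the tables.

```python
import json

# Parse the blank document: an empty root object with no keys or values.
root = json.loads("{}")
print(len(root))  # 0

# Add one property (a key:value pair). The key and value are hypothetical.
root["Name"] = "Ada Lovelace"

# Serialize back to JSON text: the property name comes first, then a colon,
# then the property value.
text = json.dumps(root)
print(text)  # {"Name": "Ada Lovelace"}
```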
To this, we are going to add the various types of content.
Writing Objects
Objects (think of a single record in a table, even if it only has one field) are surrounded by curly brackets “{ }” to denote that everything inside the brackets is a single object. Each object consists of comma-separated key:value pairs (keys followed by their values, separated by colons). Here’s the first row of the name table in JSON.
In tree format, the object looks like this:
In JSON format, it looks like this, where each field in the tree appears on its own line, followed by a comma for every field except the last. The key (field name) always precedes the value (field contents):
The type isn’t presented in basic JSON code (we’ll get to that as a part of JSON-LD in a bit).
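To make the single-record object concrete, here is a small Python sketch that writes one row as a JSON object. The field names `Acct` and `Name` follow the name table; the values are hypothetical stand-ins, since the original row is not reproduced here.

```python
import json

# One record from the name table. The values are placeholders.
record = {"Acct": 1001, "Name": "Ada Lovelace"}

# indent=2 places each key:value pair on its own line.
text = json.dumps(record, indent=2)
print(text)
# {
#   "Acct": 1001,
#   "Name": "Ada Lovelace"
# }

lines = text.splitlines()
```

Every pair except the last ends in a comma, and nothing in the output names a field type beyond what the literal syntax implies (quotes for text, bare digits for numbers).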
Arrays
Simple arrays, as a JSON type, are surrounded by square brackets “[ ]” and consist of comma-separated values. If you wanted to present the column of three numbers as we did in the spreadsheet, it would be expressed as a simple array. In tree format, the simple array looks like this:
In JSON format, it looks like this, with the account number split from the account name.
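The sketch below shows what such a flat array might look like when built in Python. All values are hypothetical; the point is only that a simple array carries values without keys.

```python
import json

# Account numbers and names flattened into one simple array (made-up values).
flat = [1001, "Ada Lovelace", 1002, "Grace Hopper", 1003, "Alan Turing"]

text = json.dumps(flat)
print(text)

# After parsing, all we have is an ordered list of values. Nothing says which
# number belongs to which name, or what either column is called.
parsed = json.loads(text)
has_keys = isinstance(parsed, dict)
```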
That won’t work well to create a table. And this is where objects as a type come into play. If we re-arrange the tree to add a set of object brackets “{ }”, we can then add the key to the property and display each of the name records with their individual fields in the array:
Now the JSON structure will separate each record and display the key/value pair for each record in the array:
This JSON clearly tells us that this table is called Names and has two fields (keys) named Acct and Name.
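A minimal Python sketch of that structure follows: an object with one key, `Names`, whose value is an array of record objects. The table and field names come from the text; the record values are hypothetical.

```python
import json

# The Names table as an array of objects (values are placeholders).
names_table = {
    "Names": [
        {"Acct": 1001, "Name": "Ada Lovelace"},
        {"Acct": 1002, "Name": "Grace Hopper"},
        {"Acct": 1003, "Name": "Alan Turing"},
    ]
}

text = json.dumps(names_table, indent=2)
print(text)

# A reader can recover the table name and its field names from the JSON alone.
parsed = json.loads(text)
table_name = next(iter(parsed))
fields = sorted({key for row in parsed[table_name] for key in row})
print(table_name, fields)  # Names ['Acct', 'Name']
```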
Complex Object Arrays
Now that we understand objects and arrays, let’s go back to that name and address table and combine them. When finished, what we want is:
- an array of names; and
- an array of addresses for each name.
So we build out a tree that looks like the one that follows:
With this, the JSON structure holds an array of names and embeds, within each name, another array holding that name’s addresses.
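The nesting can be sketched in Python as follows. The keys `Addresses`, `Street`, and `City` are assumptions made for illustration (the original address columns are not listed here), and all values are made up.

```python
import json

# An array of names, each carrying its own embedded array of addresses.
document = {
    "Names": [
        {
            "Acct": 1001,
            "Name": "Ada Lovelace",
            "Addresses": [
                {"Street": "1 Example Road", "City": "London"},
                {"Street": "2 Sample Lane", "City": "Oxford"},
            ],
        },
        {
            "Acct": 1002,
            "Name": "Grace Hopper",
            "Addresses": [
                {"Street": "3 Placeholder Ave", "City": "Arlington"},
            ],
        },
    ]
}

parsed = json.loads(json.dumps(document))

# Flatten the nested arrays back into table-like rows: one row per address,
# keyed by the account it belongs to.
rows = [
    (person["Acct"], address["City"])
    for person in parsed["Names"]
    for address in person["Addresses"]
]
print(rows)  # [(1001, 'London'), (1001, 'Oxford'), (1002, 'Arlington')]
```

Walking the outer array and then the inner one is roughly what a developer would do to load this into a parent table and a child table.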
While this is great stuff, there isn’t yet enough information to tell a developer how to automatically translate this into a database structure. Remember that basic JSON doesn’t even pass along the key/value type.
We must turn to JSON-LD for more information.
Adding structured and linked data to JSON
JSON is the format that most systems currently use to pass data back and forth. However, there was no standardized way to share JSON with the ubiquitous web browsers that everyone uses to communicate.
In 2011, Google, Bing, and Yahoo!, joined later that year by Yandex, launched a joint effort to unify a structured data vocabulary for the web: the vocabulary repository at Schema.org[1]. JavaScript Object Notation for Linked Data (JSON-LD), standardized separately through the W3C, is one of the formats used to express that vocabulary.
The initial goal for JSON-LD was to annotate elements on a web page, structuring the data so that search engines can disambiguate elements and establish facts about entities, which in turn helps create a more organized, better web overall[2].
The Context
The first element that retains a permanent place in JSON-LD markup is the @context, whose value is the URL of the schema you are going to use. Currently, two known schemas support compliance frameworks: http://schema.org and https://grcschema.org. In the tree view, the context is laid out as an array of information.
One thing to notice here in JSON-LD is the wealth of information about the data structure that is also passed to the reader! The rdfs:label gives you the object name, while the rdfs:comment gives you information about the object you are dealing with.
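To show how a context gives names like rdfs:label their meaning, here is a hand-rolled Python sketch that expands prefixed keys using the prefixes declared in @context. It is not a JSON-LD processor, only an illustration of the idea. The `rdfs` IRI is the standard RDF Schema namespace; the `ex` prefix, its URL, and the Person node are hypothetical.

```python
import json

doc = {
    "@context": {
        "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
        "ex": "https://example.org/schema/",  # hypothetical namespace
    },
    "@id": "ex:Person",
    "rdfs:label": "Person",
    "rdfs:comment": "A person, alive or dead.",
}

def expand(term, context):
    """Expand a compact 'prefix:name' term using the context's prefix map."""
    prefix, sep, name = term.partition(":")
    if sep and prefix in context:
        return context[prefix] + name
    return term  # no known prefix: leave the term unchanged

# Expand every non-keyword key so each one becomes a full, unambiguous IRI.
expanded_keys = {
    key: expand(key, doc["@context"]) for key in doc if not key.startswith("@")
}
print(json.dumps(expanded_keys, indent=2))
```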
The Type
The second member of the JSON-LD “always there” squad is the @type specification (everything after the colon is data annotation). @type specifies the item type being marked up. Every type has Thing as its top-level ancestor, as shown below:
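A small Python sketch of both ideas: a document that declares its @context and @type, and a lookup that follows parent links up to Thing. The `PARENT` map is a simplified, hand-copied excerpt of the Schema.org hierarchy (LocalBusiness also descends from Place, which is omitted), and the person's name is a placeholder.

```python
import json

doc = {
    "@context": "https://schema.org",
    "@type": "Person",
    "name": "Ada Lovelace",  # placeholder value
}

# Simplified excerpt of the Schema.org class hierarchy: child -> parent.
PARENT = {
    "Person": "Thing",
    "Organization": "Thing",
    "LocalBusiness": "Organization",
}

def top_level(type_name):
    """Follow parent links until a type with no recorded parent is reached."""
    while type_name in PARENT:
        type_name = PARENT[type_name]
    return type_name

print(json.dumps(doc, indent=2))
print(top_level(doc["@type"]))  # Thing
```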
Schema Properties
Within JSON-LD, each object’s properties are described in depth. Below we present the property for _first_name_ and are able to tell the reader that this is a text element and, in the comments, that it represents a person’s first name.
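As a sketch of what such a property description might contain, the Python snippet below models a property node in the style Schema.org uses for its vocabulary definitions. The `ex:` identifiers are hypothetical; `rdfs:label`, `rdfs:comment`, `schema:domainIncludes`, and `schema:rangeIncludes` are existing RDF Schema and Schema.org terms.

```python
import json

property_node = {
    "@id": "ex:first_name",  # hypothetical identifier
    "@type": "rdf:Property",
    "rdfs:label": "first_name",
    "rdfs:comment": "A person's first name.",
    "schema:domainIncludes": {"@id": "ex:Person"},   # which type uses it
    "schema:rangeIncludes": {"@id": "schema:Text"},  # what kind of value it holds
}

def describe(node):
    """Summarize a property node as 'label (expected type): comment'."""
    expected = node["schema:rangeIncludes"]["@id"].split(":", 1)[1]
    return f'{node["rdfs:label"]} ({expected}): {node["rdfs:comment"]}'

print(json.dumps(property_node, indent=2))
summary = describe(property_node)
print(summary)  # first_name (Text): A person's first name.
```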
Schema Arrays
In addition to standard schema properties, the JSON-LD context can also tell the reader that what is being presented is an array. In the example below, the person object allows for an array of additional e-mail addresses. This is described, in the JSON-LD context, as a set (“@set”):
Note that @set is not required, and it can make the data cumbersome to work with because it calls for special addressing techniques. Schema.org itself shows examples of arrays that don’t use it. It appears to be a way to declare a term’s container type within @context rather than something used as an object or property in its own right. The JSON-LD specification documents this here:
https://www.w3.org/TR/json-ld11/#lists-and-sets
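The following Python sketch illustrates the effect of declaring a term with an @set container: the term's value is always presented as an array, even when there is only one item. The helper `as_declared` is a hand-rolled illustration, not part of any JSON-LD library, and the person data is made up.

```python
import json

context = {
    "name": "https://schema.org/name",
    # Declares that "email" is always a set of values.
    "email": {"@id": "https://schema.org/email", "@container": "@set"},
}

def as_declared(node, context):
    """Wrap single values in a list for terms whose container is @set."""
    result = {}
    for key, value in node.items():
        definition = context.get(key)
        is_set = (
            isinstance(definition, dict)
            and definition.get("@container") == "@set"
        )
        if is_set and not isinstance(value, list):
            value = [value]
        result[key] = value
    return result

person = {"name": "Ada Lovelace", "email": "ada@example.org"}
normalized = as_declared(person, context)
print(json.dumps(normalized))
# {"name": "Ada Lovelace", "email": ["ada@example.org"]}
```

Without the container declaration, a consumer has to handle both a bare string and an array for the same key, which is one reason some publishers simply write arrays directly and skip @set.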
By labeling each type of thing in a JSON object, you give developers what they need to create structured pages that use either Microdata or RDFa, adding HTML attributes that correspond to the user-visible content you want to describe.
Endnotes
- Schema.org is only one of the JSON-LD repositories, as we’ll see later on. Organizations such as NIST have their own model maps (“Catalog JSON Model Map” n.d.), and so does GRCschema.org (which we’ll be working with).
- (“A Guide to JSON-LD for Beginners” n.d.)