
Linked Data Introduction

Origins of the Semantic Web & Linked Data

Tim Berners-Lee, the inventor of the World Wide Web, laid out his vision of a Semantic Web in 1999.

In his vision, the web transitions from web pages that link to web pages, to data that links directly to other data.

The current web - web pages link to the URLs of other web pages:

Current web - pages link to pages

The semantic web - data links to data:

The semantic web - data links to data

In this semantic web, data links to other data by using the same URL for the same node in a graph, even though the graphs can live in different places. Such a web of linked data would effectively form one global graph of information in a universal data format.

Furthermore:

  • the data is readable by both humans and machines.
  • it supports automated reasoning (AI).
  • it can enable self-sovereign data ownership (users owning their own data).
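
To make the "same URL, same node" idea concrete, here is a minimal sketch in plain Python. It assumes two hypothetical data sets that live in different places, represents each one as a set of (subject, predicate, object) statements, and uses made-up example.org and example.com URLs. None of these names come from a real data set or library; they only illustrate how shared URLs let separate graphs behave as one.

```python
# Two independent graphs, each a set of (subject, predicate, object) statements.
# All URLs below are hypothetical placeholders for illustration.
ALICE = "https://example.org/people/alice"
ACME = "https://example.com/orgs/acme"

graph_a = {
    (ALICE, "https://example.org/vocab/name", "Alice"),
    (ALICE, "https://example.org/vocab/worksFor", ACME),
}

graph_b = {
    (ACME, "https://example.org/vocab/locatedIn", "Amsterdam"),
    (ALICE, "https://example.org/vocab/knows", "https://example.org/people/bob"),
}


def merge(*graphs):
    """Union the statements; nodes with the same URL become the same node."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def describe(graph, node):
    """Return every (predicate, object) pair stated about `node`, sorted."""
    return sorted((p, o) for s, p, o in graph if s == node)


combined = merge(graph_a, graph_b)
alice_facts = describe(combined, ALICE)

# Follow a link from one data set into the other: Alice -> her employer -> its location.
employer = next(o for p, o in alice_facts if p.endswith("/worksFor"))
employer_facts = describe(combined, employer)
```

Because both graphs use the same URL for Alice and for her employer, the merged set can answer a question neither graph could answer alone: where the organisation Alice works for is located.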

Progress over time

The Semantic Web didn't take off as quickly as some had hoped. Some even declared it dead.

However, over time, the technologies have matured and adoption has grown significantly, to the point where today there are thousands of interconnected open data sets built on linked data, and over 30% of live websites use some form of linked data.

Here are some of the highlights of the developments over the years:

  • In 2006, the term "linked data" was coined. Linked data got clear rules and definitions for how to link data, which helped adoption.

  • In 2010, work began on JSON-LD (it became a W3C Recommendation in 2014), which allowed developers to express linked data in the popular JSON format. This let developers focus on the core principle of the semantic web, linking data, with easy-to-use tools.

  • In 2011, schema.org was launched by Bing, Google and Yahoo (the world's largest search engines at that time). Schema.org was built to create a common set of schemas for structured data markup on web pages, so that these search engines could better understand and categorise the content of those pages. The search engines mostly use this to form rich snippets in their results. JSON-LD was soon supported, and this gave a massive boost to the amount of linked data published.
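
As a rough illustration of how JSON-LD and schema.org fit together, the sketch below builds a small JSON-LD document in Python. `@context`, `@id` and `@type` are JSON-LD keywords, and `Person`, `Organization`, `name` and `worksFor` are schema.org terms; the person, the organisation and their example.org and example.com URLs are made up for this example.

```python
import json

# A hypothetical person described with schema.org terms in JSON-LD.
# The example.org / example.com URLs are placeholders, not real resources.
person = {
    "@context": "https://schema.org",
    "@id": "https://example.org/people/alice",
    "@type": "Person",
    "name": "Alice",
    "worksFor": {
        "@id": "https://example.com/orgs/acme",
        "@type": "Organization",
        "name": "Acme",
    },
}

# Web pages commonly embed JSON-LD in a script tag of type application/ld+json.
json_ld = json.dumps(person, indent=2)
snippet = f'<script type="application/ld+json">\n{json_ld}\n</script>'
print(snippet)
```

The result is ordinary JSON that any JSON tooling can read, while the `@id` values give each thing a URL that other data can link to.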

Present day

Since then, the global set of Linked Open Data has continued to grow at an accelerating rate.

Today, linked data (or "structured data", as Google calls it) is used by over 30% of the websites on the web, mostly using schema.org to enable rich snippets in search engines.

Linked data is also used in many other industries and sciences, as well as by governments. The US and EU governments have released thousands of large open public data sets as linked data in recent years. During the COVID pandemic, linked data played a crucial role in enabling open collaboration and data insights.

This was the entire set of open linked data sets in 2007 (each node in the picture is a data set).

Overview of linked data sets in 2007

Here's roughly what it looked like by 2017: thousands and thousands of open data sets, linked to each other, collectively containing billions of facts.

Overview of linked data sets in 2017

Graph technology in general is hot

Linked data stores data as a graph. Graph database technologies in general are seeing an incredible development and adoption rate:

  • In 2019, Forbes wrote "graph databases are going mainstream".

  • In 2021, graph databases were used by more than 75% of the Fortune 100 companies.

  • According to Gartner, "By 2025, graph technologies will be used in 80% of data and analytics innovations, up from 10% in 2021, facilitating rapid decision making across the enterprise."

Excited about linked data yet?

Good!

Core concepts of Linked Data

Let's dive into what linked data really is and the core concepts you'll come across when working with linked data.

These core concepts are also described in the RDF specification, which is a W3C Recommendation (the W3C's term for a published standard).

Rather than making you read through these dry standards, these pages try to explain the same things in a much more digestible format.

Go ahead and continue with triples & quads