From 1aa96094bddb6b1aacdfd7250314d5b2baeb140f Mon Sep 17 00:00:00 2001 From: ccamel Date: Tue, 26 Sep 2023 17:33:52 +0200 Subject: [PATCH] docs(ontology): improve description of ontology --- .../ontology/general-principles.md | 249 ------------------ .../ontology/okp4-ontology.md | 106 ++++++++ .../ontology/the-power-of-ontologies.md | 37 +++ .../ontology/what-are-ontologies.md | 124 +++++++++ docusaurus.config.js | 2 +- 5 files changed, 268 insertions(+), 250 deletions(-) delete mode 100644 docs/technical-documentation/ontology/general-principles.md create mode 100644 docs/technical-documentation/ontology/okp4-ontology.md create mode 100644 docs/technical-documentation/ontology/the-power-of-ontologies.md create mode 100644 docs/technical-documentation/ontology/what-are-ontologies.md diff --git a/docs/technical-documentation/ontology/general-principles.md b/docs/technical-documentation/ontology/general-principles.md deleted file mode 100644 index ccde86901c3..00000000000 --- a/docs/technical-documentation/ontology/general-principles.md +++ /dev/null @@ -1,249 +0,0 @@ ---- -sidebar_position: 1 ---- - -# OKP4 Ontology principles - -## What is an ontology ? - -In computer science, ontology is a formal and structured representation of the concepts, relationships, and properties of a particular domain. An ontology generally comprises the following basic elements: concepts, relationships, properties, axioms, and instances. These can be graphically represented by the simplified equation shown below. - -![ontology_equation](/img/content/technical-documentation/ontology_equation.webp) - -Some definitions: - -- **Concepts**: represent the main formalized elements of the domain. -- **Relationships**: represent links between concepts. -- **Properties**: represent specific attributes or characteristics of the concepts themselves. 
-- **Axioms**: represent logical statements or rules that define relationships between concepts, properties, and instances, ensuring the consistency and coherence of the knowledge represented within the ontology.. -- **Instances**: the concrete instances of concepts representing objects in the application domain. In OKP4, instances are used to represent all the resources of the dataverse. - -Some examples of ontology: - -An ontology of sheep and goat (source : OKP4): - -```mermaid -flowchart TD - A[Carnivore] -->|is| B[Animal] - A -->|eats| B - C[Herbivore] -->|is| B[Animal] - C -->|eats| D[Plants] - E[Sheep]-->|is| C - F[Wolf]-->|eats| E - F[Wolf]-->|is| B - F[Wolf]-.->|implies| A -``` - -An ontology of water resources (source : OKP4 from SAREF extension for water) : - -```mermaid -classDiagram - class WaterAsset{ - +hasName: string - } - - class SourceAsset{ - +hasName: string - +hasSurface: string - } - - class SinkAsset{ - +hasName: string - +hasSurface: string - } - - class Glacier { - +hasName: string - +hasSurface: string - } - - class Lake { - +hasName: string - +hasSurface: string - +hasLocation: string - } - - class Lagoon { - +hasName: string - +hasSurface: string - +hasLocation: string - } - - class Ocean { - +hasName: string - +hasSurface: string - } - - class River { - +hasName: string - +hasSurface: string - } - - class Sea { - +hasName: string - +hasSurface: string - } - - SourceAsset --> WaterAsset : is - SinkAsset --> WaterAsset : is - Glacier --> SourceAsset : isType - Lake --> SourceAsset : isType - Lagoon --> SourceAsset : isType - Ocean --> SinkAsset : isType - River --> SinkAsset : isType - Sea --> SinkAsset : isType - Glacier --> Sea : isLocated -``` - -## Why ontologies ? - -For OKP4, ontology is essential as it enables the description of shared knowledge. Participants can better understand and interpret the exchanged information, even if they come from different backgrounds. 
- -This ontology allows us to achieve: - -- Standardization of terminology: standardized terminology is used for concepts and relationships in a given domain, clarifying and avoiding misunderstandings between participants. -- Structuring of data: data is structured in a coherent and organized way, making it easier to access, process, and analyze. -- Interoperability of systems and tools: a well-designed ontology enables interoperability between systems and tools, facilitating the sharing of knowledge among different stakeholders. -- Improved data research and analysis by accurately describing concepts and relationships in a particular domain. - -The knowledge representation language chosen for OKP4 is [RDF Schema](http://www.w3.org/TR/rdf-schema/) and [Web Ontology Language](http://www.w3.org/TR/owl2-overview/) on top of the framework [Resource Description Framework](http://www.w3.org/TR/rdf-concepts/). - -### A formal model for the OKP4 blockchain - -This ontology describes and defines the different forms of vocabularies used in the [OKP4](https://okp4.space) protocol in a standard and well designed format. The aim is to model a semantic network of all the _entities_ (Data Spaces, data, services, processing workflows) by semantically characterizing what they are and the relationships they maintain between them. Thus, the ontology provides a complete living understanding and knowledge of the datasets within a Data Space, their transformation (by the services), as well as the governance rules that apply (data sharing, consents, policy rules). - -### Ontology at the heart of the blockchain - -Ontology is at the heart of the [OKP4](https://github.com/okp4/okp4d) protocol as much of the information is encoded and stored as an ontology _on-chain_ in the blockchain transactions. 
This means that (almost) all the semantics of the transactions submitted to the blockchain are expressed through this ontology - for instance the creation of a dataspace, the execution of a service, the description of a dataset, etc. - -## The OKP4 ontology - -The OKP4 protocol orchestrates the various resources of the Dataverse (datasets and services) using different blockchain elements such as smart contracts, logic modules, and ontology. All these elements allow for fine management of dataset and service workflows for knowledge creation within a Data Space with personalized governance. As seen previously, the ontology must stand for the different concepts of the protocol, their relationships, and their properties. - -The following diagram depicts the introduced concepts and their relationship with the already existing concepts of the ontology. - -```mermaid -classDiagram - Dataset --> Service : providedBy - - class Metadata{ - <> - } - Metadata <|.. ServiceGeneralMetadata - Metadata <|.. DatasetGeneralMetadata - Metadata <|.. ServiceExecutionMetadata - Metadata <|.. ServiceSpecificationMetadata - - ServiceExecutionMetadata ..> ServiceExecution : describes - - ServiceGeneralMetadata ..> Service : describes - ServiceSpecificationMetadata ..> Service : describes - DatasetGeneralMetadata ..> Dataset : describes - - ServiceExecutionOrder --> Service : hasObject - ServiceExecutionOrder --> Service : has contractor - - ServiceExecution --> ServiceExecutionOrder : specifiedBy - - ServiceExecution --> Service : hasObject - ServiceExecution --> ServiceExecution : hasParent - - ServiceExecution --> Dataset : hasInput - ServiceExecution --> Dataset : produces - ServiceExecution --> Metadata : produces -``` - -### Class and properties - -The following concepts and properties are found within the OKP4 ontology: - -#### Data - - This refers to the data contained within a dataset. - -#### Dataset - -- hasIdentifier - -This is a dataset made available by a user on the protocol. 
- -#### DatasetGeneralMetadata - -- hasTag -- hasCreator -- hasDescription -- hasPublisher -- hasTitle -- hasSpatialCoverage -- hasTemporalCoverage -- hasImage - -This is the description of a given dataset in metadata form. - -#### Data Space - -A Data Space groups resources. - -#### DIDURI - -A decentralized identifier URI. A URI that identifies a subject in a decentralized system and is managed independently of any centralized registry. - -#### Metadata - -The information data about something (i.e. data about the data). This something can be a Dataset, a Service, a Dataspace, or any other entity that can be described. -Metadata is an abstract concept which is refined in Metadata Profiles used to provide a formal specification that defines the set of metadata elements, their semantics, and their syntax to be used in a particular context or application. The OKP4 protocol proposes several profiles at the core of the ontology, such as GeneralMetadata for describing services or datasets. - -#### Resource - -Services or datasets, a resource belongs to a Data Space. - -#### Service - -- hasIdentifier -A service consumes a resource and produces data. 
- -This ontology in OWL language is written as follows: - -```text -Class: example:Data -Annotations: -rdfs:label "Data" -rdfs:comment "Define a data" -Class: example:Dataset -Annotations: -rdfs:label "Dataset" -rdfs:comment "Define a data" -SubClassOf: owl:Thing -ObjectProperty: example:hasIdentifier -Characteristics: owl:FunctionalProperty -Domain: example:Dataset -Range: xsd:string -ObjectProperty: example:providedBy -Characteristics: owl:ObjectProperty -Domain: example:Dataset -Range: example:Service -ObjectProperty: example:hasRegistrar -Characteristics: owl:ObjectProperty -Domain: example:Dataset -Range: example:DIDURI - -Class: example:DatasetGeneralMetadata -Annotations: -rdfs:label "Dataset General Metadata" -rdfs:comment "Define a data" -``` - -With all these concepts, their properties, and their relationships, we can create the OKP4 ontology and explain the workings of the OKP4 protocol in a structured and formalized way. This ontology can be expressed in different formats, more or less understandable by humans or machines. It can be expressed in French or English, RDF, OWL, JSON-LD, N-Triples, Notation3 RDF/XML, Turtle, etc. - -## Documentation - -Last released version of OKP4 ontology documentation is available here: . - -## Some assumptions - -- There's no one correct way to model a domain and a trade-off must be found between the meaning given to ontology, its expressiveness, its extensibility and its usage. -- The OKP4 ontology is not frozen. It is built step by step in an iterative process (see next section), and some decisions made here may be changed later. -- It should be understood that OWL modeling is different from UML modeling (or more simply of the Oriented Object interpretation that one would be tempted to make). 
As such, the following readings are relevant: - - [A detailed comparison of UML and OWL](https://madoc.bib.uni-mannheim.de/1898/1/TR2008_004.pdf) - - [A common misconception regarding owl properties](https://henrietteharmse.com/2018/06/22/a-common-misconception-regarding-owl-properties/) -- OWL being a logical description language, some deductions can be made by an OWL reasoner. However, as far as possible, it will be best to make explicit what could be deduced by an OWL reasoner. diff --git a/docs/technical-documentation/ontology/okp4-ontology.md b/docs/technical-documentation/ontology/okp4-ontology.md new file mode 100644 index 00000000000..4fac3b14c8f --- /dev/null +++ b/docs/technical-documentation/ontology/okp4-ontology.md @@ -0,0 +1,106 @@ +--- +sidebar_position: 3 +--- + +# OKP4 Ontology + +The OKP4 protocol orchestrates the various resources of the Dataverse (datasets and services) using different blockchain elements such as smart contracts, logic modules, and ontology. Together, these elements allow for fine-grained management of dataset and service workflows for knowledge creation within a Zone with personalized governance. As seen in the previous sections, the ontology must represent the different concepts of the protocol, their relationships, and their properties. + +## The big picture + +The following diagram depicts the introduced concepts and their relationship with the already existing concepts of the ontology. + +```mermaid +classDiagram + Dataset --> Service : providedBy + + class Metadata{ + <> + } + Metadata <|.. ServiceGeneralMetadata + Metadata <|.. DatasetGeneralMetadata + Metadata <|.. ServiceExecutionMetadata + Metadata <|.. 
ServiceSpecificationMetadata + + ServiceExecutionMetadata ..> ServiceExecution : describes + + ServiceGeneralMetadata ..> Service : describes + ServiceSpecificationMetadata ..> Service : describes + DatasetGeneralMetadata ..> Dataset : describes + + ServiceExecutionOrder --> Service : hasObject + ServiceExecutionOrder --> Service : has contractor + + ServiceExecution --> ServiceExecutionOrder : specifiedBy + + ServiceExecution --> Service : hasObject + ServiceExecution --> ServiceExecution : hasParent + + ServiceExecution --> Dataset : hasInput + ServiceExecution --> Dataset : produces + ServiceExecution --> Metadata : produces +``` + +## Classes and properties + +The following concepts and properties are found within the OKP4 ontology: + +### Data + +This refers to the data contained within a dataset. + +### Dataset + +- hasIdentifier + +This is a dataset made available by a user on the protocol. + +### DatasetGeneralMetadata + +- hasTag +- hasCreator +- hasDescription +- hasPublisher +- hasTitle +- hasSpatialCoverage +- hasTemporalCoverage +- hasImage + +This is the description of a given dataset in metadata form. + +### Zone + +A Zone is a conceptual framework established by a set of rules, to which recognized `Resources` must conform, taking associated consents into account. + +Zones are described by a set of metadata providing information about various aspects of the Zone, such as the Zone's name and general information about the provider. + +Specific data description vocabularies and thesauri are used to structure this metadata. A dedicated metadata profile outlines the rules that govern the Zone, expressing its fundamental principles, intentions, scope, and ultimate objectives. These rules encompass the entities involved and the Resources that interact within the Zone. 
+ +They can be customized to address specific use cases, industry sectors, partnership networks, or geographic regions, facilitating tailored governance arrangements within a specific context. + +### DIDURI + +A decentralized identifier (DID) URI: a URI that identifies a subject in a decentralized system and is managed independently of any centralized registry. + +### Metadata + +Information about something (i.e. data about data). This something can be a Zone, a Dataset, a Service, or any other entity that can be described. + +Metadata is an abstract concept that is refined into Metadata Profiles. A profile is a formal specification that defines the set of metadata elements, their semantics, and their syntax for a particular context or application. The OKP4 protocol proposes several profiles at the core of the ontology, such as GeneralMetadata for describing services or datasets. + +### Resource + +A Resource is a Service or a Dataset; every resource belongs to the Dataverse. + +### Service + +- hasIdentifier +A service consumes a resource and produces data. + +## Conclusion + +With all these concepts, their properties, and their relationships, we can create the OKP4 ontology and explain the workings of the OKP4 protocol in a structured and formalized way. This ontology can be expressed in different formats, understandable by both humans and machines. It can be expressed in French or English, RDF, OWL, JSON-LD, N-Triples, Notation3, RDF/XML, Turtle, etc. + +## Documentation + +The latest released version of the OKP4 ontology documentation is available here: . diff --git a/docs/technical-documentation/ontology/the-power-of-ontologies.md b/docs/technical-documentation/ontology/the-power-of-ontologies.md new file mode 100644 index 00000000000..4a2b7de11f7 --- /dev/null +++ b/docs/technical-documentation/ontology/the-power-of-ontologies.md @@ -0,0 +1,37 @@ +--- +sidebar_position: 1 +--- + +# The Power Of Ontologies + +In the OKP4 protocol, ontologies are indispensable. 
They facilitate a comprehensive depiction of the dataverse, capturing even its most intricate details. Ontologies provide detailed descriptions of both datasets and services, enhancing overall comprehension. Moreover, they bridge the connection between governance and orchestration within the dataverse. + +
+ +
+ +This ontology makes it possible to achieve: + +- **Standardization of terminology**: standardized terminology is used for concepts and relationships in a given domain, clarifying meaning and avoiding misunderstandings between participants. +- **Structuring of data**: data is structured in a coherent and organized way, making it easier to access, process, and analyze. +- **Interoperability of systems and tools**: a well-designed ontology enables interoperability between systems and tools, facilitating the sharing of knowledge among different stakeholders. +- **Improved data research and analysis** by accurately describing concepts and relationships in a particular domain. + +## A formal model for the OKP4 protocol + +This ontology describes and defines the different forms of vocabularies used in the [OKP4](https://okp4.space) protocol in a standard and well-designed format. The aim is to model a semantic network of all the _entities_ (Zones, data, services, processing workflows) by semantically characterizing what they are and the relationships they maintain between them. Thus, the ontology provides a complete, living understanding of the datasets within a Zone, their transformation (by the services), as well as the governance rules that apply (data sharing, consents, policy rules). + +Several languages are used to express the OKP4 ontologies: + +- [OWL (Web Ontology Language)](https://www.w3.org/TR/owl2-overview/): a standard language of the World Wide Web Consortium (W3C) for representing ontologies. OWL is based on description logics and allows for the definition of classes, subclasses, properties, and relationships. +- [RDF (Resource Description Framework)](https://www.w3.org/TR/rdf11-concepts/): a W3C framework for representing information about resources on the Web, including ontologies. RDF describes resources in terms of their properties and relationships with other resources. 
+- RDFS (RDF Schema): an ontology representation language that defines classes and properties and relationships between them. RDFS is an extension of RDF. +- [SKOS (Simple Knowledge Organization System)](https://en.wikipedia.org/wiki/Simple_Knowledge_Organization_System): a language for representing ontology that allows the description of classification systems and thesauri. SKOS allows the definition of concepts, relationships, and properties. + +
+ +
+ +## Ontology at the heart of the blockchain + +The ontology is at the heart of the [OKP4](https://github.com/okp4/okp4d) protocol, as much of the information is encoded and stored _on-chain_ as an ontology in blockchain transactions. This means that (almost) all the semantics of the transactions submitted to the blockchain are expressed through this ontology: for instance, the creation of a Zone, the execution of a Service, the description of a Dataset, etc. diff --git a/docs/technical-documentation/ontology/what-are-ontologies.md b/docs/technical-documentation/ontology/what-are-ontologies.md new file mode 100644 index 00000000000..c281ab81352 --- /dev/null +++ b/docs/technical-documentation/ontology/what-are-ontologies.md @@ -0,0 +1,124 @@ +--- +sidebar_position: 2 +--- + +# What are ontologies? + +## Definition + +In computer science, an ontology is a formal and structured representation of the concepts, relationships, and properties of a particular domain. An ontology generally comprises the following basic elements: concepts, relationships, properties, axioms, and instances. These can be graphically represented by the simplified equation shown below. + +
+ Ontology equation +
+ +Some definitions: + +- **Concepts**: represent the main formalized elements of the domain. +- **Relationships**: represent links between concepts. +- **Properties**: represent specific attributes or characteristics of the concepts themselves. +- **Axioms**: represent logical statements or rules that define relationships between concepts, properties, and instances, ensuring the consistency and coherence of the knowledge represented within the ontology. +- **Instances**: concrete occurrences of concepts, representing objects in the application domain. In OKP4, instances are used to represent all the resources of the dataverse. + +## Examples + +### Sheep and wolf + +source: OKP4 + +```mermaid +flowchart TD + A[Carnivore] -->|is| B[Animal] + A -->|eats| B + C[Herbivore] -->|is| B[Animal] + C -->|eats| D[Plants] + E[Sheep]-->|is| C + F[Wolf]-->|eats| E + F[Wolf]-->|is| B + F[Wolf]-.->|implies| A +``` + +### Water resources + +source: OKP4 from SAREF extension for water + +```mermaid +classDiagram + class WaterAsset{ + +hasName: string + } + + class SourceAsset{ + +hasName: string + +hasSurface: string + } + + class SinkAsset{ + +hasName: string + +hasSurface: string + } + + class Glacier { + +hasName: string + +hasSurface: string + } + + class Lake { + +hasName: string + +hasSurface: string + +hasLocation: string + } + + class Lagoon { + +hasName: string + +hasSurface: string + +hasLocation: string + } + + class Ocean { + +hasName: string + +hasSurface: string + } + + class River { + +hasName: string + +hasSurface: string + } + + class Sea { + +hasName: string + +hasSurface: string + } + + SourceAsset --> WaterAsset : is + SinkAsset --> WaterAsset : is + Glacier --> SourceAsset : isType + Lake --> SourceAsset : isType + Lagoon --> SourceAsset : isType + Ocean --> SinkAsset : isType + River --> SinkAsset : isType + Sea --> SinkAsset : isType + Glacier --> Sea : isLocated +``` + +## Ontology construction process + +The construction of this ontology follows a number 
of steps, which are described below: + +- **Ontology scope definition (1) & knowledge acquisition (2)**: Identification and definition of key concepts and relationships in the domain of interest and the terms that refer to such concepts, in natural language. +- **Ontology specification (3) & conceptualization (4)**: Formalization of the elements identified in the previous step in the form of a knowledge representation, using the building blocks of ontologies: classes, attributes, relationships, subsumption. +- **Ontology implementation (5)**: Encoding of the ontology according to the OWL grammar. +- **Ontology evaluation (6)**: Association of key concepts and terms in the ontology with concepts and terms of other ontologies. + +
+ Ontology construction process +
+ +## Underlying Assumptions + +- Modeling a domain is not a one-size-fits-all process. There's a balance to be struck between the ontology's meaning, its expressivity, its potential for expansion, and its actual application. +- The development of the OKP4 ontology is dynamic. It evolves progressively through an iterative process (detailed in the previous section). Some decisions taken now may be revisited in the future. +- It's crucial to recognize the distinction between OWL modeling and [UML](https://en.wikipedia.org/wiki/Unified_Modeling_Language) modeling. The latter, often rooted in Object-Oriented interpretations, differs from OWL's approach. For deeper insights, the following resources are recommended: + - [Comparing UML and OWL in Depth](https://madoc.bib.uni-mannheim.de/1898/1/TR2008_004.pdf) + - [Dispelling a Common Myth about OWL Properties](https://henrietteharmse.com/2018/06/22/a-common-misconception-regarding-owl-properties/) +- Given that OWL operates as a logical description language, certain inferences can be drawn using an OWL reasoner. Nevertheless, wherever feasible, it's preferable to state explicitly what could otherwise be left for an OWL reasoner to deduce. \ No newline at end of file diff --git a/docusaurus.config.js b/docusaurus.config.js index eb7986347ba..9664da62812 100644 --- a/docusaurus.config.js +++ b/docusaurus.config.js @@ -87,7 +87,7 @@ async function createconfig() { activeBasePath: '/nodes' }, { - to: '/technical-documentation/ontology/general-principles', + to: '/technical-documentation/ontology/the-power-of-ontologies', position: 'left', label: 'Technical documentation', activeBasePath: '/technical-documentation'