diff --git a/README.md b/README.md index 803feb57c..10514d880 100644 --- a/README.md +++ b/README.md @@ -5,11 +5,14 @@ Data Privacy Vocabularies and Controls Community Group (DPVCG) repository The mission of the W3C Data Privacy Vocabularies and Controls CG (DPVCG) is to develop a taxonomy of privacy and data protection related terms, which include in particular terms from the new European General Data Protection Regulation (GDPR), such as a taxonomy of personal data as well as a classification of purposes (i.e., purposes for data collection), and events of disclosures, consent, and processing such personal data. +> Newcomers to the DPV are recommended to start with the [Primer](https://w3id.org/dpv/primer) to familiarise themselves with the concepts, semantics, and usefulness of the DPV. + License: All work produced by DPVCG and provided through this repo or elsewhere is provided by contributors under the [W3C Document License](https://www.w3.org/Consortium/Legal/2015/doc-license). A copy of the license is provided in the [LICENSE.md](./LICENSE.md) file. Outputs: - * Data Privacy Vocabulary (DPV) - [https://w3.org/ns/dpv](https://w3.org/ns/dpv) - * GDPR terms for Data Privacy Vocabulary (DPV-GDPR) [https://w3.org/ns/dpv-gdpr](https://w3.org/ns/dpv-gdpr) + * Primer: Introduction to the Data Privacy Vocabulary - [https://w3id.org/dpv/primer](https://w3id.org/dpv/primer) + * Data Privacy Vocabulary (DPV) - [https://w3id.org/dpv](https://w3id.org/dpv) + * GDPR terms for Data Privacy Vocabulary (DPV-GDPR) [https://w3id.org/dpv/dpv-gdpr](https://w3id.org/dpv/dpv-gdpr) Publication: * Pandit H.J. et al. (2019) Creating a Vocabulary for Data Privacy. In: Panetto H., Debruyne C., Hepp M., Lewis D., Ardagna C., Meersman R. (eds) On the Move to Meaningful Internet Systems: OTM 2019 Conferences. OTM 2019. Lecture Notes in Computer Science, vol 11877. Springer, Cham. 
https://doi.org/10.1007/978-3-030-33246-4_44 @@ -29,15 +32,23 @@ The vocabulary provides terms to describe: * rights as applicable * risks as applicable -The namespace for DPV terms is `http://www.w3.org/ns/dpv#` with suggested prefix `dpv`. The IRI for DPV is currently redirected to serve the files hosted in this repository from GitHub pages i.e. `https://w3c.github.io/dpv/dpv/` (thanks to @bert-github for setting this up). Content-negotiation should therefore be supported for all files/serialisations of the DPV and its modules. +The namespace for DPV terms is `https://w3id.org/dpv#` with suggested prefix `dpv`. The IRI for DPV is currently redirected to serve the files hosted in this repository from GitHub pages i.e. `https://w3c.github.io/dpv/dpv/` (thanks to @bert-github for setting this up). Content-negotiation should therefore be supported for all files/serialisations of the DPV and its modules. + +## DPV Family of Documents + +* [DPV-Primer](https://w3id.org/dpv/primer): The Primer serves as an introductory document to DPV and provides an overview of its concepts. +* [DPV](https://w3id.org/dpv/): The DPV Specification is the formal and normative description of DPV and its concepts. It provides a serialisation of the concepts as a taxonomy using SKOS. -### DPV-GDPR +**Extensions to Concepts** -The [**DPV-GDPR**](https://w3.org/ns/dpv-gdpr) vocabulary expands on the DPV vocabulary to provide the specific legal basis, rights, and concepts defined within GDPR. It expands or specialises the concepts in DPV for use with GDPR. +* [DPV-GDPR](https://w3id.org/dpv/dpv-gdpr): This extension expands the DPV vocabulary to provide the specific legal bases, rights, and concepts defined within GDPR. It expands or specialises the concepts in DPV for use with GDPR. +* [DPV-PD](https://w3id.org/dpv/dpv-pd): Extension to the DPV providing a taxonomy of personal data categories. 
+* [DPV-NACE](https://w3id.org/dpv-nace): A serialisation in RDFS of the [NACE](https://ec.europa.eu/eurostat/ramon/nomenclatures/index.cfm?TargetUrl=LST_NOM_DTL&StrNom=NACE_REV2) industry standard classification system used in the EU. -### DPV-NACE +**Serialisations of DPV** -The [**DPV-NACE**](https://github.com/w3c/dpv/tree/master/dpv-nace) vocabulary provides a RDFS and DPV compatible serialisation of the [NACE](https://ec.europa.eu/eurostat/ramon/nomenclatures/index.cfm?TargetUrl=LST_NOM_DTL&StrNom=NACE_REV2) industry standard classification system used in the EU. +* [DPV-SKOS](https://w3id.org/dpv/dpv-skos): A serialisation of the DPV using RDFS and SKOS to enable its use as a lightweight ontology for modelling or annotating information. This serialisation can be used in cases where the DPV is to be used as a 'data model' or 'schema' without formal logical assertions. It is suitable in cases where simple(r) inferences are required, or where the strict interpretation or restrictions of OWL are not needed, or the rules/constraints are expressed in another language (e.g. SWRL or SHACL). +* [DPV-OWL](https://w3id.org/dpv/dpv-owl): A serialisation of the DPV specification using the OWL language. It should be used where the additional semantic relationships offered by OWL (based on description logic) are needed for modelling knowledge and describing desired inferences. OWL offers more powerful (and complex) features compared to RDFS for expressing information and using it to produce desired inferences in a coherent manner. 
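The namespace IRIs behind this family of documents are declared in the generator scripts further down; as a quick orientation, prefixed names can be resolved against them with a small helper. `expand()` below is an illustrative sketch, not a function from the repository:

```python
# Prefix -> namespace IRI, copied from the NAMESPACES declared in the
# documentation-generator scripts in this changeset
NAMESPACES = {
    'dpv': 'https://w3id.org/dpv#',
    'dpv-gdpr': 'https://w3id.org/dpv/gdpr#',
    'dpv-pd': 'https://w3id.org/dpv/pd#',
    'dpvs': 'https://w3id.org/dpv/dpv-skos#',
    'dpvo': 'https://w3id.org/dpv/dpv-owl#',
}

def expand(curie: str) -> str:
    """Expand a prefixed name like 'dpv:Purpose' into a full IRI."""
    prefix, _, term = curie.partition(':')
    return NAMESPACES[prefix] + term

print(expand('dpv:Purpose'))   # https://w3id.org/dpv#Purpose
print(expand('dpvo:Purpose'))  # https://w3id.org/dpv/dpv-owl#Purpose
```

The same concept name thus resolves to a different IRI in each serialisation, which is how DPV, DPV-SKOS, and DPV-OWL coexist without clashing.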
### DPV and Modules diff --git a/documentation-generator/001_download_vocab_in_csv.py b/documentation-generator/001_download_vocab_in_csv.py index 085c7295e..57af68fe3 100755 --- a/documentation-generator/001_download_vocab_in_csv.py +++ b/documentation-generator/001_download_vocab_in_csv.py @@ -14,19 +14,35 @@ # The sheet names are assumed to be valid IRIs # If they are not, escape them for IRI/HTML representation DPV_SHEETS = ( + # Namespaces + 'Namespaces', + 'Namespaces_Other', # DPV 'BaseOntology', 'BaseOntology_properties', - 'PersonalDataCategory', + 'PersonalData', + 'PersonalData_properties', + 'dpv-pd', 'Purpose', 'Purpose_properties', + 'Context', + 'Context_properties', 'Processing', 'Processing_properties', + 'ProcessingContext', + 'ProcessingContext_properties', 'TechnicalOrganisationalMeasure', 'TechnicalOrganisationalMeasure_properties', 'Entities', 'Entities_properties', + 'Entities_LegalRole', + 'Entities_LegalRole_properties', + 'Entities_Organisation', + 'Entities_DataSubjects', + 'Jurisdictions', + 'Jurisdictions_properties', 'LegalBasis', + 'LegalBasis_properties', 'Consent_properties', # DPV-GDPR 'GDPR_LegalBasis', diff --git a/documentation-generator/002_parse_csv_to_rdf.py b/documentation-generator/002_parse_csv_to_rdf.py index c658243af..807b80715 100755 --- a/documentation-generator/002_parse_csv_to_rdf.py +++ b/documentation-generator/002_parse_csv_to_rdf.py @@ -20,9 +20,10 @@ IMPORT_CSV_PATH = './vocab_csv' EXPORT_DPV_PATH = '../dpv' -EXPORT_DPV_MODULE_PATH = '../dpv/rdf' +EXPORT_DPV_MODULE_PATH = '../dpv/modules' EXPORT_DPV_GDPR_PATH = '../dpv-gdpr' -EXPORT_DPV_GDPR_MODULE_PATH = '../dpv-gdpr/rdf' +EXPORT_DPV_GDPR_MODULE_PATH = '../dpv-gdpr/modules' +EXPORT_DPV_PD_PATH = '../dpv-pd' # serializations in the form of extension: rdflib name RDF_SERIALIZATIONS = { @@ -51,9 +52,11 @@ DEBUG = logging.debug INFO = logging.info +# Namespaces are in two files: +# 1. Namespaces.csv for DPV issued namespaces +# 2. 
Namespaces_other for External namespaces + DCT = Namespace('http://purl.org/dc/terms/') -DPV = Namespace('http://www.w3.org/ns/dpv#') -DPV_GDPR = Namespace('http://www.w3.org/ns/dpv-gdpr#') FOAF = Namespace('http://xmlns.com/foaf/0.1/') ODRL = Namespace('http://www.w3.org/ns/odrl/2/') PROV = Namespace('http://www.w3.org/ns/prov#') @@ -68,14 +71,23 @@ SW = Namespace('http://www.w3.org/2003/06/sw-vocab-status/ns#') TIME = Namespace('http://www.w3.org/2006/time#') +DPV = Namespace('https://w3id.org/dpv#') +DPV_NACE = Namespace('https://w3id.org/dpv/nace#') +DPV_GDPR = Namespace('https://w3id.org/dpv/gdpr#') +DPV_PD = Namespace('https://w3id.org/dpv/pd#') +DPVS = Namespace('https://w3id.org/dpv/dpv-skos#') +DPVS_GDPR = Namespace('https://w3id.org/dpv/dpv-skos/gdpr#') +DPVS_PD = Namespace('https://w3id.org/dpv/dpv-skos/pd#') +DPVO = Namespace('https://w3id.org/dpv/dpv-owl#') +DPVO_GDPR = Namespace('https://w3id.org/dpv/dpv-owl/gdpr#') +DPVO_PD = Namespace('https://w3id.org/dpv/dpv-owl/pd#') + # The dpv namespace is the default base for all terms # Later, this is changed to write terms under DPV-GDPR namespace BASE = DPV NAMESPACES = { 'dct': DCT, - 'dpv': DPV, - 'dpv-gdpr': DPV_GDPR, 'foaf': FOAF, 'odrl': ODRL, 'owl': OWL, @@ -93,32 +105,42 @@ 'sw': SW, 'time': TIME, 'xsd': XSD, + # DPV + 'dpv': DPV, + 'dpv-nace': DPV_NACE, + 'dpv-gdpr': DPV_GDPR, + 'dpv-pd': DPV_PD, + 'dpvs': DPVS, + 'dpvs-gdpr': DPVS_GDPR, + 'dpvs-pd': DPVS_PD, + 'dpvo': DPVO, + 'dpvo-gdpr': DPVO_GDPR, + 'dpvo-pd': DPVO_PD, } # the field labels are based on what they should be translated to DPV_Class = namedtuple('DPV_Class', [ - 'term', 'rdfs_label', 'dct_description', 'rdfs_subclassof', - 'rdfs_seealso', 'relation', 'rdfs_comment', 'rdfs_isdefinedby', - 'dct_created', 'dct_modified', 'sw_termstatus', 'dct_creator', - 'resolution']) - + 'term', 'skos_prefLabel', 'skos_definition', 'dpv_isSubTypeOf', + 'skos_related', 'relation', 'skos_note', 'skos_scopeNote', + 'dct_created', 'dct_modified', 
'sw_termstatus', 'dct_creator', + 'resolution']) DPV_Property = namedtuple('DPV_Property', [ - 'term', 'rdfs_label', 'dct_description', - 'rdfs_domain', 'rdfs_range', 'rdfs_subpropertyof', - 'rdfs_seealso', 'relation', 'rdfs_comment', 'rdfs_isdefinedby', - 'dct_created', 'dct_modified', 'sw_termstatus', 'dct_creator', - 'resolution']) + 'term', 'skos_prefLabel', 'skos_definition', + 'rdfs_domain', 'rdfs_range', 'rdfs_subpropertyof', + 'skos_related', 'relation', 'skos_note', 'skos_scopeNote', + 'dct_created', 'dct_modified', 'sw_termstatus', 'dct_creator', + 'resolution']) LINKS = {} -def extract_terms_from_csv(filepath, Class): +def extract_terms_from_csv(filepath, Mapping): '''extracts data from file.csv and creates instances of Class - returns list of Class instances''' + returns list of Mapping-defined instances''' # this is a hack to get parseable number of fields from CSV # it relies on the internal data structure of a namedtuple - attributes = Class.__dict__ + attributes = Mapping.__dict__ attributes = len(attributes['_fields']) with open(filepath) as fd: csvreader = csv.reader(fd) @@ -131,7 +153,7 @@ def extract_terms_from_csv(filepath, Class): # extract required amount of terms, ignore any field after that row = [term.strip() for term in row[:attributes]] # create instance of required class - terms.append(Class(*row)) + terms.append(Mapping(*row)) return terms @@ -143,19 +165,17 @@ def add_common_triples_for_all_terms(term, graph): graph: rdflib graph returns: None''' + graph.add((BASE[f'{term.term}'], RDF.type, SKOS.Concept)) # rdfs:label - graph.add((BASE[f'{term.term}'], RDFS.label, Literal(term.rdfs_label, lang='en'))) + graph.add((BASE[f'{term.term}'], SKOS.prefLabel, Literal(term.skos_prefLabel, lang='en'))) # dct:description - graph.add((BASE[f'{term.term}'], DCT.description, Literal(term.dct_description, lang='en'))) + graph.add((BASE[f'{term.term}'], SKOS.definition, Literal(term.skos_definition, lang='en'))) # rdfs:seeAlso - # TODO: use relation 
field for relevant terms - # currently this considers all terms that are related to use rdfs:seeAlso - # the next column contains the relation, parse and use that - if term.rdfs_seealso: - links = [l.strip() for l in term.rdfs_seealso.split(',')] + if term.skos_related: + links = [l.strip() for l in term.skos_related.split(',')] for link in links: if link.startswith('http'): - graph.add((BASE[f'{term.term}'], RDFS.seeAlso, URIRef(link))) + graph.add((BASE[f'{term.term}'], SKOS.related, URIRef(link))) elif ':' in link: # assuming something like rdfs:Resource prefix, label = link.split(':') @@ -163,15 +183,15 @@ def add_common_triples_for_all_terms(term, graph): # will throw an error if namespace is not registered # dpv internal terms are expected to have the prefix i.e. dpv:term link = NAMESPACES[prefix][f'{label}'] - graph.add((BASE[f'{term.term}'], RDFS.seeAlso, link)) + graph.add((BASE[f'{term.term}'], SKOS.related, link)) else: - graph.add((BASE[f'{term.term}'], RDFS.seeAlso, Literal(link, datatype=XSD.string))) + graph.add((BASE[f'{term.term}'], SKOS.related, Literal(link, datatype=XSD.string))) # rdfs:comment - if term.rdfs_comment: - graph.add((BASE[f'{term.term}'], RDFS.comment, Literal(term.rdfs_comment, lang='en'))) + if term.skos_note: + graph.add((BASE[f'{term.term}'], SKOS.note, Literal(term.skos_note, lang='en'))) # rdfs:isDefinedBy - if term.rdfs_isdefinedby: - links = [l.strip() for l in term.rdfs_isdefinedby.replace('(','').replace(')','').split(',')] + if term.skos_scopeNote: + links = [l.strip() for l in term.skos_scopeNote.replace('(','').replace(')','').split(',')] link_iterator = iter(links) for label in link_iterator: link = next(link_iterator) @@ -180,9 +200,9 @@ def add_common_triples_for_all_terms(term, graph): LINKS[link] = label # add link to graph if link.startswith('http'): - graph.add((BASE[f'{term.term}'], RDFS.isDefinedBy, URIRef(link))) + graph.add((BASE[f'{term.term}'], DCT.source, URIRef(link))) else: - 
graph.add((BASE[f'{term.term}'], RDFS.isDefinedBy, Literal(link, datatype=XSD.string))) + graph.add((BASE[f'{term.term}'], DCT.source, Literal(link, datatype=XSD.string))) # dct:created graph.add((BASE[f'{term.term}'], DCT.created, Literal(term.dct_created, datatype=XSD.date))) # dct:modified @@ -195,6 +215,8 @@ def add_common_triples_for_all_terms(term, graph): authors = [a.strip() for a in term.dct_creator.split(',')] for author in authors: graph.add((BASE[f'{term.term}'], DCT.creator, Literal(author, datatype=XSD.string))) + # is defined by this vocabulary + graph.add((BASE[f'{term.term}'], RDFS.isDefinedBy, BASE[''])) # resolution # do nothing @@ -212,23 +234,27 @@ def add_triples_for_classes(classes, graph): if cls.sw_termstatus not in VOCAB_TERM_ACCEPT: continue # rdf:type - graph.add((BASE[f'{cls.term}'], RDF.type, RDFS.Class)) + DEBUG(cls.term) + graph.add((BASE[f'{cls.term}'], RDF.type, DPV.Concept)) # rdfs:subClassOf - if cls.rdfs_subclassof: - parents = [p.strip() for p in cls.rdfs_subclassof.split(',')] + if cls.dpv_isSubTypeOf: + parents = [p.strip() for p in cls.dpv_isSubTypeOf.split(',')] for parent in parents: if parent.startswith('http'): - graph.add((BASE[f'{cls.term}'], RDFS.subClassOf, URIRef(parent))) + graph.add((BASE[f'{cls.term}'], DPV.isSubTypeOf, URIRef(parent))) elif ':' in parent: + if parent == "dpv:Concept": + continue # assuming something like rdfs:Resource prefix, term = parent.split(':') + prefix = prefix.replace("sc__", "") # gets the namespace from registered ones and create URI # will throw an error if namespace is not registered # dpv internal terms are expected to have the prefix i.e. 
dpv:term parent = NAMESPACES[prefix][f'{term}'] - graph.add((BASE[f'{cls.term}'], RDFS.subClassOf, parent)) + graph.add((BASE[f'{cls.term}'], DPV.isSubTypeOf, parent)) else: - graph.add((BASE[f'{cls.term}'], RDFS.subClassOf, Literal(parent, datatype=XSD.string))) + graph.add((BASE[f'{cls.term}'], DPV.isSubTypeOf, Literal(parent, datatype=XSD.string))) add_common_triples_for_all_terms(cls, graph) @@ -246,7 +272,8 @@ def add_triples_for_properties(properties, graph): if prop.sw_termstatus not in VOCAB_TERM_ACCEPT: continue # rdf:type - graph.add((BASE[f'{prop.term}'], RDF.type, RDF.Property)) + DEBUG(prop.term) + graph.add((BASE[f'{prop.term}'], RDF.type, DPV.Relation)) # rdfs:domain if prop.rdfs_domain: # assuming something like rdfs:Resource @@ -272,6 +299,8 @@ def add_triples_for_properties(properties, graph): if parent.startswith('http'): graph.add((BASE[f'{prop.term}'], RDFS.subPropertyOf, URIRef(parent))) elif ':' in parent: + if parent == "dpv:Relation": + continue # assuming something like rdfs:Resource prefix, term = parent.split(':') # gets the namespace from registered ones and create URI @@ -299,28 +328,59 @@ def serialize_graph(graph, filepath): 'base': { 'classes': f'{IMPORT_CSV_PATH}/BaseOntology.csv', 'properties': f'{IMPORT_CSV_PATH}/BaseOntology_properties.csv', + 'model': 'vocabulary', }, - 'personal_data_categories': { - 'classes': f'{IMPORT_CSV_PATH}/PersonalDataCategory.csv', + 'personal_data': { + 'classes': f'{IMPORT_CSV_PATH}/PersonalData.csv', + 'properties': f'{IMPORT_CSV_PATH}/PersonalData_properties.csv', + 'model': 'ontology', + 'topconcept': DPV.PersonalData, }, 'purposes': { 'classes': f'{IMPORT_CSV_PATH}/Purpose.csv', 'properties': f'{IMPORT_CSV_PATH}/Purpose_properties.csv', + 'model': 'taxonomy', + 'topconcept': DPV.Purpose, + }, + 'context': { + 'classes': f'{IMPORT_CSV_PATH}/Context.csv', + 'properties': f'{IMPORT_CSV_PATH}/Context_properties.csv', + 'model': 'taxonomy', + 'topconcept': DPV.Context, }, 'processing': { 'classes': 
f'{IMPORT_CSV_PATH}/Processing.csv', 'properties': f'{IMPORT_CSV_PATH}/Processing_properties.csv', + 'model': 'taxonomy', + 'topconcept': DPV.Processing, + }, + 'processing_context': { + 'classes': f'{IMPORT_CSV_PATH}/ProcessingContext.csv', + 'properties': f'{IMPORT_CSV_PATH}/ProcessingContext_properties.csv', + 'model': 'taxonomy', }, 'technical_organisational_measures': { 'classes': f'{IMPORT_CSV_PATH}/TechnicalOrganisationalMeasure.csv', 'properties': f'{IMPORT_CSV_PATH}/TechnicalOrganisationalMeasure_properties.csv', + 'model': 'taxonomy', + 'topconcept': DPV.TechnicalOrganisationalMeasure, }, 'entities': { 'classes': f'{IMPORT_CSV_PATH}/Entities.csv', - 'properties': f'{IMPORT_CSV_PATH}/Entities_properties.csv' + 'properties': f'{IMPORT_CSV_PATH}/Entities_properties.csv', + 'model': 'ontology', + 'topconcept': DPV.Entity, + }, + 'jurisdictions': { + 'classes': f'{IMPORT_CSV_PATH}/Jurisdictions.csv', + 'properties': f'{IMPORT_CSV_PATH}/Jurisdictions_properties.csv', + 'model': 'ontology', }, 'legal_basis': { 'classes': f'{IMPORT_CSV_PATH}/LegalBasis.csv', + 'properties': f'{IMPORT_CSV_PATH}/LegalBasis_properties.csv', + 'model': 'taxonomy', + 'topconcept': DPV.LegalBasis, }, 'consent': { # 'classes': f'{IMPORT_CSV_PATH}/Consent.csv', @@ -330,9 +390,11 @@ def serialize_graph(graph, filepath): # this graph will get written to dpv.ttl DPV_GRAPH = Graph() - +DPV_GRAPH.add((BASE[''], RDF.type, SKOS.ConceptScheme)) for name, module in DPV_CSV_FILES.items(): graph = Graph() + DEBUG('------') + DEBUG(f'Processing {name} module') for prefix, namespace in NAMESPACES.items(): graph.namespace_manager.bind(prefix, namespace) if 'classes' in module: @@ -343,29 +405,38 @@ def serialize_graph(graph, filepath): properties = extract_terms_from_csv(module['properties'], DPV_Property) DEBUG(f'there are {len(properties)} properties in {name}') add_triples_for_properties(properties, graph) + # add collection representing concepts + graph.add((BASE[f'{name.title()}Concepts'], 
RDF.type, SKOS.Collection)) + graph.add((BASE[f'{name.title()}Concepts'], DCT.title, Literal(f'{name.title()} Concepts', datatype=XSD.string))) + for concept, _, _ in graph.triples((None, RDF.type, SKOS.Concept)): + graph.add((BASE[f'{name.title()}Concepts'], SKOS.member, concept)) + DPV_GRAPH.add((concept, SKOS.inScheme, DPV[''])) + # serialize + graph.load('ontology_metadata/dpv-semantics.ttl', format='turtle') serialize_graph(graph, f'{EXPORT_DPV_MODULE_PATH}/{name}') + if 'topconcept' in module: + DPV_GRAPH.add((BASE[''], SKOS.hasTopConcept, module['topconcept'])) DPV_GRAPH += graph # add information about ontology # this is assumed to be in file dpv-ontology-metadata.ttl graph = Graph() -graph.load('dpv-ontology-metadata.ttl', format='turtle') +graph.load('ontology_metadata/dpv-gdpr.ttl', format='turtle') +graph.load('ontology_metadata/dpv-semantics.ttl', format='turtle') DPV_GRAPH += graph for prefix, namespace in NAMESPACES.items(): DPV_GRAPH.namespace_manager.bind(prefix, namespace) serialize_graph(DPV_GRAPH, f'{EXPORT_DPV_PATH}/dpv') +############################################################################## + # DPV-GDPR # # dpv-gdpr is the exact same as dpv in terms of requirements and structure # except that the namespace is different # so instead of rewriting the entire code again for dpv-gdpr, # here I become lazy and instead change the DPV namespace to DPV-GDPR -BASE = NAMESPACES['dpv-gdpr'] - -DPV_GDPR_GRAPH = Graph() - DPV_GDPR_CSV_FILES = { 'legal_basis': { 'classes': f'{IMPORT_CSV_PATH}/GDPR_LegalBasis.csv', @@ -378,8 +449,14 @@ def serialize_graph(graph, filepath): }, } +BASE = NAMESPACES['dpv-gdpr'] +DPV_GDPR_GRAPH = Graph() +DPV_GDPR_GRAPH.add((BASE[''], RDF.type, SKOS.ConceptScheme)) + for name, module in DPV_GDPR_CSV_FILES.items(): graph = Graph() + DEBUG('------') + DEBUG(f'Processing {name} module') for prefix, namespace in NAMESPACES.items(): graph.namespace_manager.bind(prefix, namespace) if 'classes' in module: @@ -390,17 +467,58 @@ 
def serialize_graph(graph, filepath): properties = extract_terms_from_csv(module['properties'], DPV_Property) DEBUG(f'there are {len(properties)} properties in {name}') add_triples_for_properties(properties, graph) + # add collection representing concepts + graph.add((BASE[f'{name.title()}Concepts'], RDF.type, SKOS.Collection)) + graph.add((BASE[f'{name.title()}Concepts'], DCT.title, Literal(f'{name.title()} Concepts', datatype=XSD.string))) + for concept, _, _ in graph.triples((None, RDF.type, SKOS.Concept)): + graph.add((BASE[f'{name.title()}Concepts'], SKOS.member, concept)) + DPV_GDPR_GRAPH.add((concept, SKOS.inScheme, DPV_GDPR[''])) + # serialize serialize_graph(graph, f'{EXPORT_DPV_GDPR_MODULE_PATH}/{name}') + if 'topconcept' in module: + DPV_GDPR_GRAPH.add((BASE[''], SKOS.hasTopConcept, module['topconcept'])) DPV_GDPR_GRAPH += graph graph = Graph() -graph.load('dpv-gdpr-ontology-metadata.ttl', format='turtle') +graph.load('ontology_metadata/dpv-gdpr.ttl', format='turtle') DPV_GDPR_GRAPH += graph for prefix, namespace in NAMESPACES.items(): DPV_GDPR_GRAPH.namespace_manager.bind(prefix, namespace) serialize_graph(DPV_GDPR_GRAPH, f'{EXPORT_DPV_GDPR_PATH}/dpv-gdpr') +############################################################################## + +# DPV-PD # +# dpv-pd is the exact same as dpv in terms of requirements and structure +# except that the namespace is different +# so instead of rewriting the entire code again for dpv-pd, +# here I become lazy and instead change the DPV namespace to DPV-PD + +DPV_PD_CSV_FILES = f'{IMPORT_CSV_PATH}/dpv-pd.csv' + +BASE = NAMESPACES['dpv-pd'] +DPV_PD_GRAPH = Graph() + +DEBUG('------') +DEBUG('Processing DPV-PD') +for prefix, namespace in NAMESPACES.items(): + DPV_PD_GRAPH.namespace_manager.bind(prefix, namespace) +classes = extract_terms_from_csv(DPV_PD_CSV_FILES, DPV_Class) +DEBUG(f'there are {len(classes)} classes in DPV-PD') +add_triples_for_classes(classes, DPV_PD_GRAPH) +# add collection representing concepts 
+DPV_PD_GRAPH.add((BASE[f'PersonalDataConcepts'], RDF.type, SKOS.Collection)) +DPV_PD_GRAPH.add((BASE[f'PersonalDataConcepts'], DCT.title, Literal(f'Personal Data Concepts', datatype=XSD.string))) +for concept, _, _ in DPV_PD_GRAPH.triples((None, RDF.type, SKOS.Concept)): + DPV_PD_GRAPH.add((BASE[f'PersonalDataConcepts'], SKOS.member, concept)) +# serialize +DPV_PD_GRAPH.load('ontology_metadata/dpv-pd.ttl', format='turtle') + +for prefix, namespace in NAMESPACES.items(): + DPV_PD_GRAPH.namespace_manager.bind(prefix, namespace) +serialize_graph(DPV_PD_GRAPH, f'{EXPORT_DPV_PD_PATH}/dpv-pd') + # ############################################################################# # Save collected links as resource for generating HTML A HREF in JINJA2 templates diff --git a/documentation-generator/002_parse_csv_to_rdf_owl.py b/documentation-generator/002_parse_csv_to_rdf_owl.py new file mode 100755 index 000000000..4903290c2 --- /dev/null +++ b/documentation-generator/002_parse_csv_to_rdf_owl.py @@ -0,0 +1,544 @@ +#!/usr/bin/env python3 +#author: Harshvardhan J. Pandit + +'''Take CSV and generate RDF from it''' + +######################################## +# How to read and understand this file # +# 1. Start from the end of the file +# 2. This script reads CSV files explicitly declared +# 3. It generates RDF terms using rdflib for Classes and Properties +# 4. It writes those terms to a file - one per each module +# 5. 
It combines all written files into dpv.ttl and dpv-gdpr.ttl + +# This script assumes the input is well structured and formatted +# If it isn't, the 'errors' may silently propagate + +# CSV FILES are in IMPORT_CSV_PATH +# RDF FILES are written to EXPORT_DPV_MODULE_PATH +######################################## + +IMPORT_CSV_PATH = './vocab_csv' +EXPORT_DPV_PATH = '../dpv-owl' +EXPORT_DPV_MODULE_PATH = '../dpv-owl/modules' +EXPORT_DPV_GDPR_PATH = '../dpv-owl/dpv-gdpr' +EXPORT_DPV_GDPR_MODULE_PATH = '../dpv-owl/dpv-gdpr/modules' +EXPORT_DPV_PD_PATH = '../dpv-owl/dpv-pd' + +# serializations in the form of extension: rdflib name +RDF_SERIALIZATIONS = { + 'rdf': 'xml', + 'ttl': 'turtle', + 'n3': 'n3', + 'jsonld': 'json-ld' + } + +VOCAB_TERM_ACCEPT = ('accepted', 'changed') +VOCAB_TERM_REJECT = ('deprecated', 'removed') + +import csv +from collections import namedtuple + +from rdflib import Graph, Namespace +from rdflib.compare import graph_diff +from rdflib.namespace import XSD +from rdflib import RDF, RDFS, OWL +from rdflib.term import Literal, URIRef, BNode + +import logging +# logging configuration for debugging to console +logging.basicConfig( + level=logging.DEBUG, format='%(levelname)s - %(funcName)s :: %(lineno)d - %(message)s') +DEBUG = logging.debug +INFO = logging.info + +# Namespaces are in two files: +# 1. Namespaces.csv for DPV issued namespaces +# 2. 
Namespaces_other for External namespaces + +DCT = Namespace('http://purl.org/dc/terms/') +FOAF = Namespace('http://xmlns.com/foaf/0.1/') +ODRL = Namespace('http://www.w3.org/ns/odrl/2/') +PROV = Namespace('http://www.w3.org/ns/prov#') +SKOS = Namespace('http://www.w3.org/2004/02/skos/core#') +SPL = Namespace('http://www.specialprivacy.eu/langs/usage-policy#') +SVD = Namespace('http://www.specialprivacy.eu/vocabs/data#') +SVDU = Namespace('http://www.specialprivacy.eu/vocabs/duration#') +SVL = Namespace('http://www.specialprivacy.eu/vocabs/locations#') +SVPR = Namespace('http://www.specialprivacy.eu/vocabs/processing#') +SVPU = Namespace('http://www.specialprivacy.eu/vocabs/purposes#') +SVR = Namespace('http://www.specialprivacy.eu/vocabs/recipients') +SW = Namespace('http://www.w3.org/2003/06/sw-vocab-status/ns#') +TIME = Namespace('http://www.w3.org/2006/time#') + +DPV = Namespace('https://w3id.org/dpv#') +DPV_NACE = Namespace('https://w3id.org/dpv/nace#') +DPV_GDPR = Namespace('https://w3id.org/dpv/gdpr#') +DPV_PD = Namespace('https://w3id.org/dpv/pd#') +DPVS = Namespace('https://w3id.org/dpv/dpv-skos#') +DPVS_GDPR = Namespace('https://w3id.org/dpv/dpv-skos/gdpr#') +DPVS_PD = Namespace('https://w3id.org/dpv/dpv-skos/pd#') +DPVO = Namespace('https://w3id.org/dpv/dpv-owl#') +DPVO_GDPR = Namespace('https://w3id.org/dpv/dpv-owl/gdpr#') +DPVO_PD = Namespace('https://w3id.org/dpv/dpv-owl/pd#') + +# The dpv namespace is the default base for all terms +# Later, this is changed to write terms under DPV-GDPR namespace +BASE = DPVO + +NAMESPACES = { + 'dct': DCT, + 'foaf': FOAF, + 'odrl': ODRL, + 'owl': OWL, + 'prov': PROV, + 'rdf': RDF, + 'rdfs': RDFS, + 'skos': SKOS, + 'spl': SPL, + 'svd': SVD, + 'svdu': SVDU, + 'svl': SVL, + 'svpr': SVPR, + 'svpu': SVPU, + 'svr': SVR, + 'sw': SW, + 'time': TIME, + 'xsd': XSD, + # DPV + 'dpv': DPV, + 'dpv-nace': DPV_NACE, + 'dpv-gdpr': DPV_GDPR, + 'dpv-pd': DPV_PD, + 'dpvs': DPVS, + 'dpvs-gdpr': DPVS_GDPR, + 'dpvs-pd': DPVS_PD, + 'dpvo': 
DPVO, + 'dpvo-gdpr': DPVO_GDPR, + 'dpvo-pd': DPVO_PD, +} +NAMESPACES_DPV_OWL = { + 'dpv': DPVO, + 'dpv-nace': DPV_NACE, + 'dpv-gdpr': DPVO_GDPR, + 'dpv-pd': DPVO_PD, + 'dpvo': DPVO, + 'dpvo-gdpr': DPVO_GDPR, + 'dpvo-pd': DPVO_PD, +} + +# the field labels are based on what they should be translated to + +DPV_Class = namedtuple('DPV_Class', [ + 'term', 'rdfs_label', 'dct_description', 'rdfs_subclassof', + 'rdfs_seealso', 'relation', 'rdfs_comment', 'rdfs_isdefinedby', + 'dct_created', 'dct_modified', 'sw_termstatus', 'dct_creator', + 'resolution']) + +DPV_Property = namedtuple('DPV_Property', [ + 'term', 'rdfs_label', 'dct_description', + 'rdfs_domain', 'rdfs_range', 'rdfs_subpropertyof', + 'rdfs_seealso', 'relation', 'rdfs_comment', 'rdfs_isdefinedby', + 'dct_created', 'dct_modified', 'sw_termstatus', 'dct_creator', + 'resolution']) + +LINKS = {} + + +def extract_terms_from_csv(filepath, Mapping): + '''extracts data from file.csv and creates instances of Class + returns list of Mapping-defined instances''' + # this is a hack to get parseable number of fields from CSV + # it relies on the internal data structure of a namedtuple + attributes = Mapping.__dict__ + attributes = len(attributes['_fields']) + with open(filepath) as fd: + csvreader = csv.reader(fd) + next(csvreader) + terms = [] + for row in csvreader: + # skip empty rows + if not row[0].strip(): + continue + # extract required amount of terms, ignore any field after that + row = [term.strip() for term in row[:attributes]] + # create instance of required class + terms.append(Mapping(*row)) + + return terms + + +def add_common_triples_for_all_terms(term, graph): + '''Adds triples for any term to graph + Common triples are those shared by Class and Property + terms: data structure of term; is object with attributes + graph: rdflib graph + returns: None''' + + # rdfs:label + graph.add((BASE[f'{term.term}'], RDFS.label, Literal(term.rdfs_label, lang='en'))) + # dct:description + graph.add((BASE[f'{term.term}'], 
DCT.description, Literal(term.dct_description, lang='en'))) + # rdfs:seeAlso + # TODO: use relation field for relevant terms + # currently this considers all terms that are related to use rdfs:seeAlso + # the next column contains the relation, parse and use that + if term.rdfs_seealso: + links = [l.strip() for l in term.rdfs_seealso.split(',')] + for link in links: + if link.startswith('http'): + graph.add((BASE[f'{term.term}'], RDFS.seeAlso, URIRef(link))) + elif ':' in link: + # assuming something like rdfs:Resource + prefix, label = link.split(':') + # gets the namespace from registered ones and create URI + # will throw an error if namespace is not registered + # dpv internal terms are expected to have the prefix i.e. dpv:term + link = NAMESPACES[prefix][f'{label}'] + graph.add((BASE[f'{term.term}'], RDFS.seeAlso, link)) + else: + graph.add((BASE[f'{term.term}'], RDFS.seeAlso, Literal(link, datatype=XSD.string))) + # rdfs:comment + if term.rdfs_comment: + graph.add((BASE[f'{term.term}'], RDFS.comment, Literal(term.rdfs_comment, lang='en'))) + # rdfs:isDefinedBy + if term.rdfs_isdefinedby: + links = [l.strip() for l in term.rdfs_isdefinedby.replace('(','').replace(')','').split(',')] + link_iterator = iter(links) + for label in link_iterator: + link = next(link_iterator) + # add link to a temp file so that the label can be displayed in HTML + if not link in LINKS: + LINKS[link] = label + # add link to graph + if link.startswith('http'): + graph.add((BASE[f'{term.term}'], DCT.source, URIRef(link))) + else: + graph.add((BASE[f'{term.term}'], DCT.source, Literal(link, datatype=XSD.string))) + # dct:created + graph.add((BASE[f'{term.term}'], DCT.created, Literal(term.dct_created, datatype=XSD.date))) + # dct:modified + if term.dct_modified: + graph.add((BASE[f'{term.term}'], DCT.modified, Literal(term.dct_modified, datatype=XSD.date))) + # sw:term_status + graph.add((BASE[f'{term.term}'], SW.term_status, Literal(term.sw_termstatus, lang='en'))) + # dct:creator + if 
term.dct_creator: + authors = [a.strip() for a in term.dct_creator.split(',')] + for author in authors: + graph.add((BASE[f'{term.term}'], DCT.creator, Literal(author, datatype=XSD.string))) + # is defined by this vocabulary + graph.add((BASE[f'{term.term}'], RDFS.isDefinedBy, BASE[''])) + # resolution + # do nothing + + return None + + +def add_triples_for_classes(classes, graph): + '''Adds triples for classes to graph + classes: list of CSV data rows + graph: rdflib graph + returns: None''' + + for cls in classes: + # only add accepted classes + if cls.sw_termstatus not in VOCAB_TERM_ACCEPT: + continue + # rdf:type + DEBUG(cls.term) + graph.add((BASE[f'{cls.term}'], RDF.type, OWL.Class)) + # rdfs:subClassOf + if cls.rdfs_subclassof: + parents = [p.strip() for p in cls.rdfs_subclassof.split(',')] + for parent in parents: + if parent == 'dpv:Concept': + continue + if parent.startswith('http'): + graph.add((BASE[f'{cls.term}'], RDFS.subClassOf, URIRef(parent))) + elif ':' in parent: + # assuming something like rdfs:Resource + prefix, term = parent.split(':') + prefix = prefix.replace("sc__", "") + # gets the namespace from registered ones and create URI + # will throw an error if namespace is not registered + # dpv internal terms are expected to have the prefix i.e. 
dpv:term + if 'o__' in prefix: + # explicit owl declaration + prefix = prefix.replace('o__', '') + parent = NAMESPACES[prefix][f'{term}'] + else: + parent = NAMESPACES_DPV_OWL[f'{prefix}'][f'{term}'] + DEBUG(f'has parent: {parent}') + graph.add((BASE[f'{cls.term}'], RDFS.subClassOf, parent)) + else: + graph.add((BASE[f'{cls.term}'], RDFS.subClassOf, Literal(parent, datatype=XSD.string))) + + add_common_triples_for_all_terms(cls, graph) + + return None + + +def add_triples_for_properties(properties, graph): + '''Adds triples for properties to graph + properties: list of CSV data rows + graph: rdflib graph + returns: None''' + + for prop in properties: + # only record accepted properties + if prop.sw_termstatus not in VOCAB_TERM_ACCEPT: + continue + # rdf:type + DEBUG(prop.term) + graph.add((BASE[f'{prop.term}'], RDF.type, RDF.Property)) + if prop.rdfs_domain or prop.rdfs_range: + graph.add((BASE[f'{prop.term}'], RDF.type, OWL.ObjectProperty)) + else: + graph.add((BASE[f'{prop.term}'], RDF.type, OWL.AnnotationProperty)) + # rdfs:domain + if prop.rdfs_domain: + # assuming something like rdfs:Resource + prefix, label = prop.rdfs_domain.split(':') + if 'o__' in prefix: + # explicit owl declaration + prefix = prefix.replace('o__', '') + link = NAMESPACES[prefix][f'{label}'] + elif prefix == 'dpv': + if label == 'Concept': + link = OWL.Thing + else: + link = NAMESPACES_DPV_OWL[f'{prefix}'][f'{label}'] + else: + link = NAMESPACES[prefix][f'{label}'] + # gets the namespace from registered ones and create URI + # will throw an error if namespace is not registered + # dpv internal terms are expected to have the prefix i.e. 
dpv:term + graph.add((BASE[f'{prop.term}'], RDFS.domain, link)) + # rdfs:range + if prop.rdfs_range: + # assuming something like rdfs:Resource + prefix, label = prop.rdfs_range.split(':') + if 'o__' in prefix: + # explicit owl declaration + prefix = prefix.replace('o__', '') + link = NAMESPACES[prefix][f'{label}'] + elif prefix == 'dpv': + if label == 'Concept': + link = OWL.Thing + else: + link = NAMESPACES_DPV_OWL[f'{prefix}'][f'{label}'] + else: + link = NAMESPACES[prefix][f'{label}'] + # gets the namespace from registered ones and create URI + # will throw an error if namespace is not registered + # dpv internal terms are expected to have the prefix i.e. dpv:term + graph.add((BASE[f'{prop.term}'], RDFS.range, link)) + # rdfs:subPropertyOf + if prop.rdfs_subpropertyof: + parents = [p.strip() for p in prop.rdfs_subpropertyof.split(',')] + for parent in parents: + if parent == 'dpv:Relation': + continue + if parent.startswith('http'): + graph.add((BASE[f'{prop.term}'], RDFS.subPropertyOf, URIRef(parent))) + elif ':' in parent: + # assuming something like rdfs:Resource + prefix, term = parent.split(':') + # gets the namespace from registered ones and create URI + # will throw an error if namespace is not registered + # dpv internal terms are expected to have the prefix i.e. 
dpv:term + if 'o__' in prefix: + # explicit owl declaration + prefix = prefix.replace('o__', '') + parent = NAMESPACES[prefix][f'{term}'] + elif prefix == 'dpv': + parent = NAMESPACES_DPV_OWL[f'{prefix}'][f'{term}'] + else: + parent = NAMESPACES[prefix][f'{term}'] + graph.add((BASE[f'{prop.term}'], RDFS.subPropertyOf, parent)) + else: + graph.add((BASE[f'{prop.term}'], RDFS.subPropertyOf, Literal(parent, datatype=XSD.string))) + add_common_triples_for_all_terms(prop, graph) + + +def serialize_graph(graph, filepath): + '''serializes given graph at filepath with defined formats''' + for ext, format in RDF_SERIALIZATIONS.items(): + graph.serialize(f'{filepath}.{ext}', format=format) + INFO(f'wrote {filepath}.{ext}') + + +# ############################################################################# + +# DPV # + +DPV_CSV_FILES = { + 'base': { + 'classes': f'{IMPORT_CSV_PATH}/BaseOntology.csv', + 'properties': f'{IMPORT_CSV_PATH}/BaseOntology_properties.csv', + 'model': 'vocabulary', + }, + 'personal_data': { + 'classes': f'{IMPORT_CSV_PATH}/PersonalData.csv', + 'properties': f'{IMPORT_CSV_PATH}/PersonalData_properties.csv', + 'model': 'ontology', + 'topconcept': DPV.PersonalData, + }, + 'purposes': { + 'classes': f'{IMPORT_CSV_PATH}/Purpose.csv', + 'properties': f'{IMPORT_CSV_PATH}/Purpose_properties.csv', + 'model': 'taxonomy', + 'topconcept': DPV.Purpose, + }, + 'context': { + 'classes': f'{IMPORT_CSV_PATH}/Context.csv', + 'properties': f'{IMPORT_CSV_PATH}/Context_properties.csv', + 'model': 'taxonomy', + 'topconcept': DPV.Context, + }, + 'processing': { + 'classes': f'{IMPORT_CSV_PATH}/Processing.csv', + 'properties': f'{IMPORT_CSV_PATH}/Processing_properties.csv', + 'model': 'taxonomy', + 'topconcept': DPV.Processing, + }, + 'processing_context': { + 'classes': f'{IMPORT_CSV_PATH}/ProcessingContext.csv', + 'properties': f'{IMPORT_CSV_PATH}/ProcessingContext_properties.csv', + 'model': 'taxonomy', + }, + 'technical_organisational_measures': { + 'classes': 
f'{IMPORT_CSV_PATH}/TechnicalOrganisationalMeasure.csv', + 'properties': f'{IMPORT_CSV_PATH}/TechnicalOrganisationalMeasure_properties.csv', + 'model': 'taxonomy', + 'topconcept': DPV.TechnicalOrganisationalMeasure, + }, + 'entities': { + 'classes': f'{IMPORT_CSV_PATH}/Entities.csv', + 'properties': f'{IMPORT_CSV_PATH}/Entities_properties.csv', + 'model': 'ontology', + 'topconcept': DPV.Entity, + }, + 'jurisdictions': { + 'classes': f'{IMPORT_CSV_PATH}/Jurisdictions.csv', + 'properties': f'{IMPORT_CSV_PATH}/Jurisdictions_properties.csv', + 'model': 'ontology', + }, + 'legal_basis': { + 'classes': f'{IMPORT_CSV_PATH}/LegalBasis.csv', + 'properties': f'{IMPORT_CSV_PATH}/LegalBasis_properties.csv', + 'model': 'taxonomy', + 'topconcept': DPV.LegalBasis, + }, + 'consent': { + # 'classes': f'{IMPORT_CSV_PATH}/Consent.csv', + 'properties': f'{IMPORT_CSV_PATH}/Consent_properties.csv', + }, + } + +# this graph will get written to dpv.ttl +DPV_GRAPH = Graph() +for name, module in DPV_CSV_FILES.items(): + graph = Graph() + DEBUG('------') + DEBUG(f'Processing {name} module') + for prefix, namespace in NAMESPACES.items(): + graph.namespace_manager.bind(prefix, namespace) + if 'classes' in module: + classes = extract_terms_from_csv(module['classes'], DPV_Class) + DEBUG(f'there are {len(classes)} classes in {name}') + add_triples_for_classes(classes, graph) + if 'properties' in module: + properties = extract_terms_from_csv(module['properties'], DPV_Property) + DEBUG(f'there are {len(properties)} properties in {name}') + add_triples_for_properties(properties, graph) + # serialize + serialize_graph(graph, f'{EXPORT_DPV_MODULE_PATH}/{name}') + DPV_GRAPH += graph + +# add information about ontology +# this is assumed to be in file dpv-ontology-metadata.ttl +graph = Graph() +graph.load('ontology_metadata/dpv-owl.ttl', format='turtle') +DPV_GRAPH += graph + +for prefix, namespace in NAMESPACES.items(): + DPV_GRAPH.namespace_manager.bind(prefix, namespace) +serialize_graph(DPV_GRAPH, 
f'{EXPORT_DPV_PATH}/dpv') + +############################################################################## + +# DPV-GDPR # +# dpv-gdpr is the exact same as dpv in terms of requirements and structure +# except that the namespace is different +# so instead of rewriting the entire code again for dpv-gdpr, +# here I become lazy and instead change the DPV namespace to DPV-GDPR + +DPV_GDPR_CSV_FILES = { + 'legal_basis': { + 'classes': f'{IMPORT_CSV_PATH}/GDPR_LegalBasis.csv', + }, + 'rights': { + 'classes': f'{IMPORT_CSV_PATH}/GDPR_LegalRights.csv', + }, + 'data_transfers': { + 'classes': f'{IMPORT_CSV_PATH}/GDPR_DataTransfers.csv', + }, + } + +BASE = NAMESPACES['dpvo-gdpr'] +DPV_GDPR_GRAPH = Graph() + +for name, module in DPV_GDPR_CSV_FILES.items(): + graph = Graph() + DEBUG('------') + DEBUG(f'Processing {name} module') + for prefix, namespace in NAMESPACES.items(): + graph.namespace_manager.bind(prefix, namespace) + if 'classes' in module: + classes = extract_terms_from_csv(module['classes'], DPV_Class) + DEBUG(f'there are {len(classes)} classes in {name}') + add_triples_for_classes(classes, graph) + if 'properties' in module: + properties = extract_terms_from_csv(module['properties'], DPV_Property) + DEBUG(f'there are {len(properties)} properties in {name}') + add_triples_for_properties(properties, graph) + # serialize + serialize_graph(graph, f'{EXPORT_DPV_GDPR_MODULE_PATH}/{name}') + DPV_GDPR_GRAPH += graph + +graph = Graph() +graph.load('ontology_metadata/dpv-owl-gdpr.ttl', format='turtle') +DPV_GDPR_GRAPH += graph + +for prefix, namespace in NAMESPACES.items(): + DPV_GDPR_GRAPH.namespace_manager.bind(prefix, namespace) +serialize_graph(DPV_GDPR_GRAPH, f'{EXPORT_DPV_GDPR_PATH}/dpv-gdpr') + +############################################################################## + +# DPV-PD # +# dpv-gdpr is the exact same as dpv in terms of requirements and structure +# except that the namespace is different +# so instead of rewriting the entire code again for dpv-gdpr, +# 
here I become lazy and instead change the DPV namespace to DPV-PD + +DPV_PD_CSV_FILES = f'{IMPORT_CSV_PATH}/dpv-pd.csv' + +BASE = NAMESPACES['dpvo-pd'] +DPV_PD_GRAPH = Graph() + +DEBUG('------') +DEBUG(f'Processing DPV-PD') +for prefix, namespace in NAMESPACES.items(): + DPV_PD_GRAPH.namespace_manager.bind(prefix, namespace) +classes = extract_terms_from_csv(DPV_PD_CSV_FILES, DPV_Class) +DEBUG(f'there are {len(classes)} classes in DPV-PD') +add_triples_for_classes(classes, DPV_PD_GRAPH) +# add ontology metadata +DPV_PD_GRAPH.load('ontology_metadata/dpv-owl-pd.ttl', format='turtle') + +for prefix, namespace in NAMESPACES.items(): + DPV_PD_GRAPH.namespace_manager.bind(prefix, namespace) +serialize_graph(DPV_PD_GRAPH, 
It combines all written files into dpv.ttl and dpv-gdpr.ttl + +# This script assumes the input is well structured and formatted +# If it isn't, errors may silently propagate + +# CSV FILES are in IMPORT_CSV_PATH +# RDF FILES are written to EXPORT_DPV_MODULE_PATH +######################################## + +IMPORT_CSV_PATH = './vocab_csv' +EXPORT_DPV_PATH = '../dpv-skos' +EXPORT_DPV_MODULE_PATH = '../dpv-skos/modules' +EXPORT_DPV_GDPR_PATH = '../dpv-skos/dpv-gdpr' +EXPORT_DPV_GDPR_MODULE_PATH = '../dpv-skos/dpv-gdpr/modules' +EXPORT_DPV_PD_PATH = '../dpv-skos/dpv-pd' + +# serializations in the form of extension: rdflib name +RDF_SERIALIZATIONS = { + 'rdf': 'xml', + 'ttl': 'turtle', + 'n3': 'n3', + 'jsonld': 'json-ld' + } + +VOCAB_TERM_ACCEPT = ('accepted', 'changed') +VOCAB_TERM_REJECT = ('deprecated', 'removed') + +import csv +from collections import namedtuple + +from rdflib import Graph, Namespace +from rdflib.compare import graph_diff +from rdflib.namespace import XSD +from rdflib import RDF, RDFS, OWL +from rdflib.term import Literal, URIRef, BNode + +import logging +# logging configuration for debugging to console +logging.basicConfig( + level=logging.DEBUG, format='%(levelname)s - %(funcName)s :: %(lineno)d - %(message)s') +DEBUG = logging.debug +INFO = logging.info + +# Namespaces are in two files: +# 1. Namespaces.csv for DPV issued namespaces +# 2. 
Namespaces_other for External namespaces + +DCT = Namespace('http://purl.org/dc/terms/') +FOAF = Namespace('http://xmlns.com/foaf/0.1/') +ODRL = Namespace('http://www.w3.org/ns/odrl/2/') +PROV = Namespace('http://www.w3.org/ns/prov#') +SKOS = Namespace('http://www.w3.org/2004/02/skos/core#') +SPL = Namespace('http://www.specialprivacy.eu/langs/usage-policy#') +SVD = Namespace('http://www.specialprivacy.eu/vocabs/data#') +SVDU = Namespace('http://www.specialprivacy.eu/vocabs/duration#') +SVL = Namespace('http://www.specialprivacy.eu/vocabs/locations#') +SVPR = Namespace('http://www.specialprivacy.eu/vocabs/processing#') +SVPU = Namespace('http://www.specialprivacy.eu/vocabs/purposes#') +SVR = Namespace('http://www.specialprivacy.eu/vocabs/recipients') +SW = Namespace('http://www.w3.org/2003/06/sw-vocab-status/ns#') +TIME = Namespace('http://www.w3.org/2006/time#') + +DPV = Namespace('https://w3id.org/dpv#') +DPV_NACE = Namespace('https://w3id.org/dpv/nace#') +DPV_GDPR = Namespace('https://w3id.org/dpv/gdpr#') +DPV_PD = Namespace('https://w3id.org/dpv/pd#') +DPVS = Namespace('https://w3id.org/dpv/dpv-skos#') +DPVS_GDPR = Namespace('https://w3id.org/dpv/dpv-skos/gdpr#') +DPVS_PD = Namespace('https://w3id.org/dpv/dpv-skos/pd#') +DPVO = Namespace('https://w3id.org/dpv/dpv-owl#') +DPVO_GDPR = Namespace('https://w3id.org/dpv/dpv-owl/gdpr#') +DPVO_PD = Namespace('https://w3id.org/dpv/dpv-owl/pd#') + +# The dpv namespace is the default base for all terms +# Later, this is changed to write terms under DPV-GDPR namespace +BASE = DPVS + +NAMESPACES = { + 'dct': DCT, + 'foaf': FOAF, + 'odrl': ODRL, + 'owl': OWL, + 'prov': PROV, + 'rdf': RDF, + 'rdfs': RDFS, + 'skos': SKOS, + 'spl': SPL, + 'svd': SVD, + 'svdu': SVDU, + 'svl': SVL, + 'svpr': SVPR, + 'svpu': SVPU, + 'svr': SVR, + 'sw': SW, + 'time': TIME, + 'xsd': XSD, + # DPV + 'dpv': DPV, + 'dpv-nace': DPV_NACE, + 'dpv-gdpr': DPV_GDPR, + 'dpv-pd': DPV_PD, + 'dpvs': DPVS, + 'dpvs-gdpr': DPVS_GDPR, + 'dpvs-pd': DPVS_PD, + 'dpvo': 
DPVO, + 'dpvo-gdpr': DPVO_GDPR, + 'dpvo-pd': DPVO_PD, +} +NAMESPACES_DPV_SKOS = { + 'dpv': DPVS, + 'dpv-nace': DPV_NACE, + 'dpv-gdpr': DPVS_GDPR, + 'dpv-pd': DPVS_PD, + 'dpvs': DPVS, + 'dpvs-gdpr': DPVS_GDPR, + 'dpvs-pd': DPVS_PD, + 'xsd': XSD, +} + +# the field labels are based on what they should be translated to + +DPV_Class = namedtuple('DPV_Class', [ + 'term', 'skos_prefLabel', 'skos_definition', 'dpv_isSubTypeOf', + 'skos_related', 'relation', 'skos_note', 'skos_scopeNote', + 'dct_created', 'dct_modified', 'sw_termstatus', 'dct_creator', + 'resolution']) +DPV_Property = namedtuple('DPV_Property', [ + 'term', 'skos_prefLabel', 'skos_definition', + 'rdfs_domain', 'rdfs_range', 'rdfs_subpropertyof', + 'skos_related', 'relation', 'skos_note', 'skos_scopeNote', + 'dct_created', 'dct_modified', 'sw_termstatus', 'dct_creator', + 'resolution']) + +LINKS = {} + + +def extract_terms_from_csv(filepath, Mapping): + '''extracts data from file.csv and creates instances of Class + returns list of Mapping-defined instances''' + # this is a hack to get parseable number of fields from CSV + # it relies on the internal data structure of a namedtuple + attributes = Mapping.__dict__ + attributes = len(attributes['_fields']) + with open(filepath) as fd: + csvreader = csv.reader(fd) + next(csvreader) + terms = [] + for row in csvreader: + # skip empty rows + if not row[0].strip(): + continue + # extract required amount of terms, ignore any field after that + row = [term.strip() for term in row[:attributes]] + # create instance of required class + terms.append(Mapping(*row)) + + return terms + + +def add_common_triples_for_all_terms(term, graph): + '''Adds triples for any term to graph + Common triples are those shared by Class and Property + terms: data structure of term; is object with attributes + graph: rdflib graph + returns: None''' + + graph.add((BASE[f'{term.term}'], RDF.type, SKOS.Concept)) + # rdfs:label + graph.add((BASE[f'{term.term}'], SKOS.prefLabel, 
Literal(term.skos_prefLabel, lang='en'))) + # skos:definition + graph.add((BASE[f'{term.term}'], SKOS.definition, Literal(term.skos_definition, lang='en'))) + # skos:related + if term.skos_related: + links = [l.strip() for l in term.skos_related.split(',')] + for link in links: + if link.startswith('http'): + graph.add((BASE[f'{term.term}'], SKOS.related, URIRef(link))) + elif ':' in link: + # assuming something like rdfs:Resource + prefix, label = link.split(':') + # gets the namespace from registered ones and create URI + # will throw an error if namespace is not registered + # dpv internal terms are expected to have the prefix i.e. dpv:term + link = NAMESPACES[prefix][f'{label}'] + graph.add((BASE[f'{term.term}'], SKOS.related, link)) + else: + graph.add((BASE[f'{term.term}'], SKOS.related, Literal(link, datatype=XSD.string))) + # skos:note + if term.skos_note: + graph.add((BASE[f'{term.term}'], SKOS.note, Literal(term.skos_note, lang='en'))) + # skos:scopeNote links recorded as dct:source + if term.skos_scopeNote: + links = [l.strip() for l in term.skos_scopeNote.replace('(','').replace(')','').split(',')] + link_iterator = iter(links) + for label in link_iterator: + link = next(link_iterator) + # add link to a temp file so that the label can be displayed in HTML + if not link in LINKS: + LINKS[link] = label + # add link to graph + if link.startswith('http'): + graph.add((BASE[f'{term.term}'], DCT.source, URIRef(link))) + else: + graph.add((BASE[f'{term.term}'], DCT.source, Literal(link, datatype=XSD.string))) + # dct:created + graph.add((BASE[f'{term.term}'], DCT.created, Literal(term.dct_created, datatype=XSD.date))) + # dct:modified + if term.dct_modified: + graph.add((BASE[f'{term.term}'], DCT.modified, Literal(term.dct_modified, datatype=XSD.date))) + # sw:term_status + graph.add((BASE[f'{term.term}'], SW.term_status, Literal(term.sw_termstatus, lang='en'))) + # dct:creator + if term.dct_creator: + authors = [a.strip() for a in term.dct_creator.split(',')] + for author in 
authors: + graph.add((BASE[f'{term.term}'], DCT.creator, Literal(author, datatype=XSD.string))) + # is defined by this vocabulary + graph.add((BASE[f'{term.term}'], RDFS.isDefinedBy, BASE[''])) + # resolution + # do nothing + + return None + + +def add_triples_for_classes(classes, graph, model, topconcept): + '''Adds triples for classes to graph + classes: list of CSV data rows + graph: rdflib graph + returns: None''' + + for cls in classes: + # only add accepted classes + if cls.sw_termstatus not in VOCAB_TERM_ACCEPT: + continue + # rdf:type + DEBUG(cls.term) + graph.add((BASE[f'{cls.term}'], RDF.type, SKOS.Concept)) + graph.add((BASE[f'{cls.term}'], RDF.type, RDFS.Class)) + topconcept_term = topconcept.split('#')[1] if topconcept else '' + if model == 'taxonomy' and cls.term != topconcept_term: + if 'dpv:Concept' not in cls.dpv_isSubTypeOf: + graph.add((BASE[f'{cls.term}'], RDF.type, topconcept)) + # rdfs:subClassOf + if cls.dpv_isSubTypeOf: + parents = [p.strip() for p in cls.dpv_isSubTypeOf.split(',')] + for parent in parents: + if parent.startswith('http'): + graph.add((BASE[f'{cls.term}'], SKOS.broaderTransitive, URIRef(parent))) + elif ':' in parent: + if parent == "dpv:Concept": + continue + # assuming something like rdfs:Resource + prefix, term = parent.split(':') + prefix = prefix.replace("sc__", "") + # gets the namespace from registered ones and create URI + # will throw an error if namespace is not registered + # dpv internal terms are expected to have the prefix i.e. 
dpv:term + parent = NAMESPACES_DPV_SKOS[prefix][f'{term}'] + graph.add((BASE[f'{cls.term}'], SKOS.broaderTransitive, parent)) + if model == 'ontology': + graph.add((BASE[f'{cls.term}'], RDFS.subClassOf, parent)) + else: + graph.add((BASE[f'{cls.term}'], SKOS.broaderTransitive, Literal(parent, datatype=XSD.string))) + + add_common_triples_for_all_terms(cls, graph) + + return None + + +def add_triples_for_properties(properties, graph): + '''Adds triples for properties to graph + properties: list of CSV data rows + graph: rdflib graph + returns: None''' + + for prop in properties: + # only record accepted properties + if prop.sw_termstatus not in VOCAB_TERM_ACCEPT: + continue + # rdf:type + DEBUG(prop.term) + graph.add((BASE[f'{prop.term}'], RDF.type, RDF.Property)) + # rdfs:domain + if prop.rdfs_domain: + # assuming something like rdfs:Resource + prefix, label = prop.rdfs_domain.split(':') + if 'o__' in prefix: + prefix = prefix.replace('o__', '') + link = NAMESPACES[prefix][f'{label}'] + elif prefix == 'dpv': + if label == 'Concept': + link = None + else: + link = NAMESPACES_DPV_SKOS[f'{prefix}'][f'{label}'] + else: + link = NAMESPACES[prefix][f'{label}'] + # gets the namespace from registered ones and create URI + # will throw an error if namespace is not registered + # dpv internal terms are expected to have the prefix i.e. 
dpv:term + if link: + graph.add((BASE[f'{prop.term}'], RDFS.domain, link)) + # rdfs:range + if prop.rdfs_range: + # assuming something like rdfs:Resource + prefix, label = prop.rdfs_range.split(':') + if 'o__' in prefix: + # explicit owl declaration + prefix = prefix.replace('o__', '') + link = NAMESPACES[prefix][f'{label}'] + elif prefix == 'dpv': + if label == 'Concept': + link = None + else: + link = NAMESPACES_DPV_SKOS[f'{prefix}'][f'{label}'] + else: + link = NAMESPACES[prefix][f'{label}'] + # gets the namespace from registered ones and create URI + # will throw an error if namespace is not registered + # dpv internal terms are expected to have the prefix i.e. dpv:term + if link: + graph.add((BASE[f'{prop.term}'], RDFS.range, link)) + # rdfs:subPropertyOf + if prop.rdfs_subpropertyof: + parents = [p.strip() for p in prop.rdfs_subpropertyof.split(',')] + for parent in parents: + if parent.startswith('http'): + graph.add((BASE[f'{prop.term}'], RDFS.subPropertyOf, URIRef(parent))) + elif ':' in parent: + if parent == "dpv:Relation": + continue + # assuming something like rdfs:Resource + prefix, term = parent.split(':') + # gets the namespace from registered ones and create URI + # will throw an error if namespace is not registered + # dpv internal terms are expected to have the prefix i.e. 
dpv:term + parent = NAMESPACES_DPV_SKOS[prefix][f'{term}'] + graph.add((BASE[f'{prop.term}'], RDFS.subPropertyOf, parent)) + else: + graph.add((BASE[f'{prop.term}'], RDFS.subPropertyOf, Literal(parent, datatype=XSD.string))) + add_common_triples_for_all_terms(prop, graph) + + +def serialize_graph(graph, filepath): + '''serializes given graph at filepath with defined formats''' + for ext, format in RDF_SERIALIZATIONS.items(): + graph.serialize(f'{filepath}.{ext}', format=format) + INFO(f'wrote {filepath}.{ext}') + + +# ############################################################################# + +# DPV # + +DPV_CSV_FILES = { + 'base': { + 'classes': f'{IMPORT_CSV_PATH}/BaseOntology.csv', + 'properties': f'{IMPORT_CSV_PATH}/BaseOntology_properties.csv', + 'model': 'vocabulary', + 'topconcept': '', + }, + 'personal_data': { + 'classes': f'{IMPORT_CSV_PATH}/PersonalData.csv', + 'properties': f'{IMPORT_CSV_PATH}/PersonalData_properties.csv', + 'model': 'ontology', + 'topconcept': BASE['PersonalData'], + }, + 'purposes': { + 'classes': f'{IMPORT_CSV_PATH}/Purpose.csv', + 'properties': f'{IMPORT_CSV_PATH}/Purpose_properties.csv', + 'model': 'taxonomy', + 'topconcept': BASE['Purpose'], + }, + 'context': { + 'classes': f'{IMPORT_CSV_PATH}/Context.csv', + 'properties': f'{IMPORT_CSV_PATH}/Context_properties.csv', + 'model': 'taxonomy', + 'topconcept': BASE['Context'], + }, + 'processing': { + 'classes': f'{IMPORT_CSV_PATH}/Processing.csv', + 'properties': f'{IMPORT_CSV_PATH}/Processing_properties.csv', + 'model': 'taxonomy', + 'topconcept': BASE['Processing'], + }, + 'processing_context': { + 'classes': f'{IMPORT_CSV_PATH}/ProcessingContext.csv', + 'properties': f'{IMPORT_CSV_PATH}/ProcessingContext_properties.csv', + 'model': 'taxonomy', + 'topconcept': BASE['Context'], + }, + 'technical_organisational_measures': { + 'classes': f'{IMPORT_CSV_PATH}/TechnicalOrganisationalMeasure.csv', + 'properties': f'{IMPORT_CSV_PATH}/TechnicalOrganisationalMeasure_properties.csv', + 
'model': 'taxonomy', + 'topconcept': BASE['TechnicalOrganisationalMeasure'], + }, + 'entities': { + 'classes': f'{IMPORT_CSV_PATH}/Entities.csv', + 'properties': f'{IMPORT_CSV_PATH}/Entities_properties.csv', + 'model': 'ontology', + 'topconcept': BASE['Entity'], + }, + 'jurisdictions': { + 'classes': f'{IMPORT_CSV_PATH}/Jurisdictions.csv', + 'properties': f'{IMPORT_CSV_PATH}/Jurisdictions_properties.csv', + 'model': 'ontology', + 'topconcept': '', + }, + 'legal_basis': { + 'classes': f'{IMPORT_CSV_PATH}/LegalBasis.csv', + 'properties': f'{IMPORT_CSV_PATH}/LegalBasis_properties.csv', + 'model': 'taxonomy', + 'topconcept': BASE['LegalBasis'], + }, + 'consent': { + # 'classes': f'{IMPORT_CSV_PATH}/Consent.csv', + 'properties': f'{IMPORT_CSV_PATH}/Consent_properties.csv', + 'model': 'vocabulary', + 'topconcept': '', + }, + } + +# this graph will get written to dpv.ttl +DPV_GRAPH = Graph() +DPV_GRAPH.add((BASE[''], RDF.type, SKOS.ConceptScheme)) +for name, module in DPV_CSV_FILES.items(): + graph = Graph() + DEBUG('------') + model = module['model'] + topconcept = module['topconcept'] + DEBUG(f'Processing {name} {model}') + for prefix, namespace in NAMESPACES.items(): + graph.namespace_manager.bind(prefix, namespace) + if 'classes' in module: + classes = extract_terms_from_csv(module['classes'], DPV_Class) + DEBUG(f'there are {len(classes)} classes in {name}') + add_triples_for_classes(classes, graph, model, topconcept) + if 'properties' in module: + properties = extract_terms_from_csv(module['properties'], DPV_Property) + DEBUG(f'there are {len(properties)} properties in {name}') + add_triples_for_properties(properties, graph) + # add collection representing concepts + graph.add((BASE[f'{name.title()}Concepts'], RDF.type, SKOS.Collection)) + graph.add((BASE[f'{name.title()}Concepts'], DCT.title, Literal(f'{name.title()} Concepts', datatype=XSD.string))) + for concept, _, _ in graph.triples((None, RDF.type, SKOS.Concept)): + graph.add((BASE[f'{name.title()}Concepts'], 
SKOS.member, concept)) + DPV_GRAPH.add((concept, SKOS.inScheme, DPV[''])) + # serialize + serialize_graph(graph, f'{EXPORT_DPV_MODULE_PATH}/{name}') + if topconcept: + DPV_GRAPH.add((BASE[''], SKOS.hasTopConcept, topconcept)) + DPV_GRAPH += graph + +# add information about ontology +# this is assumed to be in file dpv-ontology-metadata.ttl +graph = Graph() +graph.load('ontology_metadata/dpv-skos.ttl', format='turtle') +DPV_GRAPH += graph + +for prefix, namespace in NAMESPACES.items(): + DPV_GRAPH.namespace_manager.bind(prefix, namespace) +serialize_graph(DPV_GRAPH, f'{EXPORT_DPV_PATH}/dpv') + +############################################################################## + +# DPV-GDPR # +# dpv-gdpr is the exact same as dpv in terms of requirements and structure +# except that the namespace is different +# so instead of rewriting the entire code again for dpv-gdpr, +# here I become lazy and instead change the DPV namespace to DPV-GDPR + +DPV_GDPR_CSV_FILES = { + 'legal_basis': { + 'classes': f'{IMPORT_CSV_PATH}/GDPR_LegalBasis.csv', + 'model': 'taxonomy', + 'topconcept': DPVS['LegalBasis'], + }, + 'rights': { + 'classes': f'{IMPORT_CSV_PATH}/GDPR_LegalRights.csv', + 'model': 'taxonomy', + 'topconcept': DPVS['DataSubjectRight'], + }, + 'data_transfers': { + 'classes': f'{IMPORT_CSV_PATH}/GDPR_DataTransfers.csv', + 'model': 'taxonomy', + 'topconcept': DPVS['TechnicalOrganisationalMeasure'], + }, + } + +BASE = NAMESPACES['dpvs-gdpr'] +DPV_GDPR_GRAPH = Graph() +DPV_GDPR_GRAPH.add((BASE[''], RDF.type, SKOS.ConceptScheme)) + +for name, module in DPV_GDPR_CSV_FILES.items(): + graph = Graph() + DEBUG('------') + model = module['model'] + topconcept = module['topconcept'] + DEBUG(f'Processing {name} module') + for prefix, namespace in NAMESPACES.items(): + graph.namespace_manager.bind(prefix, namespace) + if 'classes' in module: + classes = extract_terms_from_csv(module['classes'], DPV_Class) + DEBUG(f'there are {len(classes)} classes in {name}') + 
add_triples_for_classes(classes, graph, model, topconcept) + if 'properties' in module: + properties = extract_terms_from_csv(module['properties'], DPV_Property) + DEBUG(f'there are {len(properties)} properties in {name}') + add_triples_for_properties(properties, graph) + # add collection representing concepts + graph.add((BASE[f'{name.title()}Concepts'], RDF.type, SKOS.Collection)) + graph.add((BASE[f'{name.title()}Concepts'], DCT.title, Literal(f'{name.title()} Concepts', datatype=XSD.string))) + for concept, _, _ in graph.triples((None, RDF.type, SKOS.Concept)): + graph.add((BASE[f'{name.title()}Concepts'], SKOS.member, concept)) + DPV_GDPR_GRAPH.add((concept, SKOS.inScheme, DPV_GDPR[''])) + # serialize + serialize_graph(graph, f'{EXPORT_DPV_GDPR_MODULE_PATH}/{name}') + if topconcept: + DPV_GDPR_GRAPH.add((BASE[''], SKOS.hasTopConcept, topconcept)) + DPV_GDPR_GRAPH += graph + +graph = Graph() +graph.load('ontology_metadata/dpv-skos-gdpr.ttl', format='turtle') +DPV_GDPR_GRAPH += graph + +for prefix, namespace in NAMESPACES.items(): + DPV_GDPR_GRAPH.namespace_manager.bind(prefix, namespace) +serialize_graph(DPV_GDPR_GRAPH, f'{EXPORT_DPV_GDPR_PATH}/dpv-gdpr') + +############################################################################## + +# DPV-PD # +# dpv-pd is the exact same as dpv in terms of requirements and structure +# except that the namespace is different +# so instead of rewriting the entire code again for dpv-pd, +# here I become lazy and instead change the DPV namespace to DPV-PD + +DPV_PD_CSV_FILES = f'{IMPORT_CSV_PATH}/dpv-pd.csv' + +BASE = NAMESPACES['dpvs-pd'] +DPV_PD_GRAPH = Graph() + +DEBUG('------') +DEBUG(f'Processing DPV-PD') +for prefix, namespace in NAMESPACES.items(): + DPV_PD_GRAPH.namespace_manager.bind(prefix, namespace) +classes = extract_terms_from_csv(DPV_PD_CSV_FILES, DPV_Class) +DEBUG(f'there are {len(classes)} classes in DPV-PD') +add_triples_for_classes(classes, DPV_PD_GRAPH, model='taxonomy', 
topconcept=DPVS['PersonalData']) +# add collection representing concepts +DPV_PD_GRAPH.add((BASE[f'PersonalDataConcepts'], RDF.type, SKOS.Collection)) +DPV_PD_GRAPH.add((BASE[f'PersonalDataConcepts'], DCT.title, Literal(f'Personal Data Concepts', datatype=XSD.string))) +for concept, _, _ in DPV_PD_GRAPH.triples((None, RDF.type, SKOS.Concept)): + DPV_PD_GRAPH.add((BASE[f'PersonalDataConcepts'], SKOS.member, concept)) +# add ontology metadata +DPV_PD_GRAPH.load('ontology_metadata/dpv-skos-pd.ttl', format='turtle') + +for prefix, namespace in NAMESPACES.items(): + DPV_PD_GRAPH.namespace_manager.bind(prefix, namespace) +serialize_graph(DPV_PD_GRAPH, f'{EXPORT_DPV_PD_PATH}/dpv-pd') + +# ############################################################################# + +# Save collected links as resource for generating HTML A HREF in JINJA2 templates +# file is in jinja2_resources/links_label.json + +import json +with open('jinja2_resources/links_label.json', 'w') as fd: + fd.write(json.dumps(LINKS)) \ No newline at end of file diff --git a/documentation-generator/003_generate_respec_html.py b/documentation-generator/003_generate_respec_html.py index 55cc9b43c..d633ee3bb 100755 --- a/documentation-generator/003_generate_respec_html.py +++ b/documentation-generator/003_generate_respec_html.py @@ -6,11 +6,13 @@ # The vocabularies are modular IMPORT_DPV_PATH = '../dpv/dpv.ttl' -IMPORT_DPV_MODULES_PATH = '../dpv/rdf' +IMPORT_DPV_MODULES_PATH = '../dpv/modules' EXPORT_DPV_HTML_PATH = '../dpv' IMPORT_DPV_GDPR_PATH = '../dpv-gdpr/dpv-gdpr.ttl' -IMPORT_DPV_GDPR_MODULES_PATH = '../dpv-gdpr/rdf' +IMPORT_DPV_GDPR_MODULES_PATH = '../dpv-gdpr/modules' EXPORT_DPV_GDPR_HTML_PATH = '../dpv-gdpr' +IMPORT_DPV_PD_PATH = '../dpv-pd/dpv-pd.ttl' +EXPORT_DPV_PD_HTML_PATH = '../dpv-pd' from rdflib import Graph, Namespace from rdflib import RDF, RDFS, OWL @@ -38,8 +40,8 @@ def load_data(label, filepath): G = DataGraph() G.load(g) G.graph.ns = { k:v for k,v in G.graph.namespaces() } - 
TEMPLATE_DATA[f'{label}_classes'] = G.get_instances_of('rdfs_Class') - TEMPLATE_DATA[f'{label}_properties'] = G.get_instances_of('rdf_Property') + TEMPLATE_DATA[f'{label}_classes'] = G.get_instances_of('dpv_Concept') + TEMPLATE_DATA[f'{label}_properties'] = G.get_instances_of('dpv_Relation') def prefix_this(item): @@ -96,11 +98,13 @@ def saved_label(item): # LOAD DATA load_data('core', f'{IMPORT_DPV_MODULES_PATH}/base.ttl') -load_data('personaldata', f'{IMPORT_DPV_MODULES_PATH}/personal_data_categories.ttl') +load_data('personaldata', f'{IMPORT_DPV_MODULES_PATH}/personal_data.ttl') load_data('purpose', f'{IMPORT_DPV_MODULES_PATH}/purposes.ttl') load_data('processing', f'{IMPORT_DPV_MODULES_PATH}/processing.ttl') load_data('technical_organisational_measures', f'{IMPORT_DPV_MODULES_PATH}/technical_organisational_measures.ttl') load_data('entities', f'{IMPORT_DPV_MODULES_PATH}/entities.ttl') +load_data('context', f'{IMPORT_DPV_MODULES_PATH}/context.ttl') +load_data('jurisdictions', f'{IMPORT_DPV_MODULES_PATH}/jurisdictions.ttl') load_data('legal_basis', f'{IMPORT_DPV_MODULES_PATH}/legal_basis.ttl') load_data('consent', f'{IMPORT_DPV_MODULES_PATH}/consent.ttl') g = Graph() @@ -134,4 +138,21 @@ def saved_label(item): fd.write(template.render(**TEMPLATE_DATA)) DEBUG(f'wrote DPV-GDPR spec at f{EXPORT_DPV_GDPR_HTML_PATH}/dpv-gdpr.html') + +# DPV-PD: generate HTML + +load_data('dpv_pd', f'{IMPORT_DPV_PD_PATH}') +g = Graph() +g.load(f'{IMPORT_DPV_PD_PATH}', format='turtle') +G.load(g) + +template = template_env.get_template('template_dpv_pd.jinja2') +with open(f'{EXPORT_DPV_PD_HTML_PATH}/index.html', 'w+') as fd: + fd.write(template.render(**TEMPLATE_DATA)) +DEBUG(f'wrote DPV-PD spec at {EXPORT_DPV_PD_HTML_PATH}/index.html') +with open(f'{EXPORT_DPV_PD_HTML_PATH}/dpv-pd.html', 'w+') as fd: + fd.write(template.render(**TEMPLATE_DATA)) +DEBUG(f'wrote DPV-PD spec at {EXPORT_DPV_PD_HTML_PATH}/dpv-pd.html') + + DEBUG('--- END ---') \ No newline at end of file diff --git 
a/documentation-generator/003_generate_respec_html_owl.py b/documentation-generator/003_generate_respec_html_owl.py new file mode 100755 index 000000000..d669579af --- /dev/null +++ b/documentation-generator/003_generate_respec_html_owl.py @@ -0,0 +1,158 @@ +#!/usr/bin/env python3 +#author: Harshvardhan J. Pandit + +'''Generates ReSpec documentation for DPV using RDF and SPARQL''' + +# The vocabularies are modular + +IMPORT_DPV_PATH = '../dpv-owl/dpv.ttl' +IMPORT_DPV_MODULES_PATH = '../dpv-owl/modules' +EXPORT_DPV_HTML_PATH = '../dpv-owl' +IMPORT_DPV_GDPR_PATH = '../dpv-owl/dpv-gdpr/dpv-gdpr.ttl' +IMPORT_DPV_GDPR_MODULES_PATH = '../dpv-owl/dpv-gdpr/modules' +EXPORT_DPV_GDPR_HTML_PATH = '../dpv-owl/dpv-gdpr' +IMPORT_DPV_PD_PATH = '../dpv-owl/dpv-pd/dpv-pd.ttl' +EXPORT_DPV_PD_HTML_PATH = '../dpv-owl/dpv-pd' + +from rdflib import Graph, Namespace +from rdflib import RDF, RDFS, OWL +from rdflib import URIRef +from rdform import DataGraph, RDFS_Resource +import logging +# logging configuration for debugging to console +logging.basicConfig( + level=logging.DEBUG, format='%(levelname)s - %(funcName)s :: %(lineno)d - %(message)s') +DEBUG = logging.debug + +TEMPLATE_DATA = {} + +G = DataGraph() + +with open('./jinja2_resources/links_label.json', 'r') as fd: + import json + LINKS_LABELS = json.load(fd) + + +def load_data(label, filepath): + DEBUG(f'loading data for {label}') + g = Graph() + g.load(filepath, format='turtle') + G = DataGraph() + G.load(g) + G.graph.ns = { k:v for k,v in G.graph.namespaces() } + TEMPLATE_DATA[f'{label}_classes'] = G.get_instances_of('owl_Class') + TEMPLATE_DATA[f'{label}_properties'] = G.get_instances_of('rdf_Property') + + +def prefix_this(item): + # DEBUG(f'item: {item} type: {type(item)}') + if type(item) is RDFS_Resource: + item = item.iri + elif type(item) is URIRef: + item = str(item) + if type(item) is str and item.startswith('http'): + iri = URIRef(item).n3(G.graph.namespace_manager) + else: + iri = item + if iri.count('_') > 0: + iri = 
iri.split('_', 1)[1] + # DEBUG(f'prefixed {item} to: {iri}') + return iri + + +def fragment_this(item): + if '#' not in item: + return item + return item.split('#')[-1] + + +def get_subclasses(item): + return G.subclasses(item) + + +def saved_label(item): + if type(item) is RDFS_Resource: + item = item.iri + if not type(item) is str: + item = str(item) + if item in LINKS_LABELS: + return LINKS_LABELS[item] + return item + + +# JINJA2 for templating and generating HTML +from jinja2 import FileSystemLoader, Environment +JINJA2_FILTERS = { + 'fragment_this': fragment_this, + 'prefix_this': prefix_this, + 'subclasses': get_subclasses, + 'saved_label': saved_label, +} + +template_loader = FileSystemLoader(searchpath='./jinja2_resources') +template_env = Environment( + loader=template_loader, + autoescape=True, trim_blocks=True, lstrip_blocks=True) +template_env.filters.update(JINJA2_FILTERS) + + +# LOAD DATA +load_data('core', f'{IMPORT_DPV_MODULES_PATH}/base.ttl') +load_data('personaldata', f'{IMPORT_DPV_MODULES_PATH}/personal_data.ttl') +load_data('purpose', f'{IMPORT_DPV_MODULES_PATH}/purposes.ttl') +load_data('processing', f'{IMPORT_DPV_MODULES_PATH}/processing.ttl') +load_data('technical_organisational_measures', f'{IMPORT_DPV_MODULES_PATH}/technical_organisational_measures.ttl') +load_data('entities', f'{IMPORT_DPV_MODULES_PATH}/entities.ttl') +load_data('context', f'{IMPORT_DPV_MODULES_PATH}/context.ttl') +load_data('jurisdictions', f'{IMPORT_DPV_MODULES_PATH}/jurisdictions.ttl') +load_data('legal_basis', f'{IMPORT_DPV_MODULES_PATH}/legal_basis.ttl') +load_data('consent', f'{IMPORT_DPV_MODULES_PATH}/consent.ttl') +g = Graph() +g.load(f'{IMPORT_DPV_PATH}', format='turtle') +G.load(g) + +# DPV: generate HTML + +template = template_env.get_template('template_dpv_owl.jinja2') +with open(f'{EXPORT_DPV_HTML_PATH}/index.html', 'w+') as fd: + fd.write(template.render(**TEMPLATE_DATA)) +DEBUG(f'wrote DPV spec at {EXPORT_DPV_HTML_PATH}/index.html') +with 
open(f'{EXPORT_DPV_HTML_PATH}/dpv.html', 'w+') as fd: + fd.write(template.render(**TEMPLATE_DATA)) +DEBUG(f'wrote DPV spec at {EXPORT_DPV_HTML_PATH}/dpv.html') + +# DPV-GDPR: generate HTML + +load_data('legal_basis', f'{IMPORT_DPV_GDPR_MODULES_PATH}/legal_basis.ttl') +load_data('rights', f'{IMPORT_DPV_GDPR_MODULES_PATH}/rights.ttl') +load_data('data_transfers', f'{IMPORT_DPV_GDPR_MODULES_PATH}/data_transfers.ttl') +g = Graph() +g.load(f'{IMPORT_DPV_GDPR_PATH}', format='turtle') +G.load(g) + +template = template_env.get_template('template_dpv_gdpr_owl.jinja2') +with open(f'{EXPORT_DPV_GDPR_HTML_PATH}/index.html', 'w+') as fd: + fd.write(template.render(**TEMPLATE_DATA)) +DEBUG(f'wrote DPV-GDPR spec at {EXPORT_DPV_GDPR_HTML_PATH}/index.html') +with open(f'{EXPORT_DPV_GDPR_HTML_PATH}/dpv-gdpr.html', 'w+') as fd: + fd.write(template.render(**TEMPLATE_DATA)) +DEBUG(f'wrote DPV-GDPR spec at {EXPORT_DPV_GDPR_HTML_PATH}/dpv-gdpr.html') + + +# DPV-PD: generate HTML + +load_data('dpv_pd', f'{IMPORT_DPV_PD_PATH}') +g = Graph() +g.load(f'{IMPORT_DPV_PD_PATH}', format='turtle') +G.load(g) + +template = template_env.get_template('template_dpv_pd_owl.jinja2') +with open(f'{EXPORT_DPV_PD_HTML_PATH}/index.html', 'w+') as fd: + fd.write(template.render(**TEMPLATE_DATA)) +DEBUG(f'wrote DPV-PD spec at {EXPORT_DPV_PD_HTML_PATH}/index.html') +with open(f'{EXPORT_DPV_PD_HTML_PATH}/dpv-pd.html', 'w+') as fd: + fd.write(template.render(**TEMPLATE_DATA)) +DEBUG(f'wrote DPV-PD spec at {EXPORT_DPV_PD_HTML_PATH}/dpv-pd.html') + + +DEBUG('--- END ---') \ No newline at end of file diff --git a/documentation-generator/003_generate_respec_html_skos.py b/documentation-generator/003_generate_respec_html_skos.py new file mode 100755 index 000000000..681130066 --- /dev/null +++ b/documentation-generator/003_generate_respec_html_skos.py @@ -0,0 +1,158 @@ +#!/usr/bin/env python3 +#author: Harshvardhan J. 
Pandit + +'''Generates ReSpec documentation for DPV using RDF and SPARQL''' + +# The vocabularies are modular + +IMPORT_DPV_PATH = '../dpv-skos/dpv.ttl' +IMPORT_DPV_MODULES_PATH = '../dpv-skos/modules' +EXPORT_DPV_HTML_PATH = '../dpv-skos' +IMPORT_DPV_GDPR_PATH = '../dpv-skos/dpv-gdpr/dpv-gdpr.ttl' +IMPORT_DPV_GDPR_MODULES_PATH = '../dpv-skos/dpv-gdpr/modules' +EXPORT_DPV_GDPR_HTML_PATH = '../dpv-skos/dpv-gdpr' +IMPORT_DPV_PD_PATH = '../dpv-skos/dpv-pd/dpv-pd.ttl' +EXPORT_DPV_PD_HTML_PATH = '../dpv-skos/dpv-pd' + +from rdflib import Graph, Namespace +from rdflib import RDF, RDFS, OWL +from rdflib import URIRef +from rdform import DataGraph, RDFS_Resource +import logging +# logging configuration for debugging to console +logging.basicConfig( + level=logging.DEBUG, format='%(levelname)s - %(funcName)s :: %(lineno)d - %(message)s') +DEBUG = logging.debug + +TEMPLATE_DATA = {} + +G = DataGraph() + +with open('./jinja2_resources/links_label.json', 'r') as fd: + import json + LINKS_LABELS = json.load(fd) + + +def load_data(label, filepath): + DEBUG(f'loading data for {label}') + g = Graph() + g.load(filepath, format='turtle') + G = DataGraph() + G.load(g) + G.graph.ns = { k:v for k,v in G.graph.namespaces() } + TEMPLATE_DATA[f'{label}_classes'] = G.get_instances_of('rdfs_Class') + TEMPLATE_DATA[f'{label}_properties'] = G.get_instances_of('rdf_Property') + + +def prefix_this(item): + # DEBUG(f'item: {item} type: {type(item)}') + if type(item) is RDFS_Resource: + item = item.iri + elif type(item) is URIRef: + item = str(item) + if type(item) is str and item.startswith('http'): + iri = URIRef(item).n3(G.graph.namespace_manager) + else: + iri = item + if iri.count('_') > 0: + iri = iri.split('_', 1)[1] + # DEBUG(f'prefixed {item} to: {iri}') + return iri + + +def fragment_this(item): + if '#' not in item: + return item + return item.split('#')[-1] + + +def get_subclasses(item): + return G.subclasses(item) + + +def saved_label(item): + if type(item) is RDFS_Resource: + item = 
item.iri + if not type(item) is str: + item = str(item) + if item in LINKS_LABELS: + return LINKS_LABELS[item] + return item + + +# JINJA2 for templating and generating HTML +from jinja2 import FileSystemLoader, Environment +JINJA2_FILTERS = { + 'fragment_this': fragment_this, + 'prefix_this': prefix_this, + 'subclasses': get_subclasses, + 'saved_label': saved_label, +} + +template_loader = FileSystemLoader(searchpath='./jinja2_resources') +template_env = Environment( + loader=template_loader, + autoescape=True, trim_blocks=True, lstrip_blocks=True) +template_env.filters.update(JINJA2_FILTERS) + + +# LOAD DATA +load_data('core', f'{IMPORT_DPV_MODULES_PATH}/base.ttl') +load_data('personaldata', f'{IMPORT_DPV_MODULES_PATH}/personal_data.ttl') +load_data('purpose', f'{IMPORT_DPV_MODULES_PATH}/purposes.ttl') +load_data('processing', f'{IMPORT_DPV_MODULES_PATH}/processing.ttl') +load_data('technical_organisational_measures', f'{IMPORT_DPV_MODULES_PATH}/technical_organisational_measures.ttl') +load_data('entities', f'{IMPORT_DPV_MODULES_PATH}/entities.ttl') +load_data('context', f'{IMPORT_DPV_MODULES_PATH}/context.ttl') +load_data('jurisdictions', f'{IMPORT_DPV_MODULES_PATH}/jurisdictions.ttl') +load_data('legal_basis', f'{IMPORT_DPV_MODULES_PATH}/legal_basis.ttl') +load_data('consent', f'{IMPORT_DPV_MODULES_PATH}/consent.ttl') +g = Graph() +g.load(f'{IMPORT_DPV_PATH}', format='turtle') +G.load(g) + +# DPV: generate HTML + +template = template_env.get_template('template_dpv_skos.jinja2') +with open(f'{EXPORT_DPV_HTML_PATH}/index.html', 'w+') as fd: + fd.write(template.render(**TEMPLATE_DATA)) +DEBUG(f'wrote DPV spec at {EXPORT_DPV_HTML_PATH}/index.html') +with open(f'{EXPORT_DPV_HTML_PATH}/dpv.html', 'w+') as fd: + fd.write(template.render(**TEMPLATE_DATA)) +DEBUG(f'wrote DPV spec at {EXPORT_DPV_HTML_PATH}/dpv.html') + +# DPV-GDPR: generate HTML + +load_data('legal_basis', f'{IMPORT_DPV_GDPR_MODULES_PATH}/legal_basis.ttl') +load_data('rights', 
f'{IMPORT_DPV_GDPR_MODULES_PATH}/rights.ttl') +load_data('data_transfers', f'{IMPORT_DPV_GDPR_MODULES_PATH}/data_transfers.ttl') +g = Graph() +g.load(f'{IMPORT_DPV_GDPR_PATH}', format='turtle') +G.load(g) + +template = template_env.get_template('template_dpv_gdpr_skos.jinja2') +with open(f'{EXPORT_DPV_GDPR_HTML_PATH}/index.html', 'w+') as fd: + fd.write(template.render(**TEMPLATE_DATA)) +DEBUG(f'wrote DPV-GDPR spec at {EXPORT_DPV_GDPR_HTML_PATH}/index.html') +with open(f'{EXPORT_DPV_GDPR_HTML_PATH}/dpv-gdpr.html', 'w+') as fd: + fd.write(template.render(**TEMPLATE_DATA)) +DEBUG(f'wrote DPV-GDPR spec at {EXPORT_DPV_GDPR_HTML_PATH}/dpv-gdpr.html') + + +# DPV-PD: generate HTML + +load_data('dpv_pd', f'{IMPORT_DPV_PD_PATH}') +g = Graph() +g.load(f'{IMPORT_DPV_PD_PATH}', format='turtle') +G.load(g) + +template = template_env.get_template('template_dpv_pd_skos.jinja2') +with open(f'{EXPORT_DPV_PD_HTML_PATH}/index.html', 'w+') as fd: + fd.write(template.render(**TEMPLATE_DATA)) +DEBUG(f'wrote DPV-PD spec at {EXPORT_DPV_PD_HTML_PATH}/index.html') +with open(f'{EXPORT_DPV_PD_HTML_PATH}/dpv-pd.html', 'w+') as fd: + fd.write(template.render(**TEMPLATE_DATA)) +DEBUG(f'wrote DPV-PD spec at {EXPORT_DPV_PD_HTML_PATH}/dpv-pd.html') + + +DEBUG('--- END ---') \ No newline at end of file diff --git a/documentation-generator/903_html.sh b/documentation-generator/903_html.sh new file mode 100755 index 000000000..8959b97c9 --- /dev/null +++ b/documentation-generator/903_html.sh @@ -0,0 +1,10 @@ +#!/usr/bin/env bash +#author: Harshvardhan J. Pandit + +#execute DPV HTML generation scripts + +# Step3: generate HTML +./003_generate_respec_html.py +./003_generate_respec_html_skos.py +./003_generate_respec_html_owl.py + diff --git a/documentation-generator/999_all.sh b/documentation-generator/999_all.sh new file mode 100755 index 000000000..f3a349d66 --- /dev/null +++ b/documentation-generator/999_all.sh @@ -0,0 +1,18 @@ +#!/usr/bin/env bash +#author: Harshvardhan J. 
Pandit + +#execute all DPV generation scripts + +# Step1: download files +# ./001_download_vocab_in_csv.py + +# Step2: generate RDF +./002_parse_csv_to_rdf.py +./002_parse_csv_to_rdf_skos.py +./002_parse_csv_to_rdf_owl.py + +# Step3: generate HTML +./003_generate_respec_html.py +./003_generate_respec_html_skos.py +./003_generate_respec_html_owl.py + diff --git a/documentation-generator/README.md b/documentation-generator/README.md index 62d534662..99a872937 100644 --- a/documentation-generator/README.md +++ b/documentation-generator/README.md @@ -4,7 +4,7 @@ Downloads the CSV data for DPV and other vocabularies (such as DPV-GDPR), conver Requires: `python3` and modules `rdflib`, `rdflib-jsonld`, `jinja2` -The Data Privacy Vocabulary (DPV) is available at https://www.w3.org/ns/dpv and its repository is at https://github.com/w3c/dpv. +The Data Privacy Vocabulary (DPV) is available at https://www.w3id.org/dpv and its repository is at https://github.com/w3c/dpv. ## Quick Summary diff --git a/documentation-generator/changelog.py b/documentation-generator/changelog.py index 9d6e6b6db..39eb9375a 100755 --- a/documentation-generator/changelog.py +++ b/documentation-generator/changelog.py @@ -9,8 +9,8 @@ GITHUB_DPV_RAW = f'{GITHUB_REPO_RAW}dpv/rdf/' GITHUB_GDPR_RAW = f'{GITHUB_REPO_RAW}dpv-gdpr/rdf/' -LOCAL_DPV = '../dpv/rdf/' -LOCAL_GDPR = '../dpv-gdpr/rdf/' +LOCAL_DPV = '../dpv/modules/' +LOCAL_GDPR = '../dpv-gdpr/modules/' DPV_MODULES = ( 'base', @@ -68,7 +68,7 @@ def compare_iterations(old, new): old = download_file_to_rdf_graph(f'{GITHUB_DPV_RAW}{module}.ttl') new = Graph() new.load(f'{LOCAL_DPV}{module}.ttl', format='turtle') - added, removed, changed = compare_iterations(old, new) + removed, added, changed = compare_iterations(old, new) print(f'added: {len(added)} ; removed: {len(removed)} ; changed: {len(changed)}') if removed: print('\nTerms Removed') @@ -92,7 +92,7 @@ def compare_iterations(old, new): old = 
download_file_to_rdf_graph(f'{GITHUB_GDPR_RAW}{module}.ttl') new = Graph() new.load(f'{LOCAL_GDPR}{module}.ttl', format='turtle') - added, removed, changed = compare_iterations(old, new) + removed, added, changed = compare_iterations(old, new) print(f'added: {len(added)} ; removed: {len(removed)} ; changed: {len(changed)}') if removed: print('\nTerms Removed') diff --git a/documentation-generator/changelog.txt b/documentation-generator/changelog.txt index d71ef9354..00f07ae3d 100644 --- a/documentation-generator/changelog.txt +++ b/documentation-generator/changelog.txt @@ -1,6 +1,20 @@ --- DPV --- MODULE: base -added: 0 ; removed: 0 ; changed: 0 +added: 12 ; removed: 0 ; changed: 0 + +Terms Added +TechnicalOrganisationalMeasure +DataController +DataSubject +Risk +Purpose +DataSubjectRight +PersonalDataCategory +Right +Recipient +PersonalDataHandling +LegalBasis +Processing MODULE: personal_data_categories @@ -8,177 +22,290 @@ added: 0 ; removed: 0 ; changed: 0 MODULE: purposes -added: 27 ; removed: 5 ; changed: 9 - -Terms Removed -UsageAnalytics -SellTargettedAdvertisements -Security -CommercialInterest -AccessControl +added: 64 ; removed: 0 ; changed: 0 Terms Added -EnforceAccessControl -Personalisation -VendorSelectionAssessment -ServiceRecordManagement -CustomerOrderManagement -RecordManagement -CustomerSolvencyMonitoring -TechnicalServiceProvision -PublicRelations -CustomerRelationshipManagement -VendorRecordsManagement -SellProducts -EnforceSecurity +ImproveInternalCRMProcesses RequestedServiceProvision -MemberPartnerManagement +DirectMarketing +NonCommercialResearch +DeliveryOfGoods +ResearchAndDevelopment +ServiceProvision +SellProductsToDataSubject +InternalResourceOptimisation +ServicePersonalization +CommercialResearch +SocialMediaMarketing VendorPayment -VendorManagement -AccountManagement -CustomerManagement -OrganisationComplianceManagement +CustomerOrderManagement ServiceUsageAnalytics -CommunicationManagement +VendorSelectionAssessment 
+PublicRelations HumanResourceManagement +CommunicationManagement +SellProducts OrganisationRiskManagement -OrganisationGovernance +Payment +Advertising +SellInsightsFromData +ServiceRecordManagement +CustomerManagement +OrganisationComplianceManagement +AcademicResearch +Marketing +UserInterfacePersonalisation CustomerClaimsManagement -DisputeManagement - -Terms Changed EnforceAccessControl -EnforceSecurity -Marketing -SellDataToThirdParties -SellInsightsFromData -SellProductsToDataSubject ServiceOptimization -ServicePersonalization -ServiceUsageAnalytics +FraudPreventionAndDetection +MemberPartnerManagement +OptimisationForConsumer +SellDataToThirdParties +CustomerCare +Personalisation +CreateEventRecommendations +AccountManagement +RecordManagement +RegistrationAuthentication +VendorManagement +EnforceSecurity +CommunicationForCustomerCare +PersonalisedBenefits +OptimisationForController +VendorRecordsManagement +Sector +CreatePersonalizedRecommendations +CustomerRelationshipManagement +LegalCompliance +OptimiseUserInterface +IncreaseServiceRobustness +CreateProductRecommendations +ImproveExistingProductsAndServices +Context +DisputeManagement +PersonalisedAdvertising +IdentityVerification +OrganisationGovernance +TechnicalServiceProvision +CustomerSolvencyMonitoring MODULE: processing -added: 0 ; removed: 0 ; changed: 1 +added: 40 ; removed: 0 ; changed: 0 -Terms Changed -hasConsequence +Terms Added +Retrieve +Disclose +Remove +Transfer +Restrict +Disseminate +Organise +Combine +MatchingCombining +Acquire +Copy +Alter +Obtain +MakeAvailable +Profiling +Collect +Derive +LargeScaleProcessing +Store +Transform +Align +Destruct +SystematicMonitoring +Transmit +InnovativeUseOfNewTechnologies +Erase +Adapt +DataSource +DiscloseByTransmission +AutomatedDecisionMaking +Analyse +PseudoAnonymise +Record +Move +EvaluationScoring +Share +Consult +Structure +Use +Anonymise MODULE: technical_organisational_measures -added: 12 ; removed: 1 ; changed: 1 - -Terms Removed 
-Contract +added: 48 ; removed: 0 ; changed: 0 Terms Added +DPIA +LegalAgreement +NDA +DeIdentification +ConsultationWithAuthority +RegisterOfProcessingActivities +TechnicalMeasure +RegularityOfRecertification +DataTransferImpactAssessment +PrivacyByDesign +CodeOfConduct +Certification +PrivacyByDefault +AuthorisationProcedure +Seal +OrganisationalMeasure Notice -DataProcessingRecords -RecordsOfActivities -ContractualTerms -Policy +StaffTraining +PIA +DesignStandard Assessment -LegitimateInterestAssessment +ContractualTerms +StorageDeletion +RiskMitigationMeasure +AccessControlMethod +RecordsOfActivities +AuthenticationProtocols SafeguardForDataTransfer -RegisterOfProcessingActivities -DataTransferImpactAssessment +CertificationSeal +StorageDuration +Anonymization +LegitimateInterestAssessment PrivacyNotice +DataProcessingRecords +GuidelinesPrinciple +PseudonymisationEncryption +StorageLocation Safeguard - -Terms Changed -ContractualTerms +EncryptionInRest +StorageRestriction +RiskManagementProcedure +StorageRestoration +SingleSignOn +EncryptionInTransfer +ImpactAssessment +PseudoAnonymization +Policy +Consultation MODULE: entities -added: 2 ; removed: 0 ; changed: 0 +added: 12 ; removed: 0 ; changed: 0 Terms Added +VulnerableDataSubject +ThirdParty +DataProcessor +Authority +Child DataImporter DataExporter +DataProtectionAuthority +LegalEntity +Representative +DataSubProcessor +DataProtectionOfficer MODULE: legal_basis added: 14 ; removed: 0 ; changed: 0 Terms Added +VitalInterestOfNaturalPerson +OfficialAuthorityOfController VitalInterestOfDataSubject +Contract +DataTransferLegalBasis +EnterIntoContract +LegitimateInterestOfController LegitimateInterest +LegalObligation LegitimateInterestOfThirdParty -OfficialAuthorityOfController -Consent -EnterIntoContract -ContractPerformance -Contract -VitalInterestOfNaturalPerson PublicInterest -LegalObligation VitalInterest -LegitimateInterestOfController -DataTransferLegalBasis +Consent +ContractPerformance MODULE: consent 
-added: 0 ; removed: 1 ; changed: 0 - -Terms Removed -Consent +added: 0 ; removed: 0 ; changed: 0 --- DPV-GDPR --- MODULE: legal_basis -added: 0 ; removed: 0 ; changed: 30 +added: 34 ; removed: 0 ; changed: 0 -Terms Changed -A45-3 -A46-2-a -A46-2-b +Terms Added +A9-2-f +A6-1-a-non-explicit-consent A46-2-c -A46-2-d -A46-2-e -A46-2-f -A46-3-a +A9-2-j +A9-2-a +A9-2-b A46-3-b -A49-1-a -A49-1-b A49-1-c -A49-1-d +A9-2-i +A6-1-c +A45-3 A49-1-e -A49-1-f -A49-1-g A49-2 -A6-1-a-explicit-consent -A6-1-a-non-explicit-consent -A6-1-b -A6-1-c +A46-2-b +A9-2-g +A49-1-a +A9-2-e A6-1-d A6-1-e -A6-1-f -A9-2-a +A46-2-d +A6-1-a-explicit-consent +A49-1-d A9-2-c +A49-1-g +A49-1-f +A6-1-f A9-2-d -A9-2-g -A9-2-i -A9-2-j +A6-1-b +A46-2-a +A46-3-a +A49-1-b +A9-2-h +A46-2-e +A46-2-f MODULE: rights -added: 0 ; removed: 0 ; changed: 0 +added: 12 ; removed: 0 ; changed: 0 + +Terms Added +A22 +A17 +A15 +A77 +A16 +A18 +A7-3 +A21 +A13 +A14 +A20 +A19 MODULE: data_transfers added: 9 ; removed: 0 ; changed: 0 Terms Added -SupplementaryMeasure +SCCByCommission +SCCBySupervisoryAuthority CertificationMechanismsForDataTransfers StandardContractualClauses +AdHocContractualClauses CodesOfConductForDataTransfers +SupplementaryMeasure BindingCorporateRules DataTransferTool -SCCByCommission -AdHocContractualClauses -SCCBySupervisoryAuthority diff --git a/documentation-generator/jinja2_resources/links_label.json b/documentation-generator/jinja2_resources/links_label.json index aee13a05e..469cc8f06 100644 --- a/documentation-generator/jinja2_resources/links_label.json +++ b/documentation-generator/jinja2_resources/links_label.json @@ -1 +1 @@ -{"https://www.specialprivacy.eu/": "SPECIAL Project", "https://eur-lex.europa.eu/eli/reg/2016/679/art_4/par_1/oj": "GDPR Art.4-1g", "https://eur-lex.europa.eu/eli/reg/2016/679/art_4/par_9/oj": "GDPR Art.4-9g", "https://eur-lex.europa.eu/eli/reg/2016/679/art_4/par_7/oj": "GDPR Art.4-7g", 
"https://enterprivacy.com/wp-content/uploads/2018/09/Categories-of-Personal-Information.pdf": "EnterPrivacy Categories of Personal Information", "https://www.w3.org/community/dpvcg/": "DPVCG", "https://eur-lex.europa.eu/eli/reg/2016/679/art_9/par_1/oj": "GDPR Art.9-1", "https://www.privacycommission.be/nl/model-voor-een-register-van-de-verwerkingsactiviteiten": "Belgian DPA ROPA Template", "https://eur-lex.europa.eu/eli/reg/2016/679/art_4/par_2/oj": "GDPR Art.4-2", "https://www.specialprivacy.eu/vocabs/processing": "SPECIAL Project", "https://eur-lex.europa.eu/eli/reg/2016/679/art_4/par_5/oj": "GDPR Art.4-5", "https://www.iso.org/iso-31000-risk-management.html": "ISO 31000", "https://eur-lex.europa.eu/eli/reg/2016/679/art_4/par_8/oj": "GDPR Art.4-8", "https://eur-lex.europa.eu/eli/reg/2016/679/art_4/par_10/oj": "GDPR Art.4-10", "https://eur-lex.europa.eu/eli/reg/2016/679/art_37/oj": "GDPR Art.37", "https://eur-lex.europa.eu/eli/reg/2016/679/art_27/oj": "GDPR Art.27", "https://edpb.europa.eu/our-work-tools/our-documents/recommendations/recommendations-012020-measures-supplement-transfer_en": "EDPB Recommendations 01/2020 on Data Transfers", "https://eur-lex.europa.eu/eli/reg/2016/679/art_6/par_1/pnt_a/oj": "GDPR Art.6-1a", "https://eur-lex.europa.eu/eli/reg/2016/679/art_6/par_1/pnt_b/oj": "GDPR Art.6-1b", "https://eur-lex.europa.eu/eli/reg/2016/679/art_6/par_1/pnt_c/oj": "GDPR Art.6-1c", "https://eur-lex.europa.eu/eli/reg/2016/679/art_6/par_1/pnt_d/oj": "GDPR Art.6-1d", "https://eur-lex.europa.eu/eli/reg/2016/679/art_6/par_1/pnt_e/oj": "GDPR Art.6-1e", "https://eur-lex.europa.eu/eli/reg/2016/679/art_6/par_1/pnt_f/oj": "GDPR Art.6-1f", "https://eur-lex.europa.eu/eli/reg/2016/679/art_9/par_2/pnt_a/oj": "GDPR Art.9-2a", "https://eur-lex.europa.eu/eli/reg/2016/679/art_9/par_2/pnt_b/oj": "GDPR Art.9-2b", "https://eur-lex.europa.eu/eli/reg/2016/679/art_9/par_2/pnt_c/oj": "GDPR Art.9-2c", "https://eur-lex.europa.eu/eli/reg/2016/679/art_9/par_2/pnt_d/oj": "GDPR Art.9-2d", 
"https://eur-lex.europa.eu/eli/reg/2016/679/art_9/par_2/pnt_e/oj": "GDPR Art.9-2e", "https://eur-lex.europa.eu/eli/reg/2016/679/art_9/par_2/pnt_f/oj": "GDPR Art.9-2f", "https://eur-lex.europa.eu/eli/reg/2016/679/art_9/par_2/pnt_g/oj": "GDPR Art.9-2g", "https://eur-lex.europa.eu/eli/reg/2016/679/art_9/par_2/pnt_h/oj": "GDPR Art.9-2h", "https://eur-lex.europa.eu/eli/reg/2016/679/art_9/par_2/pnt_i/oj": "GDPR Art.9-2i", "https://eur-lex.europa.eu/eli/reg/2016/679/art_9/par_2/pnt_j/oj": "GDPR Art.9-2j", "https://eur-lex.europa.eu/eli/reg/2016/679/art_45/par_3/oj": "GDPR Art.45-3", "https://eur-lex.europa.eu/eli/reg/2016/679/art_46/par_2/pnt_a/oj": "GDPR Art.46-2a", "https://eur-lex.europa.eu/eli/reg/2016/679/art_46/par_2/pnt_b/oj": "GDPR Art.46-2b", "https://eur-lex.europa.eu/eli/reg/2016/679/art_46/par_2/pnt_c/oj": "GDPR Art.46-2c", "https://eur-lex.europa.eu/eli/reg/2016/679/art_46/par_2/pnt_d/oj": "GDPR Art.46-2d", "https://eur-lex.europa.eu/eli/reg/2016/679/art_46/par_2/pnt_e/oj": "GDPR Art.46-2e", "https://eur-lex.europa.eu/eli/reg/2016/679/art_46/par_2/pnt_f/oj": "GDPR Art.46-2f", "https://eur-lex.europa.eu/eli/reg/2016/679/art_46/par_3/pnt_a/oj": "GDPR Art.46-3a", "https://eur-lex.europa.eu/eli/reg/2016/679/art_46/par_3/pnt_b/oj": "GDPR Art.46-3b", "https://eur-lex.europa.eu/eli/reg/2016/679/art_49/par_1/pnt_a/oj": "GDPR Art.49-1a", "https://eur-lex.europa.eu/eli/reg/2016/679/art_49/par_1/pnt_b/oj": "GDPR Art.49-1b", "https://eur-lex.europa.eu/eli/reg/2016/679/art_49/par_1/pnt_c/oj": "GDPR Art.49-1c", "https://eur-lex.europa.eu/eli/reg/2016/679/art_49/par_1/pnt_d/oj": "GDPR Art.49-1d", "https://eur-lex.europa.eu/eli/reg/2016/679/art_49/par_1/pnt_e/oj": "GDPR Art.49-1e", "https://eur-lex.europa.eu/eli/reg/2016/679/art_49/par_1/pnt_f/oj": "GDPR Art.49-1f", "https://eur-lex.europa.eu/eli/reg/2016/679/art_49/par_1/pnt_g/oj": "GDPR Art.49-1g", "https://eur-lex.europa.eu/eli/reg/2016/679/art_49/par_2/oj": "GDPR Art.49-2", 
"https://eur-lex.europa.eu/eli/reg/2016/679/art_13/oj": "GDPR Art.13", "https://eur-lex.europa.eu/eli/reg/2016/679/art_14/oj": "GDPR Art.14", "https://eur-lex.europa.eu/eli/reg/2016/679/art_15/oj": "GDPR Art.15", "https://eur-lex.europa.eu/eli/reg/2016/679/art_16/oj": "GDPR Art.16", "https://eur-lex.europa.eu/eli/reg/2016/679/art_17/oj": "GDPR Art.17", "https://eur-lex.europa.eu/eli/reg/2016/679/art_18/oj": "GDPR Art.18", "https://eur-lex.europa.eu/eli/reg/2016/679/art_19/oj": "GDPR Art.19", "https://eur-lex.europa.eu/eli/reg/2016/679/art_20/oj": "GDPR Art.20", "https://eur-lex.europa.eu/eli/reg/2016/679/art_21/oj": "GDPR Art.21", "https://eur-lex.europa.eu/eli/reg/2016/679/art_22/oj": "GDPR Art.22", "https://eur-lex.europa.eu/eli/reg/2016/679/art_7/par_3/oj": "GDPR Art.7-3", "https://eur-lex.europa.eu/eli/reg/2016/679/art_77/oj": "GDPR Art.77", "https://edpb.europa.eu/system/files/2021-06/edpb_recommendations_202001vo.2.0_supplementarymeasurestransferstools_en.pdf": "EDPB Recommendations 01/2020 on Supplementary Measures and Transfer Tools", "https://eur-lex.europa.eu/eli/reg/2016/679/art_4/par_20/oj": "GDPR Art.4-20", "https://eur-lex.europa.eu/eli/reg/2016/679/art_46/pnt_c/oj": "GDPR Art.46", "https://edpb.europa.eu/sites/default/files/consultation/edpb_recommendations_202001_supplementarymeasurestransferstools_en.pdf": "EDPB Recommendations 01/2020 on Supplementary Measures and Transfer Tools", "https://eur-lex.europa.eu/eli/dec_impl/2021/914/oj": "Implementing Decision on SCC for Data Transfers"} \ No newline at end of file +{"https://eur-lex.europa.eu/eli/reg/2016/679/art_4/par_1/oj": "GDPR Art.4-1g", "https://www.specialprivacy.eu/": "SPECIAL Project", "https://eur-lex.europa.eu/eli/reg/2016/679/art_4/par_9/oj": "GDPR Art.4-9g", "https://eur-lex.europa.eu/eli/reg/2016/679/art_4/par_7/oj": "GDPR Art.4-7g", "https://www.w3.org/community/dpvcg/": "DPVCG", "https://www.privacycommission.be/nl/model-voor-een-register-van-de-verwerkingsactiviteiten": "Belgian DPA 
ROPA Template", "https://eur-lex.europa.eu/eli/reg/2016/679/art_4/par_2/oj": "GDPR Art.4-2", "https://www.specialprivacy.eu/vocabs/processing": "SPECIAL Project", "https://eur-lex.europa.eu/eli/reg/2016/679/art_4/par_5/oj": "GDPR Art.4-5", "https://www.iso.org/iso-31000-risk-management.html": "ISO 31000", "https://eur-lex.europa.eu/eli/reg/2016/679/art_4/par_8/oj": "GDPR Art.4-8", "https://eur-lex.europa.eu/eli/reg/2016/679/art_4/par_10/oj": "GDPR Art.4-10", "https://edpb.europa.eu/our-work-tools/our-documents/recommendations/recommendations-012020-measures-supplement-transfer_en": "EDPB Recommendations 01/2020 on Data Transfers", "https://eur-lex.europa.eu/eli/reg/2016/679/art_27/oj": "GDPR Art.27", "http://purl.org/adms": "ADMS controlled vocabulary", "https://eur-lex.europa.eu/eli/reg/2016/679/art_6/par_1/pnt_a/oj": "GDPR Art.6-1a", "https://eur-lex.europa.eu/eli/reg/2016/679/art_6/par_1/pnt_b/oj": "GDPR Art.6-1b", "https://eur-lex.europa.eu/eli/reg/2016/679/art_6/par_1/pnt_c/oj": "GDPR Art.6-1c", "https://eur-lex.europa.eu/eli/reg/2016/679/art_6/par_1/pnt_d/oj": "GDPR Art.6-1d", "https://eur-lex.europa.eu/eli/reg/2016/679/art_6/par_1/pnt_e/oj": "GDPR Art.6-1e", "https://eur-lex.europa.eu/eli/reg/2016/679/art_6/par_1/pnt_f/oj": "GDPR Art.6-1f", "https://eur-lex.europa.eu/eli/reg/2016/679/art_9/par_2/pnt_a/oj": "GDPR Art.9-2a", "https://eur-lex.europa.eu/eli/reg/2016/679/art_9/par_2/pnt_b/oj": "GDPR Art.9-2b", "https://eur-lex.europa.eu/eli/reg/2016/679/art_9/par_2/pnt_c/oj": "GDPR Art.9-2c", "https://eur-lex.europa.eu/eli/reg/2016/679/art_9/par_2/pnt_d/oj": "GDPR Art.9-2d", "https://eur-lex.europa.eu/eli/reg/2016/679/art_9/par_2/pnt_e/oj": "GDPR Art.9-2e", "https://eur-lex.europa.eu/eli/reg/2016/679/art_9/par_2/pnt_f/oj": "GDPR Art.9-2f", "https://eur-lex.europa.eu/eli/reg/2016/679/art_9/par_2/pnt_g/oj": "GDPR Art.9-2g", "https://eur-lex.europa.eu/eli/reg/2016/679/art_9/par_2/pnt_h/oj": "GDPR Art.9-2h", 
"https://eur-lex.europa.eu/eli/reg/2016/679/art_9/par_2/pnt_i/oj": "GDPR Art.9-2i", "https://eur-lex.europa.eu/eli/reg/2016/679/art_9/par_2/pnt_j/oj": "GDPR Art.9-2j", "https://eur-lex.europa.eu/eli/reg/2016/679/art_45/par_3/oj": "GDPR Art.45-3", "https://eur-lex.europa.eu/eli/reg/2016/679/art_46/par_2/pnt_a/oj": "GDPR Art.46-2a", "https://eur-lex.europa.eu/eli/reg/2016/679/art_46/par_2/pnt_b/oj": "GDPR Art.46-2b", "https://eur-lex.europa.eu/eli/reg/2016/679/art_46/par_2/pnt_c/oj": "GDPR Art.46-2c", "https://eur-lex.europa.eu/eli/reg/2016/679/art_46/par_2/pnt_d/oj": "GDPR Art.46-2d", "https://eur-lex.europa.eu/eli/reg/2016/679/art_46/par_2/pnt_e/oj": "GDPR Art.46-2e", "https://eur-lex.europa.eu/eli/reg/2016/679/art_46/par_2/pnt_f/oj": "GDPR Art.46-2f", "https://eur-lex.europa.eu/eli/reg/2016/679/art_46/par_3/pnt_a/oj": "GDPR Art.46-3a", "https://eur-lex.europa.eu/eli/reg/2016/679/art_46/par_3/pnt_b/oj": "GDPR Art.46-3b", "https://eur-lex.europa.eu/eli/reg/2016/679/art_49/par_1/pnt_a/oj": "GDPR Art.49-1a", "https://eur-lex.europa.eu/eli/reg/2016/679/art_49/par_1/pnt_b/oj": "GDPR Art.49-1b", "https://eur-lex.europa.eu/eli/reg/2016/679/art_49/par_1/pnt_c/oj": "GDPR Art.49-1c", "https://eur-lex.europa.eu/eli/reg/2016/679/art_49/par_1/pnt_d/oj": "GDPR Art.49-1d", "https://eur-lex.europa.eu/eli/reg/2016/679/art_49/par_1/pnt_e/oj": "GDPR Art.49-1e", "https://eur-lex.europa.eu/eli/reg/2016/679/art_49/par_1/pnt_f/oj": "GDPR Art.49-1f", "https://eur-lex.europa.eu/eli/reg/2016/679/art_49/par_1/pnt_g/oj": "GDPR Art.49-1g", "https://eur-lex.europa.eu/eli/reg/2016/679/art_49/par_2/oj": "GDPR Art.49-2", "https://eur-lex.europa.eu/eli/reg/2016/679/art_13/oj": "GDPR Art.13", "https://eur-lex.europa.eu/eli/reg/2016/679/art_14/oj": "GDPR Art.14", "https://eur-lex.europa.eu/eli/reg/2016/679/art_15/oj": "GDPR Art.15", "https://eur-lex.europa.eu/eli/reg/2016/679/art_16/oj": "GDPR Art.16", "https://eur-lex.europa.eu/eli/reg/2016/679/art_17/oj": "GDPR Art.17", 
"https://eur-lex.europa.eu/eli/reg/2016/679/art_18/oj": "GDPR Art.18", "https://eur-lex.europa.eu/eli/reg/2016/679/art_19/oj": "GDPR Art.19", "https://eur-lex.europa.eu/eli/reg/2016/679/art_20/oj": "GDPR Art.20", "https://eur-lex.europa.eu/eli/reg/2016/679/art_21/oj": "GDPR Art.21", "https://eur-lex.europa.eu/eli/reg/2016/679/art_22/oj": "GDPR Art.22", "https://eur-lex.europa.eu/eli/reg/2016/679/art_7/par_3/oj": "GDPR Art.7-3", "https://eur-lex.europa.eu/eli/reg/2016/679/art_77/oj": "GDPR Art.77", "https://edpb.europa.eu/system/files/2021-06/edpb_recommendations_202001vo.2.0_supplementarymeasurestransferstools_en.pdf": "EDPB Recommendations 01/2020 on Supplementary Measures and Transfer Tools", "https://eur-lex.europa.eu/eli/reg/2016/679/art_4/par_20/oj": "GDPR Art.4-20", "https://eur-lex.europa.eu/eli/reg/2016/679/art_46/pnt_c/oj": "GDPR Art.46", "https://edpb.europa.eu/sites/default/files/consultation/edpb_recommendations_202001_supplementarymeasurestransferstools_en.pdf": "EDPB Recommendations 01/2020 on Supplementary Measures and Transfer Tools", "https://eur-lex.europa.eu/eli/dec_impl/2021/914/oj": "Implementing Decision on SCC for Data Transfers", "https://enterprivacy.com/wp-content/uploads/2018/09/Categories-of-Personal-Information.pdf": "EnterPrivacy Categories of Personal Information"} \ No newline at end of file diff --git a/documentation-generator/jinja2_resources/macro_term_table.jinja2 b/documentation-generator/jinja2_resources/macro_term_table.jinja2 index 6bc665722..5d355506c 100644 --- a/documentation-generator/jinja2_resources/macro_term_table.jinja2 +++ b/documentation-generator/jinja2_resources/macro_term_table.jinja2 @@ -1,32 +1,32 @@ -{% macro table_classes(term_list) %} +{% macro table_classes(term_list, prefix=None) %}

Classes

{% for term in term_list | sort(attribute='iri') %} - {{term.rdfs_label}} | + {{term.skos_prefLabel}}{{" |" if not loop.last }} {% endfor %}

{% for term in term_list | sort(attribute='iri') %} -
-

{{term}}

+
+

{{term.skos_prefLabel}}

- + - - + + - {% if term.rdfs_subClassOf %} + {% if term.dpv_isSubTypeOf %} - + @@ -34,7 +34,7 @@ {% set children = term|subclasses %} {% if children %} - + {% endif %} - {% if term.sw_term_status == "changed" %} - - - - - {% endif %} - {% if term.rdfs_comment %} + {% if term.skos_note %} - - + + {% endif %} - {% if term.rdfs_isDefinedBy %} + {% if term.dct_source %} @@ -88,14 +82,14 @@ {% endif %} - {% if term.rdfs_seeAlso %} + {% if term.skos_related %} @@ -106,26 +100,26 @@ {% endfor %} {% endmacro %} -{% macro table_properties(term_list) %} +{% macro table_properties(term_list, prefix=None) %}

Properties

{% for term in term_list | sort(attribute='iri') %} - {{term.rdfs_label}} | + {{term.skos_prefLabel}} | {% endfor %}

{% for term in term_list | sort(attribute='iri') %} -
-

{{term}}

+
+

{{term.skos_prefLabel}}

Term:{{term.iri|fragment_this}}{% if not prefix %}{{term.iri|fragment_this}}{% else %}{{term.iri|fragment_this}}{% endif %}
Description:{{term.dct_description}}Definition:{{term.skos_definition}}
Subclass Of:SubType of: - {% if term.rdfs_subClassOf is sequence and not term.rdfs_subClassOf is string %}{% for parent in term.rdfs_subClassOf|sort %} + {% if term.dpv_isSubTypeOf is sequence and not term.dpv_isSubTypeOf is string %}{% for parent in term.dpv_isSubTypeOf|sort %} {{parent|prefix_this}}{{", " if not loop.last}} {% endfor %}{% else %} - {{term.rdfs_subClassOf|prefix_this}} + {{term.dpv_isSubTypeOf|prefix_this}} {% endif %}
Superclass Of:SuperType Of: {% for child in children|sort %} {{child|prefix_this}}{{", " if not loop.last}} @@ -42,26 +42,20 @@
Status:{{term.sw_term_status}}
Comment:{{term.rdfs_comment}}Note:{{term.skos_note}}
Source: - {% if term.rdfs_isDefinedBy is sequence and not term.rdfs_isDefinedBy is string %}{% for parent in term.rdfs_isDefinedBy|sort %} + {% if term.dct_source is sequence and not term.dct_source is string %}{% for parent in term.dct_source|sort %} {{parent|saved_label}}{{', ' if not loop.last }} {% endfor %}{% else %} - {{term.rdfs_isDefinedBy|saved_label}} + {{term.dct_source|saved_label}} {% endif %}
See Also: - {% if term.rdfs_seeAlso is sequence and not term.rdfs_seeAlso is string %}{% for link in term.rdfs_seeAlso %} + {% if term.skos_related is sequence and not term.skos_related is string %}{% for link in term.skos_related %} {{link|prefix_this}}{{", " if not loop.last}} {% endfor %}{% else %} - {{term.rdfs_seeAlso|prefix_this}} + {{term.skos_related|prefix_this}} {% endif %}
- + - + {% if term.rdfs_subPropertyOf %} @@ -134,27 +128,35 @@ {% if term.rdfs_subPropertyOf is sequence and not term.rdfs_subPropertyOf is string %}{% for parent in term.rdfs_subPropertyOf|sort %} {{parent|prefix_this}}{{", " if not loop.last}} {% endfor %}{% else %} - {{term.rdfs_subPropertyOf}} + {{term.rdfs_subPropertyOf|prefix_this}} {% endif %} {% endif %} - - - - - {% if term.rdfs_isDefinedBy %} + {% if term.dct_source %} {% endif %} + {% if term.rdfs_domain %} + + + + + {% endif %} + {% if term.rdfs_range %} + + + + + {% endif %} @@ -177,14 +179,14 @@ {% endif %} - {% if term.rdfs_seeAlso %} + {% if term.skos_related %} diff --git a/documentation-generator/jinja2_resources/macro_term_table_owl.jinja2 b/documentation-generator/jinja2_resources/macro_term_table_owl.jinja2 new file mode 100644 index 000000000..b78032f10 --- /dev/null +++ b/documentation-generator/jinja2_resources/macro_term_table_owl.jinja2 @@ -0,0 +1,198 @@ +{% macro table_classes(term_list, prefix=None) %} +

Classes

+

+ {% for term in term_list | sort(attribute='iri') %} + {{term.rdfs_label}}{{" |" if not loop.last }} + {% endfor %} +

+ + {% for term in term_list | sort(attribute='iri') %} +
+

{{term.rdfs_label}}

+
Term:{{term.iri|fragment_this}}{% if not prefix %}{{term.iri|fragment_this}}{% else %}{{term.iri|fragment_this}}{% endif %}
Description:{{term.dct_description}}{{term.skos_definition}}
Status:{{term.sw_term_status}}
Source: - {% if term.rdfs_isDefinedBy is sequence and not term.rdfs_isDefinedBy is string %}{% for parent in term.rdfs_isDefinedBy|sort %} + {% if term.dct_source is sequence and not term.dct_source is string %}{% for parent in term.dct_source|sort %} {{parent|saved_label}}{{', ' if not loop.last }} {% endfor %}{% else %} - {{term.rdfs_isDefinedBy|saved_label}} + {{term.dct_source|saved_label}} {% endif %}
Domain:{{term.rdfs_domain|prefix_this}}
Range:{{term.rdfs_range|prefix_this}}
Created:
See Also: - {% if term.rdfs_seeAlso is sequence and not term.rdfs_seeAlso is string %}{% for link in term.rdfs_seeAlso %} + {% if term.skos_related is sequence and not term.skos_related is string %}{% for link in term.skos_related %} {{link|prefix_this}}{{", " if not loop.last}} {% endfor %}{% else %} - {{term.rdfs_seeAlso|prefix_this}} + {{term.skos_related|prefix_this}} {% endif %}
+ + + + + + + + + + {% if term.rdfs_subClassOf %} + + + + + {% endif %} + {% set children = term|subclasses %} + {% if children %} + + + + + {% endif %} + {% if term.rdfs_comment %} + + + + + {% endif %} + {% if term.dct_source %} + + + + + {% endif %} + + + + + {% if term.dct_modified %} + + + + + {% endif %} + {% if term.dct_creator %} + + + + + {% endif %} + {% if term.rdfs_seeAlso %} + + + + + {% endif %} + +
Term:{% if not prefix %}{{term.iri|fragment_this}}{% else %}{{term.iri|fragment_this}}{% endif %}
Definition:{{term.dct_description}}
SubClass of: + {% if term.rdfs_subClassOf is sequence and not term.rdfs_subClassOf is string %}{% for parent in term.rdfs_subClassOf|sort %} + {{parent|prefix_this}}{{", " if not loop.last}} + {% endfor %}{% else %} + {{term.rdfs_subClassOf|prefix_this}} + {% endif %} +
SuperClass Of: + {% for child in children|sort %} + {{child|prefix_this}}{{", " if not loop.last}} + {% endfor %} +
Note:{{term.rdfs_comment}}
Source: + {% if term.dct_source is sequence and not term.dct_source is string %}{% for parent in term.dct_source|sort %} + {{parent|saved_label}}{{', ' if not loop.last }} + {% endfor %}{% else %} + {{term.dct_source|saved_label}} + {% endif %} +
Created:
Modified:
Contributor(s): + {% if term.dct_creator is sequence and not term.dct_creator is string %}{% for person in term.dct_creator|sort %} + {{person}}{{', ' if not loop.last }} + {% endfor %}{% else %} + {{term.dct_creator}} + {% endif %} +
See Also: + {% if term.rdfs_seeAlso is sequence and not term.rdfs_seeAlso is string %}{% for link in term.rdfs_seeAlso %} + {{link|prefix_this}}{{", " if not loop.last}} + {% endfor %}{% else %} + {{term.rdfs_seeAlso|prefix_this}} + {% endif %} +
+
+ {% endfor %} +{% endmacro %} + +{% macro table_properties(term_list, prefix=None) %} +

Properties

+

+ {% for term in term_list | sort(attribute='iri') %} + {{term.rdfs_label}} | + {% endfor %} +

+ + {% for term in term_list | sort(attribute='iri') %} +
+

{{term.rdfs_label}}

+ + + + + + + + + + + {% if term.rdfs_subPropertyOf %} + + + + + {% endif %} + {% if term.dct_source %} + + + + + {% endif %} + {% if term.rdfs_domain %} + + + + + {% endif %} + {% if term.rdfs_range %} + + + + + {% endif %} + + + + + {% if term.dct_date_approved %} + + + + + {% endif %} + {% if term.dct_creator %} + + + + + {% endif %} + {% if term.rdfs_seeAlso %} + + + + + {% endif %} + +
Term:{% if not prefix %}{{term.iri|fragment_this}}{% else %}{{term.iri|fragment_this}}{% endif %}
Description:{{term.dct_description}}
Sub-Property Of: + {% if term.rdfs_subPropertyOf is sequence and not term.rdfs_subPropertyOf is string %}{% for parent in term.rdfs_subPropertyOf|sort %} + {{parent|prefix_this}}{{", " if not loop.last}} + {% endfor %}{% else %} + {{term.rdfs_subPropertyOf|prefix_this}} + {% endif %} +
Source: + {% if term.dct_source is sequence and not term.dct_source is string %}{% for parent in term.dct_source|sort %} + {{parent|saved_label}}{{', ' if not loop.last }} + {% endfor %}{% else %} + {{term.dct_source|saved_label}} + {% endif %} +
Domain:{{term.rdfs_domain|prefix_this}}
Range:{{term.rdfs_range|prefix_this}}
Created:
Approved:
Contributor(s): + {% if term.dct_creator is sequence and not term.dct_creator is string %}{% for person in term.dct_creator|sort %} + {{person}}{{', ' if not loop.last }} + {% endfor %}{% else %} + {{term.dct_creator}} + {% endif %} +
See Also: + {% if term.rdfs_seeAlso is sequence and not term.rdfs_seeAlso is string %}{% for link in term.rdfs_seeAlso %} + {{link|prefix_this}}{{", " if not loop.last}} + {% endfor %}{% else %} + {{term.rdfs_seeAlso|prefix_this}} + {% endif %} +
+
+ {% endfor %} +{% endmacro %} diff --git a/documentation-generator/jinja2_resources/macro_term_table_skos.jinja2 b/documentation-generator/jinja2_resources/macro_term_table_skos.jinja2 new file mode 100644 index 000000000..dbbbc42fb --- /dev/null +++ b/documentation-generator/jinja2_resources/macro_term_table_skos.jinja2 @@ -0,0 +1,216 @@ +{% macro table_classes(term_list, prefix=None) %} +

Classes

+

+ {% for term in term_list | sort(attribute='iri') %} + {{term.skos_prefLabel}}{{" |" if not loop.last }} + {% endfor %} +

+ + {% for term in term_list | sort(attribute='iri') %} +
+

{{term.skos_prefLabel}}

+ + + + + + + + + + + {% if term.rdfs_subClassOf %} + + + + + {% endif %} + {% for pt in term.rdf_type %}{% if 'dpv' in pt.iri %} + + + + + {% endif %}{% endfor %} + {% if term.skos_broaderTransitive %} + + + + + {% endif %} + {% set children = term|subclasses %} + {% if children %} + + + + + {% endif %} + {% if term.skos_note %} + + + + + {% endif %} + {% if term.dct_source %} + + + + + {% endif %} + + + + + {% if term.dct_modified %} + + + + + {% endif %} + {% if term.dct_creator %} + + + + + {% endif %} + {% if term.skos_related %} + + + + + {% endif %} + +
Term:{% if not prefix %}{{term.iri|fragment_this}}{% else %}{{term.iri|fragment_this}}{% endif %}
Definition:{{term.skos_definition}}
SubClass of: + {% if term.rdfs_subClassOf is sequence and not term.rdfs_subClassOf is string %}{% for parent in term.rdfs_subClassOf|sort %} + {{parent|prefix_this}}{{", " if not loop.last}} + {% endfor %}{% else %} + {{term.rdfs_subClassOf|prefix_this}} + {% endif %} +
Instance of:{% for cls in term.rdf_type %}{% if 'dpv' in cls.iri %}{{cls|prefix_this}}{% endif %}{% endfor %}
Narrower than: + {% if term.skos_broaderTransitive is sequence and not term.skos_broaderTransitive is string %}{% for parent in term.skos_broaderTransitive|sort %} + {{parent|prefix_this}}{{", " if not loop.last}} + {% endfor %}{% else %} + {{term.skos_broaderTransitive|prefix_this}} + {% endif %} +
SuperType Of: + {% for child in children|sort %} + {{child|prefix_this}}{{", " if not loop.last}} + {% endfor %} +
Note:{{term.skos_note}}
Source: + {% if term.dct_source is sequence and not term.dct_source is string %}{% for parent in term.dct_source|sort %} + {{parent|saved_label}}{{', ' if not loop.last }} + {% endfor %}{% else %} + {{term.dct_source|saved_label}} + {% endif %} +
Created:
Modified:
Contributor(s): + {% if term.dct_creator is sequence and not term.dct_creator is string %}{% for person in term.dct_creator|sort %} + {{person}}{{', ' if not loop.last }} + {% endfor %}{% else %} + {{term.dct_creator}} + {% endif %} +
See Also: + {% if term.skos_related is sequence and not term.skos_related is string %}{% for link in term.skos_related %} + {{link|prefix_this}}{{", " if not loop.last}} + {% endfor %}{% else %} + {{term.skos_related|prefix_this}} + {% endif %} +
+
+ {% endfor %} +{% endmacro %} + +{% macro table_properties(term_list, prefix=None) %} +

Properties

+

+ {% for term in term_list | sort(attribute='iri') %} + {{term.skos_prefLabel}} | + {% endfor %} +

+ + {% for term in term_list | sort(attribute='iri') %} +
+

{{term.skos_prefLabel}}

+ + + + + + + + + + + {% if term.rdfs_subPropertyOf %} + + + + + {% endif %} + {% if term.dct_source %} + + + + + {% endif %} + {% if term.rdfs_domain %} + + + + + {% endif %} + {% if term.rdfs_range %} + + + + + {% endif %} + + + + + {% if term.dct_date_approved %} + + + + + {% endif %} + {% if term.dct_creator %} + + + + + {% endif %} + {% if term.skos_related %} + + + + + {% endif %} + +
Term:{% if not prefix %}{{term.iri|fragment_this}}{% else %}{{term.iri|fragment_this}}{% endif %}
Description:{{term.skos_definition}}
Sub-Property Of: + {% if term.rdfs_subPropertyOf is sequence and not term.rdfs_subPropertyOf is string %}{% for parent in term.rdfs_subPropertyOf|sort %} + {{parent|prefix_this}}{{", " if not loop.last}} + {% endfor %}{% else %} + {{term.rdfs_subPropertyOf|prefix_this}} + {% endif %} +
Source: + {% if term.dct_source is sequence and not term.dct_source is string %}{% for parent in term.dct_source|sort %} + {{parent|saved_label}}{{', ' if not loop.last }} + {% endfor %}{% else %} + {{term.dct_source|saved_label}} + {% endif %} +
Domain:{{term.rdfs_domain|prefix_this}}
Range:{{term.rdfs_range|prefix_this}}
Created:
Approved:
Contributor(s): + {% if term.dct_creator is sequence and not term.dct_creator is string %}{% for person in term.dct_creator|sort %} + {{person}}{{', ' if not loop.last }} + {% endfor %}{% else %} + {{term.dct_creator}} + {% endif %} +
See Also: + {% if term.skos_related is sequence and not term.skos_related is string %}{% for link in term.skos_related %} + {{link|prefix_this}}{{", " if not loop.last}} + {% endfor %}{% else %} + {{term.skos_related|prefix_this}} + {% endif %} +
+
+ {% endfor %} +{% endmacro %} diff --git a/documentation-generator/jinja2_resources/template_dpv.jinja2 b/documentation-generator/jinja2_resources/template_dpv.jinja2 index e6dac2a86..a921c59d6 100644 --- a/documentation-generator/jinja2_resources/template_dpv.jinja2 +++ b/documentation-generator/jinja2_resources/template_dpv.jinja2 @@ -11,34 +11,18 @@ var respecConfig = { shortName: "dpv", title: "Data Privacy Vocabulary (DPV)", - subtitle: "version 0.3", + subtitle: "version 0.4 (published 2022-02-15)", specStatus: "CG-DRAFT", group: "dpvcg", - edDraftURI: "https://w3.org/ns/dpv", - latestVersion: "https://w3.org/ns/dpv", - github: { - repoURL: "https://github.com/w3c/dpv", - branch: "master" - }, + latestVersion: "https://w3id.org/dpv", + // github: { + // repoURL: "https://github.com/w3c/dpv", + // branch: "master" + // }, subjectPrefix: "[dpv]", doJsonLd: true, - canonicalUri: "https://w3.org/ns/dpv", - otherLinks: [ - { - key: "Version History", - data: [ - { value: "Changelog", href: 'changelog.html' }, - { - value: "v0.3 (this version)", - href: "https://w3.org/ns/dpv/versions/v0.3" - }, - { - value: "v0.2 (previous version)", - href: "https://w3.org/ns/dpv/versions/v0.2" - } - ] - } - ], + lint: { "no-unused-dfns": false }, + canonicalUri: "https://w3id.org/dpv", editors: [ { name: "Harshvardhan J. Pandit", @@ -100,6 +84,10 @@ "name": "Javier D. 
Fernández", "company": "Vienna University of Economics and Business" }, + { + "name": "Julian Flake", + "company": "University of Koblenz-Landau", + }, { "name": "Mark Lizar", "company": "OpenConsent/Kantara Initiative" @@ -130,6 +118,70 @@ } ], localBiblio: { + "DPV": { + href: "https://www.w3id.org/dpv", + title: "Data Privacy Vocabulary (DPV)" + }, + "DPV-GDPR": { + href: "https://www.w3id.org/dpv/dpv-gdpr", + title: "GDPR Extension for Data Privacy Vocabulary (DPV-GDPR)" + }, + "DPV-NACE": { + href: "https://www.w3id.org/dpv/dpv-nace", + title: "NACE Taxonomy serialised in RDFS" + }, + "DPV-PD": { + href: "https://www.w3id.org/dpv/pd", + title: "Personal Data Categories Extension for Data Privacy Vocabulary (DPV-PD)" + }, + "DPV-SKOS": { + href: "https://www.w3id.org/dpv/dpv-skos", + title: "Data Privacy Vocabulary serialised using SKOS+RDFS (DPV-SKOS)" + }, + "DPV-OWL": { + href: "https://www.w3id.org/dpv/dpv-owl", + title: "Data Privacy Vocabulary serialised using OWL2 (DPV-OWL)" + }, + "DPV-Primer": { + href: "https://www.w3id.org/dpv/dpv-primer", + title: "Primer for Data Privacy Vocabulary" + }, + "DPVCG": { + href: "https://www.w3.org/community/dpvcg/", + title: "W3C Data Privacy Vocabularies and Controls Community Group (DPVCG)" + }, + "GDPR": { + href: "https://eur-lex.europa.eu/eli/reg/2016/679/oj", + title: "General Data Protection Regulation (GDPR)" + }, + "SPECIAL": { + href: "https://www.specialprivacy.eu/", + title: "SPECIAL H2020 Project" + }, + "RDF": { + href: "https://www.w3.org/TR/rdf11-concepts/", + title: "RDF 1.1 Concepts and Abstract Syntax" + }, + "RDFS": { + href: "https://www.w3.org/TR/rdf-schema/", + title: "RDF Schema 1.1" + }, + "OWL" :{ + href: "https://www.w3.org/TR/owl2-overview/", + title: "OWL 2 Web Ontology Language Document Overview (Second Edition)" + }, + "SKOS": { + href: "https://www.w3.org/TR/skos-reference/", + title: "SKOS Simple Knowledge Organization System" + }, + "DPV-ADOPTION": { + href: 
"https://www.w3.org/community/dpvcg/wiki/Adoption_of_DPVCG", + title: "Listing of DPV adoption and applications" + }, + "DPV GUIDES": { + href: "https://w3id.org/dpv/guides", + title: "Guidelines for Adoption and Use of DPV" + }, "NACE": { href: "https://ec.europa.eu/eurostat/ramon/nomenclatures/index.cfm?TargetUrl=LST_NOM_DTL&StrNom=NACE_REV2", title: "Statistical Classification of Economic Activities in the European Community (NACE)" @@ -153,6 +205,14 @@ "EnterPrivacy": { href: "https://enterprivacy.com/wp-content/uploads/2018/09/Categories-of-Personal-Information.pdf", title: "Taxonomy of Personal Data Categories" + }, + "TIME": { + href: "https://www.w3.org/TR/owl-time/", + title: "Time Ontology in OWL" + }, + "ISO-27017": { + href: "https://www.iso.org/standard/43757.html", + title: "ISO/IEC 27017:2015 Information technology — Security techniques — Code of practice for information security controls based on ISO/IEC 27002 for cloud services" } } }; @@ -225,208 +285,95 @@ -
-

The Data Privacy Vocabulary (DPV) provides terms (classes and properties) to describe and represent information related to processing of personal data based on established requirements such as for the EU General Data Protection Regulation (GDPR). The DPV is structured as a top-down hierarchical vocabulary with the core or base concepts of personal data categories, purposes of processing and types of processing, data controller(s) associated, recipients of personal data, legal bases or justifications used, technical and organisational measures and restrictions (e.g. storage locations and storage durations), applicable rights, and the risks involved.

-

- The namespace for DPV terms is http://www.w3.org/ns/dpv#
- The suggested prefix for the DPV namespace is dpv
- The DPV and its documentation is available on GitHub.

-
-
- -

This document is published by the Data Privacy Vocabularies and Controls Community Group (DPVCG) as a deliverable and report of its work in creating and maintaining the Data Privacy Vocabulary (DPV).

-
-

Contributing to the DPV and its extensions The DPVCG welcomes participation regarding the DPV - including expansion or refinement of its terms and addressing of open issues - and welcomes suggestions on their resolution or mitigation.

-

While we welcome participation via any and all mediums - e.g., via Github pull requests or issues, emails, papers, or reports - the formal resolution of contributions takes place only through the DPVCG meeting calls and mailing lists. We therefore suggest joining the group to participate in these discussions for formal approval.

-

For contributions to the DPV, please see the section on GitHub. The current list of open issues and their discussions to date can be found at GitHub issues.

-
-
-
-

Introduction

-

The Data Privacy Vocabulary provides terms (classes and properties) to describe and represent information about personal data handling. In particular, the vocabulary provides extensible taxonomies of terms to describe the following components:

+
+

The Data Privacy Vocabulary [[DPV]] enables expressing machine-readable metadata about the use and processing of personal data based on legislative requirements such as the General Data Protection Regulation [[GDPR]]. This document describes the DPV specification along with its data model.

+

The canonical URL for DPV is https://w3id.org/dpv# which contains (this) specification. The namespace for DPV terms is https://w3id.org/dpv#, the suggested prefix is dpv, and this document along with its various serialisations is available on GitHub. +
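As a brief sketch, a document using DPV terms would bind the suggested prefix in Turtle as follows (the subject IRI is a hypothetical example):

```turtle
@prefix dpv: <https://w3id.org/dpv#> .

# refer to DPV concepts via the suggested prefix
<https://example.com/policy> a dpv:PersonalDataHandling .
```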

Newcomers to the DPV are strongly recommended to first read through the [[[DPV-Primer]]] to familiarise themselves with its semantics and concepts, and their place within the DPV.

+ +

DPV Family of Documents

    -
  • Personal Data Categories
  • -
  • Purposes
  • -
  • Processing Categories
  • -
  • Technical and Organisational Measures
  • -
  • Legal Basis such as Consent
  • -
  • Entities such as Recipients, Data Controllers, Data Subjects
  • -
  • Rights
  • -
  • Risks
  • +
  • [[DPV-Primer]]: The Primer serves as an introductory document to DPV and provides an overview of its concepts.
  • +
  • [[DPV]]: (This document) The DPV Specification is the formal and normative description of DPV and its concepts. It provides a serialisation of the concepts as a taxonomy using SKOS.
  • +
  • Extensions to Concepts:
      +
    • [[DPV-GDPR]]: Extension to the DPV providing concepts relevant to [[GDPR]].
    • +
    • [[DPV-PD]]: Extension to the DPV providing a taxonomy of personal data categories.
  • +
  • Serialisations of DPV:
      +
    • [[DPV-SKOS]]: A serialisation of the DPV using [[RDFS]] and [[SKOS]] to enable its use as a schema or ontology.
    • +
  • [[DPV-OWL]]: A serialisation of the DPV using [[OWL]] to enable its use as an ontology.
  • +
  • [[DPV-NACE]]: [[NACE]] taxonomy serialised in RDFS
-

These terms are intended to represent personal data handling as machine-readable information by specifying the personal data categories undergoing processing, its purpose(s), the data controller(s) involved, the recipient(s) of this data, the legal bases or justifications used (e.g. consent or legitimate interest), the technical and organisational measures and restrictions involved (e.g. storage location and storage duration), the applicable rights, and the possibility of risks.

-
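A non-normative sketch of such a machine-readable description in Turtle (the ex: names are hypothetical; the dpv: and dpv-pd: concept names are indicative of those provided by DPV and its personal data extension):

```turtle
@prefix dpv:    <https://w3id.org/dpv#> .
@prefix dpv-pd: <https://w3id.org/dpv/dpv-pd#> .
@prefix ex:     <https://example.com/ns#> .

ex:NewsletterHandling a dpv:PersonalDataHandling ;
    dpv:hasDataController ex:ACME ;                 # who is responsible
    dpv:hasPersonalData   dpv-pd:EmailAddress ;     # what data is processed
    dpv:hasProcessing     dpv:Collect, dpv:Store ;  # how it is processed
    dpv:hasPurpose        ex:SendNewsletter ;       # why it is processed
    dpv:hasLegalBasis     dpv:Consent .             # legal justification
```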

As some concepts, e.g. Legal Bases, are defined by and dependent on legal jurisdictions - these are provided as separate 'extensions' of the DPV. The DPV itself is intended to be a 'general vocabulary', and the extensions expand it for specific jurisdictions, domains, and use-case requirements. The DPV-GDPR extension models concepts such as legal bases and rights provided by the GDPR.

-

Examples of applications where the concepts provided by the DPV can be used are:

-
    -
  1. represent policies for personal data handling
  2. -
  3. represent information about consent i.e. what it is about
  4. -
  5. log/document personal data handling actions e.g. activities of data controller
  6. -
  7. represent input data for automated checking of legal compliance
  8. -
-
-
-

Namespaces

-

The namespace for DPV vocabulary is http://www.w3.org/ns/dpv#. The table below indicates the full list of namespaces and prefixes used in this document.

+

The table provides an overview of the expression of concepts across the three DPV serialisations.

- - - - + + + + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + + + - - + + + - - + + + - - + + + - - + + + - - + + +
PrefixNamespace
Concept[[DPV]][[DPV-SKOS]][[DPV-OWL]]
- dct - - http://purl.org/dc/terms/ -
- dpv - - http://www.w3.org/ns/dpv# -
- dpv-gdpr - - http://www.w3.org/ns/dpv-gdpr# -
- dpv-nace - - http://www.w3.org/ns/dpv-nace# -
- odrl - - http://www.w3.org/ns/odrl/2/ -
- owl - - http://www.w3.org/2002/07/owl# -
- rdf - - http://www.w3.org/1999/02/22-rdf-syntax-ns# -
- rdfs - - http://www.w3.org/2000/01/rdf-schema# -
- skos - - http://www.w3.org/2004/02/skos/core# -
- spl - - http://www.specialprivacy.eu/langs/usage-policy# -
- svd - - http://www.specialprivacy.eu/vocabs/data# -
- svdu - - http://www.specialprivacy.eu/vocabs/duration# - Conceptdpv:Concept + skos:Conceptowl:Class
- svl - - http://www.specialprivacy.eu/vocabs/locations# - is subtype ofdpv:isSubTypeOf + skos:broaderTransitiverdfs:subClassOf
- svpu - - http://www.specialprivacy.eu/vocabs/purposes# - is instance ofdpv:isInstanceOf + rdf:typerdf:type
- svpr - - http://www.specialprivacy.eu/vocabs/processing# - has conceptdpv:Relation + rdf:Propertyowl:ObjectProperty
- svr - - http://www.specialprivacy.eu/vocabs/recipients - relationship domaindpv:domain + rdfs:domainrdfs:domain
- xsd - - http://www.w3.org/2001/XMLSchema# - relationship rangedpv:range + rdfs:rangerdfs:range
-
- -
-

The DPV, as it is provided, does not recommend any specific way to use its concepts. Adopters are free to utilise their preferred models (e.g. RDFS-style, OWL2-style, or simply as a list of terms), though we strongly recommend being aware of the implications of using a specific model regarding interpretation, reasoning, and interoperability. There is ongoing work to document 'flavours' for using DPV as an RDFS, OWL2, or SKOS vocabulary.

-
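To illustrate the difference between the flavours, the same subtype relation between two concepts would be expressed differently in each (a sketch; the ex: concept is hypothetical, and namespaces are simplified since each serialisation of DPV defines its own namespace):

```turtle
@prefix dpv:  <https://w3id.org/dpv#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex:   <https://example.com/ns#> .

# DPV (taxonomy of concepts):
ex:DirectMarketing dpv:isSubTypeOf dpv:Marketing .

# DPV-SKOS (SKOS concept scheme):
ex:DirectMarketing skos:broaderTransitive dpv:Marketing .

# DPV-OWL (OWL2 ontology):
ex:DirectMarketing rdfs:subClassOf dpv:Marketing .
```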
- +

Related Links

+ +
+
+ +

This document is published by the Data Privacy Vocabularies and Controls Community Group (DPVCG) as a deliverable and report of its work in creating and maintaining the Data Privacy Vocabulary (DPV).

+
+

Contributing to the DPV and its extensions The DPVCG welcomes participation regarding the DPV - including expansion or refinement of its terms and addressing of open issues - and welcomes suggestions on their resolution or mitigation.

+

For contributions to the DPV, please see the section on GitHub. The current list of open issues and their discussions to date can be found at GitHub issues.

+
+

Serialisations of DPV and its modules are available on GitHub.

+

Base Vocabulary

-

Concepts in the Base vocabulary are available as an individual module here.

DPV base vocabulary
Base Vocabulary
-

The 'Base' or 'Core' vocabulary describes the top-level classes required for defining a 'policy' for personal data handling. Classes and properties for each top-level class are further elaborated using sub-vocabularies, for example a taxonomy of personal data categories. While all concepts within the vocabulary share a single namespace, the modular approach makes it possible to use only the specific taxonomies or sub-vocabularies, for example to refer only to purposes. The DPV provides the following as top-level concepts and generic properties to associate them with other concepts:

+

The 'Base' or 'Core' concepts in DPV represent the most relevant concepts for representing information regarding the what, how, where, who, and why of personal data and its processing. Each of these concepts is further elaborated as a hierarchical taxonomy. The DPV provides the following 'top-level' concepts and relations to associate them with other concepts:

@@ -437,8 +384,8 @@ - - + + @@ -477,7 +424,7 @@ - + @@ -486,226 +433,63 @@ + + + + +
[=PersonalDataCategory=][=hasPersonalDataCategory=][=PersonalData=][=hasPersonalData=] Personal data categories
Legal bases or justifications for processing
[=Right=] and [=DataSubjectRight=][=Right=] [=hasRight=] Rights applicable or provided
[=hasRisk=] Risks applicable or probable regarding processing
[=PersonalDataHandling=][=hasPersonalDataHandling=]A concept for associating the other core concepts as a 'group', 'policy', or 'set' - so as to express different use-cases and combinations
+

DPV provides taxonomies for all core concepts except the ones specified below:

-

Along with these, the DPV defines the concept of [=PersonalDataHandling=] for representing a 'policy' associating the top-level concepts with one another. For example, using [=PersonalDataHandling=] it is possible to indicate the application of a specific purpose and processing to categories of personal data relating to data subjects (or individuals), along with the data controller responsible, the recipients of data, the legal basis used, the technical and organisational measures involved, the rights provided, and the possibility of risks.

-

The DPV does not mandate the use of [=PersonalDataHandling=]. Adopters can define their own interpretation of what concepts personal data handling involves, or define a separate concept similar to [=PersonalDataHandling=]. For example, one may specify that a [=PersonalDataHandling=] is only associated with [=Purpose=], [=Processing=], [=PersonalDataCategory=], and [=Recipient=], where the legal basis and technical and organisational measures are either assumed or defined externally. Similarly, the DPV does not place any constraints on the inclusion or exclusion of concepts used to define an instance of [=PersonalDataHandling=]. Possibilities for specifying such constraints include the use of OWL2 semantics and SHACL to specify mandatory concepts - for example, requiring that every instance of [=PersonalDataHandling=] has at least one Personal Data Category, Controller, Purpose, and Legal Basis.
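As a sketch of the SHACL option mentioned above (the shape and its IRI are hypothetical, not part of DPV):

```turtle
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix dpv: <https://w3id.org/dpv#> .
@prefix ex:  <https://example.com/shapes#> .

# Require every PersonalDataHandling instance to declare at least one
# personal data category, controller, purpose, and legal basis.
ex:PersonalDataHandlingShape a sh:NodeShape ;
    sh:targetClass dpv:PersonalDataHandling ;
    sh:property [ sh:path dpv:hasPersonalData ;   sh:minCount 1 ] ;
    sh:property [ sh:path dpv:hasDataController ; sh:minCount 1 ] ;
    sh:property [ sh:path dpv:hasPurpose ;        sh:minCount 1 ] ;
    sh:property [ sh:path dpv:hasLegalBasis ;     sh:minCount 1 ] .
```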

+ {% if core_classes %}
- {{ table_classes(core_classes) }} + {{ table_classes(core_classes, 'base') }}
{% endif %} {% if core_properties %}
- {{ table_properties(core_properties) }} + {{ table_properties(core_properties, 'base') }}
{% endif %}
-
-

Personal Data Categories

-

Concepts related to Personal Data Categories are available as an individual module here.

-

DPV provides broad top-level personal data categories adapted from the taxonomy contributed by R. Jason Cronk [[EnterPrivacy]]. The top-level concepts in this taxonomy refer to the nature of information (financial, social, tracking) and to its inherent source (internal, external). Each top-level concept is represented in the DPV vocabulary as a Class, and is further elaborated by subclasses for referring to specific categories of information - such as preferences or demographics.

-
- Concepts for Personal Data Categories in DPV -
Concepts for Personal Data Categories in DPV
-
-

While this taxonomy is by no means exhaustive, the aim is to provide sufficient coverage of abstract categories of personal data, which can be extended using the subclass mechanism to represent concepts used in the real world.

-

Regulations such as the GDPR define personal data at an abstract level as information (directly or indirectly) associated with an individual or a category of individuals. DPV defines classes representing categories that are inclusive of information associated with them (e.g. name consisting of first name, last name), with their instances representing specific elements of personal data (e.g. “John Doe” as name).

-

The categories defined in the personal data categories taxonomy can be used directly, further extended by subclassing the respective classes to depict specialised concepts such as “likes regarding movies”, or combined with other classes to indicate specific contexts. The class [=DerivedPersonalData=] provides one such context to indicate information has been derived from existing information, e.g. inference of opinions from social media. Additional classes can be defined to specify contexts such as use of machine learning, accuracy, and source. Similarly, the class [=SpecialCategoryPersonalData=] represents categories that are ‘special’ or ‘sensitive’ and require additional conditions, e.g. as per GDPR’s Article 9.

- -
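For instance, the “likes regarding movies” specialisation mentioned above could be sketched as follows (the ex: names are hypothetical, and dpv:Interest is indicative of an existing category):

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix dpv:  <https://w3id.org/dpv#> .
@prefix ex:   <https://example.com/ns#> .

# Specialise an existing personal data category via subclassing
ex:MovieLikes rdfs:subClassOf dpv:Interest ;
    rdfs:label "Likes regarding movies"@en .

# Combine with a context class to indicate the data was inferred
ex:InferredMovieLikes rdfs:subClassOf ex:MovieLikes, dpv:DerivedPersonalData .
```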

The following is an overview of the concepts provided for personal data within the DPV:

- - - - - - - - - - - - - - - - - - - - - -
ConceptDescription
[=PersonalDataCategory=]Category of Personal Data
[=SpecialCategoryPersonalData=]Indicates that personal data is sensitive or belongs to a special category
[=DerivedPersonalData=]Indicates that personal data is derived or inferred
Entities

{% if entities_classes %}
{{ table_classes(entities_classes) }}
{% endif %}
{% if entities_properties %}
{{ table_properties(entities_properties) }}
{% endif %}

Purposes


Concepts related to Purposes are available as an individual module here.


DPV at the moment defines a hierarchically organised taxonomy of generic categories of purposes (for processing of personal data). Regulations such as GDPR require the purpose to be declared in a specific and understandable manner. We therefore suggest declaring the purpose in use as an instance of one or more dpv:Purpose categories, and always accompanying it with a human-readable description (e.g. using rdfs:label and rdfs:comment).
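Following this suggestion, a specific purpose could be declared as in the hypothetical Turtle sketch below (the `ex:` names are invented for illustration):

```turtle
@prefix dpv:  <http://www.w3id.org/dpv#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex:   <https://example.com/vocab#> .

# A specific purpose declared as an instance of a DPV purpose category,
# with a human-readable label and description
ex:FilmRecommendations a dpv:ServiceOptimization ;
    rdfs:label "Film recommendations"@en ;
    rdfs:comment "Optimise the streaming service by recommending films based on viewing history."@en .
```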


DPV provides a way to indicate purposes are restricted or fall within a specific business sector using the class [=Sector=] and the property [=hasSector=]. Hierarchies for defining business sectors include NACE maintained by EU [[NACE]], NAICS maintained by USA [[NAICS]], ISIC maintained by UN [[ISIC]], and GICS maintained by commercial organisations MSCI and S&P [[GICS]]. Multiple classifications can be used through mappings between sector codes such as the NACE to NAICS alignment provided by EU [[NACE-NAICS]].


We provide an interpretation of the NACE revision 2 codes which uses rdfs:subClassOf to specify the hierarchy, available here. The NACE codes have the namespace dpv-nace and are represented as dpv-nace:NACE-CODE.
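A sector restriction could then be expressed as follows; this is a sketch where the `dpv-nace` namespace IRI and the `ex:` purpose are assumptions made for illustration:

```turtle
@prefix dpv:      <http://www.w3id.org/dpv#> .
@prefix dpv-nace: <http://www.w3.org/ns/dpv-nace#> .
@prefix ex:       <https://example.com/vocab#> .

# Restrict a purpose to a business sector using a NACE Rev. 2 code
# (K64: financial service activities, except insurance and pension funding)
ex:FraudDetection a dpv:Purpose ;
    dpv:hasSector dpv-nace:NACE-K64 .
```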


Purposes can be further restricted to specific contexts using the class [=Context=] and the property [=hasContext=]. Here, 'context' refers to a generic notion which can be expanded as applicable within a use-case or domain.

Figure: Concepts for Purposes in DPV

When using purposes, we suggest choosing the most specific applicable purpose over more abstract ones, by selecting or extending the relevant classes in the purpose taxonomy. For example, the purpose [=ServiceOptimization=] is further sub-classed to indicate optimisation for the consumer as [=OptimisationForConsumer=] and for the controller as [=OptimisationForController=].


Depending on the jurisdiction, certain purposes within the DPV may not satisfy the requirements of being specific, unambiguous, or clear. In the above example, [=OptimisationForController=] may still be considered too broad, and should then be further clarified either through one of its sub-classes or by creating dedicated instances for a given use-case.
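Such a clarification can be made by narrowing the broad category for the concrete use-case, as in this hypothetical sketch (the `ex:` class is invented):

```turtle
@prefix dpv:  <http://www.w3id.org/dpv#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex:   <https://example.com/vocab#> .

# Narrow the broad OptimisationForController category for a concrete use-case
ex:DeliveryRouteOptimisation rdfs:subClassOf dpv:OptimisationForController ;
    rdfs:label "Optimisation of delivery route planning"@en .
```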


Also depending on the jurisdiction, there may be correlations between purposes and legal bases indicating which specific legal bases may or may not be used for which purpose categories.


We welcome suggestions for indicating guidelines on the use of purposes, their relations with legal bases, and ways of assisting adopters in choosing purposes for their use-cases.


The following is an overview of the concepts within the purpose taxonomy:

<table>
  <tr><th>Class</th><th>Property</th><th>Description</th></tr>
  <tr><td>[=Purpose=]</td><td>[=hasPurpose=]</td><td>Indicate purpose</td></tr>
  <tr><td>[=Sector=]</td><td>[=hasSector=]</td><td>Indicate sector of organisation or restrict purpose to sector</td></tr>
  <tr><td>[=Context=]</td><td>[=hasContext=]</td><td>Indicate context or restrict purpose to context</td></tr>
</table>
Purposes

{% if purpose_classes %}
{{ table_classes(purpose_classes) }}
{% endif %}
{% if purpose_properties %}
{{ table_properties(purpose_properties) }}
{% endif %}

Processing Categories


Concepts related to Processing are available as an individual module here.


DPV provides a hierarchy of classes to specify the operations associated with the processing of personal data. Declaring the processing or processing categories associated with personal data is required by regulations such as the GDPR. Processing operations (e.g. collect, share, and use) can have specific constraints or obligations which make it necessary to represent them accurately. While the term ‘use’ is liberally applied to a broad range of processing categories in privacy notices, we suggest selecting appropriate terms that accurately reflect the nature of the processing where applicable.

Figure: Concepts for Processing in DPV

There are a variety of terms used for describing processing operations depending on specific interpretations within the technological, legal, or sociological domain. We consolidate these terms and define the following 'top-level' concepts to create a hierarchical taxonomy for categories of processing: [=Disclose=], [=Copy=], [=Obtain=], [=Remove=], [=Store=], [=Transfer=], [=Transform=], and [=Use=]. Each of these are then further expanded using subclasses within the taxonomy.
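For instance, rather than stating that data is generically 'used', the taxonomy terms allow the operations to be stated precisely (a hypothetical sketch; the `ex:` resource is invented):

```turtle
@prefix dpv: <http://www.w3id.org/dpv#> .
@prefix ex:  <https://example.com/vocab#> .

# State the precise operations instead of a generic 'use'
ex:NewsletterSignup dpv:hasProcessing dpv:Obtain, dpv:Store, dpv:Use .
```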


Although the DPV taxonomy of processing categories includes terms mentioned in the definition of processing in GDPR (Article 4-2), their interpretation is based on common understanding (i.e. dictionary definition) and legal interpretation. Where the interpretation of a term differs significantly within a jurisdiction, it is advisable to declare it in a separate vocabulary as an extension to the DPV, similar to DPV-GDPR. An example of a term differing between common understanding and jurisdiction-dependent definitions is 'sell' as used within the California Consumer Privacy Act (2018) 1798.140(t), which includes "selling, renting, releasing, disclosing, disseminating, making available, transferring, or otherwise communicating".


Along with information about the processing 'operation', regulations (e.g. GDPR) also require additional information such as scale of processing, extent of automation and human involvement, source of data, consequences, and algorithmic logic. DPV declares such concepts as top-level classes which can be used in combination with the processing (or other concepts such as purposes) to indicate their application.
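These additional concepts can be combined with the processing information as in the following sketch; the `ex:` resource and the literal values are invented for illustration, and IRIs may be usable in place of literals:

```turtle
@prefix dpv: <http://www.w3id.org/dpv#> .
@prefix ex:  <https://example.com/vocab#> .

# Combine processing context classes with the additional information
# required by regulations such as GDPR
ex:CreditAssessment a dpv:EvaluationScoring, dpv:AutomatedDecisionMaking ;
    dpv:hasAlgorithmicLogic "Score computed from repayment history and income."@en ;
    dpv:hasConsequences "Applications may be refused based on the score."@en ;
    dpv:hasHumanInvolvement "A loan officer reviews all automated refusals."@en .
```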


Terms such as evaluation or scoring are defined within the processing categories because they relate to specific operations or activities taking place over personal data. They should not be confused with purposes, since they still need to be applied or defined towards a specific 'purpose' for the processing. For example, consider a use-case of scoring an individual for rankings in an online competition - here 'scoring' indicates the processing operation while 'rankings' is the purpose.


The following is an overview of the concepts provided within the DPV processing taxonomy:

<table>
  <tr><th>Class</th><th>Property</th><th>Description</th></tr>
  <tr><td>[=Processing=]</td><td>[=hasProcessing=]</td><td>Specifies the processing operations over personal data</td></tr>
  <tr><td>[=DataSource=]</td><td>[=hasDataSource=]</td><td>Indicates source of personal data used in processing</td></tr>
  <tr><td>[=SystematicMonitoring=]</td><td></td><td>Specifies processing involves systematic monitoring (of data subjects)</td></tr>
  <tr><td>[=EvaluationScoring=]</td><td></td><td>Specifies processing involves evaluating or scoring (of data subjects)</td></tr>
  <tr><td>[=MatchingCombining=]</td><td></td><td>Specifies processing involves matching or combining of data</td></tr>
  <tr><td>[=AutomatedDecisionMaking=]</td><td></td><td>Specifies processing produces automated decisions (regarding data subjects)</td></tr>
  <tr><td>[=LargeScaleProcessing=]</td><td></td><td>Specifies processing takes place at 'large scales'</td></tr>
  <tr><td>[=InnovativeUseOfNewTechnologies=]</td><td></td><td>Specifies processing involves use of innovative and new technologies</td></tr>
  <tr><td></td><td>[=hasAlgorithmicLogic=]</td><td>Specifies the algorithmic logic for processing</td></tr>
  <tr><td></td><td>[=hasConsequences=]</td><td>Specifies consequences arising from processing</td></tr>
  <tr><td></td><td>[=hasHumanInvolvement=]</td><td>Specifies the extent of human involvement regarding processing</td></tr>
</table>
Processing

{% if processing_classes %}
{{ table_classes(processing_classes) }}
{% endif %}

Technical and Organisational Measures


Concepts related to Technical and Organisational Measures are available as an individual module here.


Technical and Organisational measures consist of activities, processes, or procedures used to ensure data protection, carry out processing in a secure manner, and comply with legal obligations. Such measures are required by regulations depending on the context of processing involving personal data. For example, GDPR (Article 32) requires implementing appropriate measures taking into account the state of the art, the costs of implementation, and the nature, scope, context and purposes of processing, as well as risks to rights and freedoms. Specific examples of measures in the article include:

  1. the pseudonymisation and encryption of personal data
  2. the ability to ensure the ongoing confidentiality, integrity, availability and resilience of processing systems and services
  3. the ability to restore the availability and access to personal data in a timely manner in the event of a physical or technical incident
  4. a process for regularly testing, assessing and evaluating the effectiveness of technical and organisational measures for ensuring the security of the processing
Figure: Concepts for Technical and Organisational Measures in DPV

To represent these requirements, the DPV defines a hierarchical taxonomy of technical and organisational measures through the top-level concept of [=TechnicalOrganisationalMeasure=], which is further distinguished as [=TechnicalMeasure=] and [=OrganisationalMeasure=]. A technical measure is an implementation detail or technology used to achieve a specific goal or objective, such as an authentication protocol used to validate identity. In contrast, an organisational measure is a process or procedure used by the organisation, for example an authorisation procedure to decide who should be granted access within an organisation.


Measures can be associated using the generic property [=measureImplementedBy=]. The value or object of this property can be an IRI (or URL) representing a specific measure or standard used to implement it, or a String representing relevant information.
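Both forms can be sketched as follows; the `ex:` measures, the policy document IRI, and the string value are invented for this illustration:

```turtle
@prefix dpv: <http://www.w3id.org/dpv#> .
@prefix ex:  <https://example.com/vocab#> .

# A technical measure whose implementation is given as a string
ex:StorageEncryption a dpv:TechnicalMeasure ;
    dpv:measureImplementedBy "AES-256 encryption of personal data at rest" .

# An organisational measure implemented by a (hypothetical) policy document IRI
ex:AccessAuthorisation a dpv:OrganisationalMeasure ;
    dpv:measureImplementedBy ex:InternalAccessPolicyDocument .
```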


In the future, we plan to provide a collection of terms and URIs for specifying standards (e.g. ISO) and best practices (e.g. certifications, seals). Whether this should be provided within the DPV itself or as a separate extension similar to DPV-GDPR is to be decided. We welcome participation and contributions for this work.


DPV provides specific measures for storage of personal data in the form of [=StorageRestriction=], with specialised variants for duration as [=StorageDuration=], location as [=StorageLocation=], deletion as [=StorageDeletion=], and restoration as [=StorageRestoration=].


The generic properties [=hasStorage=], [=hasLocation=], and [=hasDuration=] enable representing information about storage, location, and duration respectively. These can be used to specify restrictions or conditions, such as for storage of personal data, its processing, or information about recipients.
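For example, a storage restriction on duration and location might be sketched as below; the `ex:` resource and the literal forms of the values are assumptions made for illustration:

```turtle
@prefix dpv: <http://www.w3id.org/dpv#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:  <https://example.com/vocab#> .

# Storage restrictions on duration and location of personal data
ex:ProfileDataStorage a dpv:StorageDuration, dpv:StorageLocation ;
    dpv:hasDuration "P6M"^^xsd:duration ;   # retained for six months
    dpv:hasLocation "EU"@en .
```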


For indicating the mitigation of [=Risk=], DPV provides [=RiskMitigationMeasure=] as a top-level concept within the Technical and Organisational measures taxonomy. The property [=mitigatesRisk=] is used to indicate the relationship between risk and its mitigation.
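A risk and its mitigating measure can be linked as in this hypothetical sketch (the `ex:` resources are invented):

```turtle
@prefix dpv:  <http://www.w3id.org/dpv#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex:   <https://example.com/vocab#> .

# Link a mitigation measure to the risk it addresses
ex:UnauthorisedAccessRisk a dpv:Risk ;
    rdfs:label "Risk of unauthorised access to personal data"@en .

ex:TwoFactorAuthentication a dpv:RiskMitigationMeasure ;
    dpv:mitigatesRisk ex:UnauthorisedAccessRisk .
```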


The following provides an overview of the important top-level concepts within the Technical and Organisational measures taxonomy:

<table>
  <tr><th>Class</th><th>Property</th><th>Description</th></tr>
  <tr><td>[=TechnicalOrganisationalMeasure=]</td><td>[=hasTechnicalOrganisationalMeasure=]</td><td>Specifies the technical and organisational measures utilised or applicable</td></tr>
  <tr><td></td><td>[=measureImplementedBy=]</td><td>Specifies the implementation details of measure</td></tr>
  <tr><td>[=RiskMitigationMeasure=]</td><td>[=mitigatesRisk=]</td><td>Specifies use of measure to mitigate risks</td></tr>
  <tr><td>[=StorageRestriction=]</td><td></td><td>Specifies restriction on storage of personal data</td></tr>
  <tr><td>[=StorageDuration=]</td><td></td><td>Specifies restriction on duration of storage</td></tr>
  <tr><td>[=StorageLocation=]</td><td></td><td>Specifies restriction on location of storage</td></tr>
  <tr><td>[=StorageDeletion=]</td><td></td><td>Specifies restriction on deletion of storage</td></tr>
  <tr><td></td><td>[=hasStorage=]</td><td>Specifies information about storage</td></tr>
  <tr><td></td><td>[=hasLocation=]</td><td>Specifies information about location</td></tr>
  <tr><td></td><td>[=hasDuration=]</td><td>Specifies information about duration</td></tr>
</table>
Personal Data

{% if personaldata_classes %}
{{ table_classes(personaldata_classes) }}
{% endif %}
{% if personaldata_properties %}
{{ table_properties(personaldata_properties) }}
{% endif %}

Tech/Org Measures

{% if technical_organisational_measures_classes %}
{{ table_classes(technical_organisational_measures_classes) }}
{% endif %}
{% if technical_organisational_measures_properties %}
{{ table_properties(technical_organisational_measures_properties) }}
{% endif %}

Entities


Concepts related to Entities are available as an individual module here.


Entities refer to individuals, organisations, institutions, authorities, agencies, or any similar 'actor'. Defining and representing them is important given their rights and responsibilities under legal obligations. To represent such entities, DPV defines the [=LegalEntity=] class as a generic concept which is further extended to represent the different categories of entities.

Figure: Concepts for Entities in DPV

The DPV core vocabulary includes the concepts of [=DataSubject=], [=DataController=], and [=Recipient=] which are subclasses of [=LegalEntity=]. Consequently, they are not described in this section to avoid duplication.


To describe the entities that act as recipients regarding personal data and its processing, the concepts of [=DataProcessor=], [=DataSubProcessor=], and [=ThirdParty=] are defined. Defining recipients is important in the context of data protection and privacy as it allows tracking the entities with whom personal data is shared or transferred. The concepts of [=Child=] and [=VulnerableDataSubject=] represent specific categories of data subjects based on relevance in legal requirements. The concept [=Authority=] represents an entity with legal authority, and is extended to represent [=DataProtectionAuthority=] for a specific authority concerned with data protection and privacy. To represent an 'agent' of an organisation, the concept [=Representative=] is provided. Similarly, [=DataProtectionOfficer=] refers to a specific entity associated with monitoring data protection and privacy within (or on behalf of) an organisation.


The concept 'child' can be legally distinct from 'minor', although the two are also used as synonyms in several cases. DPV uses 'child' as the commonly used term to signify an individual below a certain legally defined age. This is influenced by the use of the term 'child' within the GDPR and by the CJEU in its judgements. It is important to note that the relevant age for determining a child (or a minor child) varies by jurisdiction.


To represent information about entities, DPV provides the following properties: [=hasName=] to indicate name, [=hasAddress=] to indicate address, [=hasContact=] to indicate contact or communication channels, [=hasIdentifier=] to indicate an identifier associated with the entity, and [=hasRepresentative=] to indicate an 'agent' or representative of the entity.
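These properties can be combined to describe an entity, as in the following hypothetical sketch (the `ex:` entities and all values are invented):

```turtle
@prefix dpv: <http://www.w3id.org/dpv#> .
@prefix ex:  <https://example.com/vocab#> .

# Describing a controller and its associated entities
ex:ACME a dpv:DataController ;
    dpv:hasName "ACME Ltd." ;
    dpv:hasAddress "1 Example Street, Dublin, Ireland" ;
    dpv:hasIdentifier "IE-1234567" ;
    dpv:hasContact <mailto:privacy@acme.example> ;
    dpv:hasRepresentative ex:ACME-EU-Representative .

ex:ACME-DPO a dpv:DataProtectionOfficer ;
    dpv:hasContact <mailto:dpo@acme.example> .
```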

Contextual Info

{% if context_classes %}
{{ table_classes(context_classes) }}
{% endif %}
{% if context_properties %}
{{ table_properties(context_properties) }}
{% endif %}