A tutorial with practical examples of how to create and publish 5-star open data on the Web; originally written for the 2014 Web Intelligence Summer School “Web of Data”
and updated for the 2015 Web Intelligence Summer School “Answering Questions with the Web”.
Published online at http://clange.github.io/5stardata-tutorial/
Original data (HTML): schedule, presenters
Cost and benefits of ★ Web data
To be honest, this PDF was exported from Excel, which would itself earn more than one star. But organisations very often “publish” data as PDF.
Cost and benefits of ★★ Web data
Even though the old binary *.xls format is proprietary, it is not impossible to read this file outside of Excel:
perl -MSpreadsheet::ParseExcel -le '
print Spreadsheet::ParseExcel->new()
->parse("2star_Excel/schedule.xls")
->worksheet("Schedule")
->get_cell(1,0)
->value();'
Output:
25 Aug 2014 09:00
It is harder, but still possible, to have questions answered such as “when is the first coffee break”.
Think of an algorithm that does the following:
- In the column titled “Event”, identify all cells whose value is “Coffee break”.
- On each row of such a cell, get the entry of the cell in the “Time” column.
- Sort these cells and return the smallest value.
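The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rows in `SHEET` are hypothetical stand-ins for the "Schedule" sheet (not the real schedule), the sheet is assumed to be available as text with "Time" and "Event" columns, and the date pattern assumes English month abbreviations.

```python
import csv
import io
from datetime import datetime

# Hypothetical excerpt of the "Schedule" sheet; the rows are illustrative only.
SHEET = """Time,Event
25 Aug 2014 09:00,Introduction
25 Aug 2014 10:30,Coffee break
25 Aug 2014 16:00,Coffee break
"""

def first_coffee_break(text):
    rows = csv.DictReader(io.StringIO(text))
    # First two bullets: pick the "Coffee break" rows and read their "Time" cell.
    times = [
        datetime.strptime(row["Time"], "%d %b %Y %H:%M")
        for row in rows
        if row["Event"] == "Coffee break"
    ]
    # Third bullet: the smallest timestamp is the first coffee break.
    return min(times) if times else None

print(first_coffee_break(SHEET))
```

Note that the sketch has to guess the date format before it can sort anything; with a display string such as "25 Aug 2014 09:00" that guess is part of the cost of 2-star data.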
However, free software libraries do not support all features of this file format. Here is what happens when we ask a popular free tool to determine the type of this file:
file 2star_Excel/schedule.xls
Output:
2star_Excel/schedule.xls: Composite Document File V2 Document, corrupt: Can't read SSAT
Cost and benefits of ★★★ Web data
We can process this file with standard tools:
unzip -l 3star_OpenDocument/schedule.ods
Archive:  3star_OpenDocument/schedule.ods
  Length      Date    Time    Name
---------  ---------- -----   ----
       46  08-21-2014 08:13   mimetype
    52832  08-21-2014 08:13   Thumbnails/thumbnail.png
    27279  08-21-2014 08:13   styles.xml
    15227  08-21-2014 08:13   content.xml
      852  08-21-2014 08:13   meta.xml
     8774  08-21-2014 08:13   settings.xml
      899  08-21-2014 08:13   manifest.rdf
        0  08-21-2014 08:13   Configurations2/accelerator/current.xml
        0  08-21-2014 08:13   Configurations2/progressbar/
        0  08-21-2014 08:13   Configurations2/statusbar/
        0  08-21-2014 08:13   Configurations2/images/Bitmaps/
        0  08-21-2014 08:13   Configurations2/floater/
        0  08-21-2014 08:13   Configurations2/toolbar/
        0  08-21-2014 08:13   Configurations2/popupmenu/
        0  08-21-2014 08:13   Configurations2/toolpanel/
        0  08-21-2014 08:13   Configurations2/menubar/
     1093  08-21-2014 08:13   META-INF/manifest.xml
---------                     -------
   107002                     17 files
content.xml contains the actual tabular data, so we can process it using XPath/XQuery/XSLT tools such as Zorba:
zorba --serialize-text -q '
declare namespace office="urn:oasis:names:tc:opendocument:xmlns:office:1.0";
declare namespace table="urn:oasis:names:tc:opendocument:xmlns:table:1.0";
doc("3star_OpenDocument/content.xml")//office:spreadsheet
/table:table[@table:name="Schedule"]
/table:table-row[2]/table:table-cell[1]'
25 Aug 2014 09:00
LibreOffice even stored this timestamp in a machine-friendly way. We’ll realise the advantages of this later.
zorba --serialize-text -q '
declare namespace office="urn:oasis:names:tc:opendocument:xmlns:office:1.0";
declare namespace table="urn:oasis:names:tc:opendocument:xmlns:table:1.0";
string(doc("3star_OpenDocument/content.xml")//office:spreadsheet
/table:table[@table:name="Schedule"]
/table:table-row[2]/table:table-cell[1]/@office:date-value)'
2014-08-25T09:00:00
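The same lookup does not require a dedicated XQuery processor. As a rough sketch, Python's standard library can open the ZIP container and evaluate a similar path directly. The function name `cell_date_value` and the helper `tiny_ods` are made up for this illustration, and `tiny_ods` builds a minimal stand-in archive in memory rather than reading the real schedule.ods.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "table": "urn:oasis:names:tc:opendocument:xmlns:table:1.0",
}

def cell_date_value(ods, sheet, row, col):
    """Return the office:date-value attribute of a cell (row/col are 1-based, as in XPath)."""
    with zipfile.ZipFile(ods) as archive:
        root = ET.fromstring(archive.read("content.xml"))
    path = (
        f".//office:spreadsheet/table:table[@table:name='{sheet}']"
        f"/table:table-row[{row}]/table:table-cell[{col}]"
    )
    cell = root.find(path, NS)
    if cell is None:
        return None
    return cell.get(f"{{{NS['office']}}}date-value")

def tiny_ods():
    """Build a minimal stand-in archive in memory (hypothetical content)."""
    content = (
        f'<office:document-content xmlns:office="{NS["office"]}" xmlns:table="{NS["table"]}">'
        '<office:body><office:spreadsheet><table:table table:name="Schedule">'
        '<table:table-row><table:table-cell/></table:table-row>'
        '<table:table-row><table:table-cell office:date-value="2014-08-25T09:00:00"/></table:table-row>'
        '</table:table></office:spreadsheet></office:body></office:document-content>'
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("content.xml", content)
    buffer.seek(0)
    return buffer

print(cell_date_value(tiny_ods(), "Schedule", 2, 1))
```

With the real file, the call would be `cell_date_value("3star_OpenDocument/schedule.ods", "Schedule", 2, 1)`. Like the XPath expression, the sketch counts XML elements, so it does not account for rows or cells that the format compresses with repetition attributes.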
We need one CSV file per sheet:
Cost and benefits of ★★★★ Web data
From here onwards, the original 5-star open data examples use RDF. We will continue with CSV for a while, taking it to its limits, to point out that open data on the Web is not only RDF. We will introduce RDF in a later section.
The following examples roughly conform to Linked CSV, which was one of the original proposals for an RDF-conforming specification of CSV. The CSV on the Web Working Group is now taking a different approach. Their Working Draft on Generating RDF from Tabular Data on the Web suggests leaving the CSV untouched but providing complementary, external metadata annotations, e.g., in the form of JSON. This tutorial sticks with the simpler Linked CSV approach, which is self-contained in CSV.
An example from the 3.5-star CSV:
Time,Event,Type,Presenter,Location
...
27 Aug 2014 09:00,Wikidata,Keynote,Markus Krötzsch,
27 Aug 2014 10:15,Working with Wikidata: A Hands-on Guide for Researchers and Developers,Tutorial,Markus Krötzsch,

Name,Affiliation,Town,Country
...
Markus Krötzsch,TU Dresden,Dresden,Germany
- How do we know it’s twice the same instructor?
- How can we make this connection Web-safe? (There might be others by the same name; how about this person on Facebook?)
Give the presenter a unique identifier! On the Web, this means using a URI (Uniform Resource Identifier).
Time,Event,Type,Presenter,Location
...
2014-08-27T09:00:00+02:00,Wikidata,Keynote,http://purl.org/net/wiss2014/presenters/#markus,
2014-08-27T10:15:00+02:00,Working with Wikidata: A Hands-on Guide for Researchers and Developers,Tutorial,http://purl.org/net/wiss2014/presenters/#markus,

$id,Name,Affiliation,Town,Country
...
http://purl.org/net/wiss2014/presenters/#markus,Markus Krötzsch,TU Dresden,Dresden,Germany
(The timestamp format has also changed; we’ll discuss this next.)
It is good practice to …
- use HTTP URLs for such URIs,
- choose them from a namespace that you own,
- publish a machine-comprehensible, self-describing representation of the things identified by these URIs at that same URL,
- so that any client who wants to know something about these things can easily look it up by downloading.
This approach is called linked data.
Linked data is essential for the Semantic Web – “a framework that allows data to be shared and reused across application, enterprise, and community boundaries”.
The presenters in the summer school are now identified by URIs such as http://purl.org/net/wiss2014/presenters/#markus. As these are HTTP URLs, they can be dereferenced in order to download a description of a person. This is easiest to do by entering the URL into the address bar of a web browser, but a command-line HTTP client such as wget or cURL gives you more control.
wget -O - --header 'Accept: text/csv' 'http://purl.org/net/wiss2014/presenters/#markus'
--2015-09-02 11:21:11-- http://purl.org/net/wiss2014/presenters/
Resolving purl.org (purl.org)... 132.174.1.35
Connecting to purl.org (purl.org)|132.174.1.35|:80... connected.
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: http://www.iai.uni-bonn.de/~langec/wiss2014/presenters/ [following]
--2015-09-02 11:21:11-- http://www.iai.uni-bonn.de/~langec/wiss2014/presenters/
Resolving www.iai.uni-bonn.de (www.iai.uni-bonn.de)... 131.220.8.244
Connecting to www.iai.uni-bonn.de (www.iai.uni-bonn.de)|131.220.8.244|:80... connected.
HTTP request sent, awaiting response... 302 Found
Location: http://www.iai.uni-bonn.de/~langec/wiss2014/presenters/index.csv [following]
--2015-09-02 11:21:11-- http://www.iai.uni-bonn.de/~langec/wiss2014/presenters/index.csv
Reusing existing connection to www.iai.uni-bonn.de:80.
HTTP request sent, awaiting response... 200 OK
Length: 1499 (1.5K) [text/csv]
Saving to: 'STDOUT'

#,$id,Name,Affiliation,Town,Country
type,url,foaf:name,schema:affiliation,http://purl.org/net/wiss2014/vocab/#town,http://purl.org/net/wiss2014/vocab/#country
,http://purl.org/net/wiss2014/presenters/#soeren,Sören Auer,Universität Bonn;Fraunhofer IAIS,Bonn,Germany
,http://purl.org/net/wiss2014/presenters/#mathieu,Mathieu d'Aquin,,Milton Keynes,UK
,http://purl.org/net/wiss2014/presenters/#aba-sah,Aba-Sah Dadzie,University of Birmingham,Birmingham,UK
,http://purl.org/net/wiss2014/presenters/#jerome,Jérôme David,Université Pierre-Mendès-France;INRIA-LIG,Grenoble,France
,http://purl.org/net/wiss2014/presenters/#stefan,Stefan Decker,INSIGHT;National University of Ireland,Galway,Ireland
,http://purl.org/net/wiss2014/presenters/#paul,Paul Groth,VU Amsterdam,Amsterdam,Netherlands
,http://purl.org/net/wiss2014/presenters/#markus,Markus Krötzsch,TU Dresden,Dresden,Germany
,http://purl.org/net/wiss2014/presenters/#christoph,Christoph Lange,Universität Bonn;Fraunhofer IAIS,Bonn,Germany
,http://purl.org/net/wiss2014/presenters/#axel,Axel Polleres,WU Wien,Vienna,Austria
,http://purl.org/net/wiss2014/presenters/#eric,Eric Prud'hommeaux,W3C,,
,http://purl.org/net/wiss2014/presenters/#harald,Harald Sack,"HPI, Universität Potsdam",Potsdam,Germany
,http://purl.org/net/wiss2014/presenters/#thomas,Thomas Steiner,Université Lyon;Google,Lyon,France
,http://purl.org/net/wiss2014/presenters/#antoine,Antoine Zimmermann,École des mines de Saint-Étienne,Saint-Étienne,France

0K . 100% 17.7M=0s
2015-09-02 11:21:11 (17.7 MB/s) - written to stdout [1499/1499]
I will not go into full detail, but here are some observations, in the order of appearance:
- I actually published the data in a place easily accessible for me: my personal webspace at the University of Bonn.
- To publish the data in a sustainable way, independent from me leaving the University of Bonn, or the University of Bonn reorganising their IT infrastructure, I used the PURL (Persistent URL) redirection service.
- The first redirect is due to the use of PURL.
- The second redirect happens because we are using content negotiation to give data consumers a choice from multiple data formats. We will see another format, RDF/XML, below.
- Instead of just the description of Markus Krötzsch, we get the descriptions of all presenters. This is because we lazily published all descriptions in the same file on the server and used hash (#) URIs for them. This approach is OK for small amounts of data. The part after the hash has to be interpreted by the client. Here, the client actually downloads http://purl.org/net/wiss2014/presenters/ from the server and then has to locate, inside the downloaded document, the fragment #markus by its own means.
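What "by its own means" amounts to for a CSV client can be sketched as follows. This is an illustration, not a prescribed algorithm: the function name `describe` is invented, the sample text is an abbreviated copy of the CSV shown above, and the download itself is left out so that the sketch runs offline.

```python
import csv
import io
from urllib.parse import urldefrag

# Abbreviated copy of the CSV shown above (two presenter rows only).
PRESENTERS_CSV = """#,$id,Name,Affiliation,Town,Country
type,url,foaf:name,schema:affiliation,http://purl.org/net/wiss2014/vocab/#town,http://purl.org/net/wiss2014/vocab/#country
,http://purl.org/net/wiss2014/presenters/#stefan,Stefan Decker,INSIGHT;National University of Ireland,Galway,Ireland
,http://purl.org/net/wiss2014/presenters/#markus,Markus Krötzsch,TU Dresden,Dresden,Germany
"""

def describe(uri, document_text):
    # Only the part before "#" is sent to the server; the fragment stays with the client.
    document_url, _fragment = urldefrag(uri)
    # The client then searches the downloaded document for the full identifier.
    for row in csv.DictReader(io.StringIO(document_text)):
        if row["$id"] == uri:
            return document_url, row
    return document_url, None

url, row = describe("http://purl.org/net/wiss2014/presenters/#markus", PRESENTERS_CSV)
print(url)
print(row["Name"], "/", row["Affiliation"])
```

In a real client, `document_text` would be the body returned for `document_url`, fetched with the desired `Accept` header.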
Further background on publishing data on the Web can be found in the following specifications:
- Cool URIs for the Semantic Web: how to choose the right URIs (hash vs. slash), how to design content negotiation
- Best Practice Recipes for Publishing RDF Vocabularies (actually also addresses datasets, as vocabularies are just a special case of that): how to configure the Apache HTTP server for these settings
Here is the same example as above, redone using cURL:
curl -i -H 'Accept: text/csv' -L 'http://purl.org/net/wiss2014/presenters/#markus'
HTTP/1.1 302 Moved Temporarily
Date: Wed, 02 Sep 2015 09:24:08 GMT
Server: 1060 NetKernel v3.3 - Powered by Jetty
Location: http://www.iai.uni-bonn.de/~langec/wiss2014/presenters/
Content-Type: text/html; charset=iso-8859-1
X-Purl: 2.0; http://localhost:8080
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Content-Length: 288

HTTP/1.1 302 Found
Date: Wed, 02 Sep 2015 09:24:08 GMT
Server: Apache
Location: http://www.iai.uni-bonn.de/~langec/wiss2014/presenters/index.csv
Content-Length: 248
Content-Type: text/html; charset=iso-8859-1

HTTP/1.1 200 OK
Date: Wed, 02 Sep 2015 09:24:08 GMT
Server: Apache
Last-Modified: Tue, 26 Aug 2014 04:44:11 GMT
ETag: "5db-50180f4611cc0"
Accept-Ranges: bytes
Content-Length: 1499
Content-Type: text/csv

#,$id,Name,Affiliation,Town,Country
type,url,foaf:name,schema:affiliation,http://purl.org/net/wiss2014/vocab/#town,http://purl.org/net/wiss2014/vocab/#country
,http://purl.org/net/wiss2014/presenters/#soeren,Sören Auer,Universität Bonn;Fraunhofer IAIS,Bonn,Germany
,http://purl.org/net/wiss2014/presenters/#mathieu,Mathieu d'Aquin,,Milton Keynes,UK
,http://purl.org/net/wiss2014/presenters/#aba-sah,Aba-Sah Dadzie,University of Birmingham,Birmingham,UK
,http://purl.org/net/wiss2014/presenters/#jerome,Jérôme David,Université Pierre-Mendès-France;INRIA-LIG,Grenoble,France
,http://purl.org/net/wiss2014/presenters/#stefan,Stefan Decker,INSIGHT;National University of Ireland,Galway,Ireland
,http://purl.org/net/wiss2014/presenters/#paul,Paul Groth,VU Amsterdam,Amsterdam,Netherlands
,http://purl.org/net/wiss2014/presenters/#markus,Markus Krötzsch,TU Dresden,Dresden,Germany
,http://purl.org/net/wiss2014/presenters/#christoph,Christoph Lange,Universität Bonn;Fraunhofer IAIS,Bonn,Germany
,http://purl.org/net/wiss2014/presenters/#axel,Axel Polleres,WU Wien,Vienna,Austria
,http://purl.org/net/wiss2014/presenters/#eric,Eric Prud'hommeaux,W3C,,
,http://purl.org/net/wiss2014/presenters/#harald,Harald Sack,"HPI, Universität Potsdam",Potsdam,Germany
,http://purl.org/net/wiss2014/presenters/#thomas,Thomas Steiner,Université Lyon;Google,Lyon,France
,http://purl.org/net/wiss2014/presenters/#antoine,Antoine Zimmermann,École des mines de Saint-Étienne,Saint-Étienne,France

With an alternative export configuration, the 3.5-star CSV may have ended up like this:
Time,Event,Type,Presenter,Location
08/25/2014 09:00:00,Introduction,,,
08/25/2014 09:15:00,Keynote,Keynote,Stefan Decker,
08/25/2014 is sufficiently unambiguous, but what does 01/02/03 mean?
- 1 February 2003?
- 2 January 2003?
- 3 February 2001?
- …?
If we don’t know how to interpret date entries, we can’t answer queries such as “when is the first coffee break”.
Also, if your family in a different time zone wanted to phone you during the lunch break, how would they know that 09:00:00 is in CEST?
So let’s use an ISO 8601 conforming date and time format, with time zone information:
Time,Event,Type,Presenter,Location
2014-08-25T09:00:00+02:00,Introduction,,,
2014-08-25T09:15:00+02:00,Keynote,Keynote,http://purl.org/net/wiss2014/presenters/#stefan,
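To make the difference concrete, the following sketch first parses 01/02/03 under three plausible conventions and then parses an ISO 8601 timestamp with its offset. The helper names `readings` and `to_utc` are invented for this illustration.

```python
from datetime import date, datetime, timezone

def readings(text):
    """Parse one date string under three plausible conventions."""
    formats = ("%d/%m/%y", "%m/%d/%y", "%y/%m/%d")  # D/M/Y, M/D/Y, Y/M/D
    return [datetime.strptime(text, fmt).date() for fmt in formats]

def to_utc(timestamp):
    """An ISO 8601 timestamp with a UTC offset has exactly one reading."""
    return datetime.fromisoformat(timestamp).astimezone(timezone.utc).isoformat()

for candidate in readings("01/02/03"):
    print(candidate)                          # three different days

print(to_utc("2014-08-25T09:00:00+02:00"))    # 2014-08-25T07:00:00+00:00
```

Because the offset travels with the value, a caller in any time zone can convert the start of the lunch break to local time without knowing where the summer school takes place.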
Let’s continue to make our CSV even more self-describing, by introducing a schema (also called vocabulary on the Web of Data, or ontology, especially when it involves more complex formal logic).
We introduced linked data style URIs for the presenters (so that they describe themselves); let’s also do it for other concepts, e.g. the types of presentations.
Let’s introduce a domain-specific vocabulary.
Instead of the string “Keynote”, let’s use a self-describing URI:
,2014-08-25T09:15:00+02:00,Keynote,http://purl.org/net/wiss2014/vocab/#Keynote,http://purl.org/net/wiss2014/presenters/#stefan,
And let’s create another CSV file for the vocabulary, where we define our terms:
$id,label,description,see also
#Keynote,keynote,a talk that establishes a theme,http://en.wikipedia.org/wiki/Keynote
The relative URI #Keynote resolves to the intended identifier if this file is published at http://purl.org/net/wiss2014/vocab/.
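Relative references are resolved against the URL from which the file was retrieved. A small sketch with Python's standard library shows why the publication location matters; the helper name `absolute_id` and the second, mirrored location are hypothetical.

```python
from urllib.parse import urljoin

def absolute_id(published_at, relative_id):
    # Standard relative-reference resolution against the document's own URL.
    return urljoin(published_at, relative_id)

# Published at the intended location: the identifier used in the schedule.
print(absolute_id("http://purl.org/net/wiss2014/vocab/", "#Keynote"))

# Hypothetical copy elsewhere: the same row now names a different resource.
print(absolute_id("http://example.org/mirror/vocab.csv", "#Keynote"))
```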
We introduced ISO 8601 timestamps, but how does a client know, without having to resort to heuristics, that the first column of schedule.csv is intended to be an ISO 8601 timestamp?
Time,Event,Type,Presenter,Location
2014-08-25T09:00:00+02:00,Introduction,,,
We also introduced a vocabulary, but how do we make explicit what we mean by “label”, “description” and “see also”?
Let’s explicitly indicate the types!
For the timestamps and other entries in the schedule:
#,Time,Event,Type,Presenter,Location
type,time,string,url,url,string
,2014-08-25T09:00:00+02:00,Introduction,,,
(We’ll get to the structure of the new, first column later.)
For the properties of vocabulary terms:
$id,label,description,see also
url,rdfs:label,rdfs:comment,rdfs:seeAlso
#Keynote,keynote,a talk that establishes a theme,http://en.wikipedia.org/wiki/Keynote
rdfs: is a well-known prefix that abbreviates a URI. rdfs:label (actually: http://www.w3.org/2000/01/rdf-schema#label) is once more a vocabulary term, from the widely used standard vocabulary RDF Schema. Its rdfs:comment is “A human-readable name for the subject.” So RDF Schema is a vocabulary for describing vocabularies. Such vocabularies are also known as ontology languages.
Given a row such as url,rdfs:label,rdfs:comment,rdfs:seeAlso, how do we know that this is metadata rather than data?
Let’s make it explicit!
#,Time,Event,Type,Presenter,Location
type,time,string,url,url,string
,2014-08-25T09:00:00+02:00,Introduction,,,
- When the first column has a type entry, we are in the type declaration row.
- An empty first column means “data”.
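A reader that honours this convention could look roughly like the sketch below. It only covers the two cases just described (a `type` row and data rows); the function name `read_annotated_csv` is invented here, and this is not a full implementation of the Linked CSV proposal.

```python
import csv
import io

SCHEDULE_CSV = """#,Time,Event,Type,Presenter,Location
type,time,string,url,url,string
,2014-08-25T09:00:00+02:00,Introduction,,,
"""

def read_annotated_csv(text):
    rows = list(csv.reader(io.StringIO(text)))
    header = rows[0][1:]          # column titles, without the "#" marker column
    types = {}
    records = []
    for row in rows[1:]:
        if not row:               # skip blank lines
            continue
        marker, *cells = row
        if marker == "type":      # metadata: the type declaration row
            types = dict(zip(header, cells))
        elif marker == "":        # an empty first column means "data"
            records.append(dict(zip(header, cells)))
    return types, records

types, records = read_annotated_csv(SCHEDULE_CSV)
print(types["Time"])        # time
print(records[0]["Event"])  # Introduction
```

With the declared types at hand, a client no longer needs heuristics to decide that the "Time" column should be parsed as a timestamp.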
- Is the title of an event really just a string?
- Is the presenter really just a URI (that happens to point to a presenter)?
No! – Let’s also reuse some standard vocabularies here!
Schedule:
#,Time,Event,Type,Presenter,Location
type,dct:date,dct:title,rdf:type,http://id.loc.gov/vocabulary/relators/pre,http://linkedevents.org/ontology/atPlace
,2014-08-25T09:15:00+02:00,Keynote,http://purl.org/net/wiss2014/vocab/#Keynote,http://purl.org/net/wiss2014/presenters/#stefan,
Presenters:
#,$id,Name,Affiliation,Town,Country
type,url,foaf:name,schema:affiliation,http://purl.org/net/wiss2014/vocab/#town,http://purl.org/net/wiss2014/vocab/#country
,http://purl.org/net/wiss2014/presenters/#soeren,Sören Auer,Universität Bonn;Fraunhofer IAIS,Bonn,Germany
- We found a lot of reusable terms in standard vocabularies.
- Linked Open Vocabularies (LOV) is a search engine that helps with this task.
- Where we didn’t find perfectly reusable terms, we defined our own, in our vocabulary.
More widely than CSV, the RDF data model is used for linked data.
Whenever a URI conforms to linked data, you can expect RDF there (usually in the ugly but widely supported RDF/XML encoding).
Let’s therefore redo our example in RDF, and discuss some differences from CSV.
- data.ttl (Turtle, human-friendly)
- presenters.rdf, schedule.rdf, vocab.rdf (RDF/XML, widely understood by machines)
(For purely pragmatic reasons, the Turtle, which is what I edit, is all-in-one, whereas the RDF/XML is in split files for easier deployment.)
<#day1intro>
    dct:date "2014-08-25T09:00:00+02:00"^^xsd:dateTime ;
    dct:title "Introduction" .
CSV is based on records (one per row, with a fixed number of columns).
RDF is based on triples (subject–predicate–object statements).
Usually more than one triple belongs to a subject (resource), which is why it’s convenient to group them.
Usually every resource has an identifier. (In the CSV, our events didn’t have any.)
You can precisely indicate the datatype of an object, but you must always do so, except when the datatype is string.
<#day1keynote>
    a wv:Keynote ;
    dct:date "2014-08-25T09:15:00+02:00"^^xsd:dateTime ;
    dct:title "Keynote" ;
    marcrel:pre <http://purl.org/net/wiss2014/presenters/#stefan> .
It’s no problem for different resources to have different numbers of properties.
Compare sparsely populated CSV:
#,Time,Event,Type,Presenter,Location
type,dct:date,dct:title,rdf:type,http://id.loc.gov/vocabulary/relators/pre,schema:location
,2014-08-25T09:00:00+02:00,Introduction,,,
On the other hand, the CSV data model has a built-in order, which RDF does not have. Order can be expressed in RDF, but doing so leads to a high complexity. In the specification on Generating RDF from Tabular Data on the Web, compare the “minimal” RDF representation of a CSV table to the “standard” representation that preserves information about the tabular structure.
For one subject and predicate, there can be multiple objects. In the CSV we had to cheat:
,2014-08-26T18:00:00+02:00,Hackathon dinner,http://purl.org/net/wiss2014/vocab/#Dinner;http://purl.org/net/wiss2014/vocab/#Hackathon,,Maison des Élèves
,http://purl.org/net/wiss2014/presenters/#stefan,Stefan Decker,INSIGHT;National University of Ireland,Galway,Ireland
In RDF, that’s no problem:
<#day2hackathondinner>
    rdf:type wv:Dinner, wv:Hackathon ;
    dct:date "2014-08-26T18:00:00+02:00"^^xsd:dateTime ;
    dct:title "Hackathon dinner" ;
    schema:location "Maison des Élèves" .

<http://purl.org/net/wiss2014/presenters/#stefan>
    foaf:name "Stefan Decker" ;
    schema:affiliation "INSIGHT", "National University of Ireland" ;
    wv:town "Galway" ;
    wv:country "Ireland" .
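The step from one CSV record to a set of triples can be sketched as follows. The semicolon convention is the "cheat" from the CSV above; the function name `row_to_triples` is invented, triples are modelled as plain Python tuples, and prefixed names are kept as strings rather than expanded.

```python
def row_to_triples(subject, properties, values, separator=";"):
    """Turn one record into (subject, predicate, object) triples."""
    triples = []
    for predicate, cell in zip(properties, values):
        for value in cell.split(separator):
            if value:  # an empty cell yields no triple at all
                triples.append((subject, predicate, value))
    return triples

triples = row_to_triples(
    "http://purl.org/net/wiss2014/presenters/#stefan",
    ["foaf:name", "schema:affiliation", "wv:town", "wv:country"],
    ["Stefan Decker", "INSIGHT;National University of Ireland", "Galway", "Ireland"],
)
for triple in triples:
    print(triple)
```

Four cells become five triples, and an empty cell simply produces nothing, which is why sparse rows and multi-valued properties are unproblematic in RDF. The sketch would wrongly split a literal that itself contains a semicolon; that weakness belongs to the CSV workaround, not to RDF.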
Vocabulary definitions are no problem in RDF either, as RDF Schema itself has an RDF-based syntax:
wv:Hackathon
    rdfs:label "hackathon" ;
    rdfs:comment "an event of intensive collaboration on a software project" ;
    rdfs:seeAlso <http://dbpedia.org/resource/Hackathon> .
Here, we introduced a custom prefix to abbreviate the URI of our vocabulary. Here’s how prefixes are declared:
@prefix dct: <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix marcrel: <http://id.loc.gov/vocabulary/relators/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix schema: <http://schema.org/> .
@prefix wv: <http://purl.org/net/wiss2014/vocab/#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
This is just syntactic sugar, not part of the RDF data model.
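Expanding a prefixed name is plain string concatenation, as this small sketch shows. The mapping repeats some of the declarations above; the function name `expand` is made up for the illustration.

```python
PREFIXES = {
    "dct": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "wv": "http://purl.org/net/wiss2014/vocab/#",
}

def expand(name):
    """Replace a declared prefix by its namespace URI; leave other names alone."""
    prefix, colon, local = name.partition(":")
    if colon and prefix in PREFIXES:
        return PREFIXES[prefix] + local
    return name  # e.g. a full URI, whose scheme is not a declared prefix

print(expand("rdfs:label"))    # http://www.w3.org/2000/01/rdf-schema#label
print(expand("wv:Hackathon"))  # http://purl.org/net/wiss2014/vocab/#Hackathon
```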
Note that the rdfs:seeAlso link points to DBpedia. DBpedia is a linked dataset extracted from Wikipedia.
Linked data clients usually expect data to be published as RDF, and RDF/XML is the most widely supported serialization of RDF. Therefore, we have also published our data as RDF/XML:
wget --quiet -O - --header 'Accept: application/rdf+xml' 'http://purl.org/net/wiss2014/presenters/#markus'
<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="http://purl.org/net/wiss2014/presenters/#stefan">
    <ns1:country xmlns:ns1="http://purl.org/net/wiss2014/vocab/#">Ireland</ns1:country>
    <ns2:town xmlns:ns2="http://purl.org/net/wiss2014/vocab/#">Galway</ns2:town>
    <ns3:affiliation xmlns:ns3="http://schema.org/">INSIGHT</ns3:affiliation>
    <ns4:affiliation xmlns:ns4="http://schema.org/">National University of Ireland</ns4:affiliation>
    <ns5:name xmlns:ns5="http://xmlns.com/foaf/0.1/">Stefan Decker</ns5:name>
  </rdf:Description>
</rdf:RDF>
A few notes:
- This RDF/XML was auto-generated from the Turtle source and therefore looks a bit unfriendly.
- Additionally, it is good practice to also publish a human-comprehensible version of your data in HTML. Here, we did not do this.
- We configured RDF/XML to be the content served by default. Therefore, it is also served when no specific content type is requested via the Accept HTTP request header.
This is the .htaccess configuration file that implements this behaviour in the Apache web server:
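# Register the media types of the two serialisations
AddType application/rdf+xml .rdf
AddType text/csv .csv

RewriteEngine On
RewriteBase /~langec/wiss2014/

# Redirect to CSV if the client accepts text/csv,
# unless application/rdf+xml is listed before it
RewriteCond %{HTTP_ACCEPT} !application/rdf\+xml.*(text/csv)
RewriteCond %{HTTP_ACCEPT} text/csv
RewriteRule ^(presenters|schedule|vocab)/$ $1/index.csv [R=302]

# Otherwise redirect to RDF/XML, the default
RewriteRule ^(presenters|schedule|vocab)/$ $1/index.rdf [R=302]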
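The decision these rules encode can be restated in Python to make it easier to test. This is a reading of the two RewriteCond lines, not code that runs on the server; the function name `negotiated_file` is invented. Like the rules themselves, it matches on the raw header text and ignores quality values such as q=0.8.

```python
import re

def negotiated_file(accept_header):
    """Return the file the rewrite rules would redirect to for a given Accept header."""
    accepts_csv = re.search(r"text/csv", accept_header) is not None
    rdf_listed_before_csv = (
        re.search(r"application/rdf\+xml.*(text/csv)", accept_header) is not None
    )
    if accepts_csv and not rdf_listed_before_csv:
        return "index.csv"
    return "index.rdf"  # default, also when no Accept header was sent

for header in ("text/csv", "application/rdf+xml, text/csv", "text/html", ""):
    print(repr(header), "->", negotiated_file(header))
```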
Additional stars have been suggested for publishing data …
- … that uses standard schemas – we’ve done this already.
- … whose quality has been checked – our group does research on this.
Also recall that our original use case started from an HTML homepage. With the following standards it’s possible to embed linked data into HTML:
- Microformats (very basic)
- Microdata (more powerful; emphasizes syntactic conciseness)
- RDFa (widest support of the RDF data model) – try it with http://rdfa.info/play/!
The idea for this tutorial was inspired by Antoine Zimmermann. The motivation was to prepare something for the 2014 Web Intelligence Summer School “Web of Data” that’s not too heavily biased towards RDF.
This summer school was funded by
This document was created with Emacs Org mode.
The icon for highlighting external links is reused from MediaWiki, developed by the Wikimedia Foundation, whose source code is licensed under GPL version 2 and whose homepage content is licensed under the Creative Commons Attribution-ShareAlike 3.0 Unported License.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Here is how you can cite this using BibTeX (BibLaTeX recommended):
@misc{Lange:5StarData2015,
title = {Publishing 5-star Open Data},
author = {Christoph Lange},
note = {Tutorial at the Web Intelligence Summer School ``Answering Questions with the Web''},
year = {2015},
date = {2015-09-01},
venue = {Saint-{\'E}tienne, France},
url = {http://clange.github.io/5stardata-tutorial/},
}