Queries Advanced Topics

SubClasses and SubProperties

When a class is added to the query canvas, any queries generated will search for that class or any of its sub-classes.

Likewise, when a property is used (an edge Object Property or a Data Property), any queries generated will search for that property or any sub-properties. For a sub-property to be considered, it must also have a Domain of a super-class* or sub-class* of the subject class.

To query specific sets of sub-properties or sub-classes, use the union query.
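As a rough illustration of what "that class or any of its sub-classes" means, the sketch below computes the set of classes a canvas node would match by following subclass links transitively, which is the effect of rdfs:subClassOf* in the generated SPARQL. The class names and the SUBCLASS_OF list are hypothetical and not part of any SemTK ontology.

```python
from collections import defaultdict

# Hypothetical (subclass, superclass) pairs, invented for this sketch.
SUBCLASS_OF = [
    ("LithiumBattery", "Battery"),
    ("CoinCell", "LithiumBattery"),
    ("Cell", "Thing"),
]

def subclasses_star(root, edges):
    """Return root plus every direct or indirect subclass of root."""
    children = defaultdict(set)
    for sub, sup in edges:
        children[sup].add(sub)
    found, stack = {root}, [root]
    while stack:
        for child in children[stack.pop()]:
            if child not in found:
                found.add(child)
                stack.append(child)
    return found

matched = subclasses_star("Battery", SUBCLASS_OF)
print(sorted(matched))
```

A Battery node on the canvas would therefore also match instances typed as LithiumBattery or CoinCell in this made-up hierarchy, but not Cell.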

Unions

SPARQLgraph can be used to generate union queries. In general, a union query is an "OR" where expressions share a set of sparqlIds. Creation of union queries in SPARQLgraph is based on the following:

create branch points for the union in a property or class; these will be marked with a colored Union symbol U
subgraphs beyond the branch point will be shown in the matching color
inside unions, sparqlId restrictions are loosened so that items from each branch can refer to the same return value
unions may be nested

The creation of union queries will be demonstrated with a brief tutorial using an ontology with a simple battery containing multiple colored cells. Consider these five batteries, each with up to four cells:

description	batt_ID	cell1_date	cell1_ID	cell1_color	cell2_ID	cell2_color	cell3_ID	cell3_color	cell4_ID	cell4_color
normal battery	battAA	2017-03-23T10:23:00	A	red	B	blue	C	white	D	white
normal battery	battAB	2017-03-23T10:24:00	E	red	F	blue	G	white	H	white
no date	battAC		I	red	J	blue	K	white	L	white
no colors on cells	battX	2017-03-23T10:26:00	M		N		O		P
two cells	battY	2017-03-23T10:27:00	Q	blue	R	blue

Union on object properties

Consider the following query:

Find all cells with color "red" OR with no color at all.
Return the cellId, along with the battery id and name.

Such a query looks like this in SPARQLgraph:

and is built with the following steps:

1. Create a Battery, and set ?id and ?name to be returned.
2. Add a Cell, and set ?cellId to be returned.
3. Add a color, and constrain it to "red". Using the "suggest values" button is helpful here. Also, remember to uncheck the "return" box so the color is not returned.
4. Create a union by selecting the "cell" arc and choosing "new union" from the "opt/minus/union" menu. At this point your subgraph will be rendered with a unique color, and the arc will be marked with a U.
5. Add another Cell to the Battery.
6. Add to the union by selecting the new "cell" arc and choosing the "cell" union from the "opt/minus/union" menu. Now that this subgraph is added to the union, go back to the new Cell and return ?cellId, making sure to use the same "cellId" sparqlId as the Cell in step 2.
7. Add a color to the new Cell, then select the new color arc and choose "minus" from the "opt/minus/union" menu.

You now have a union with two subgraphs. The top subgraph matches all cells with color red. The bottom subgraph matches all cells with no color. The "?cellId" sparqlId is shared between the branches. To make it easy to inspect results, order by "cellId".

The following SPARQL is generated:

prefix ...
select distinct ?id ?name ?cellId
		FROM <http://your/graph>
 where {
	?Battery a ?Battery_type .
	?Battery_type  rdfs:subClassOf* batterydemo:Battery.
	?Battery batterydemo:id ?id .
	?Battery batterydemo:name ?name .
	{
		?Battery batterydemo:cell ?Cell_1 .
			BIND(?Cell_1 as ?Cell) .
			?Cell_1 batterydemo:cellId ?cellId1 .
			BIND(?cellId1 as ?cellId) .
			?Cell_1 batterydemo:color ?Color_1 .
				FILTER ( ?Color_1 IN (<http://kdl.ge.com/batterydemo#red> ) ) . 
	}
	 UNION 
	{
		?Battery batterydemo:cell ?Cell .
			?Cell batterydemo:cellId ?cellId .
			minus {
				?Cell batterydemo:color ?Color .
			}
	}
}
ORDER BY ?cellId

Note that under the hood, each item in the graph has a unique identifier. BIND statements are used to match ?cellId between the two subgraphs in the UNION.

This query returns all the red and colorless cells:

id	name	cellId
battAA	normal battery	A
battAB	normal battery	E
battAC	no date	I
battX	no colors on cells	M
battX	no colors on cells	N
battX	no colors on cells	O
battX	no colors on cells	P
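As a cross-check on these results, the same red-OR-colorless logic can be sketched in plain Python over the battery table. This only illustrates the UNION and MINUS semantics and is not how SPARQLgraph or a triplestore evaluates the query; the BATTERIES structure and the function names are made up for this sketch.

```python
# (batt_ID, description, [(cell_ID, color or None), ...]) from the table above
BATTERIES = [
    ("battAA", "normal battery", [("A", "red"), ("B", "blue"), ("C", "white"), ("D", "white")]),
    ("battAB", "normal battery", [("E", "red"), ("F", "blue"), ("G", "white"), ("H", "white")]),
    ("battAC", "no date", [("I", "red"), ("J", "blue"), ("K", "white"), ("L", "white")]),
    ("battX", "no colors on cells", [("M", None), ("N", None), ("O", None), ("P", None)]),
    ("battY", "two cells", [("Q", "blue"), ("R", "blue")]),
]

def red_cells():
    """First UNION branch: cells whose color is red."""
    return {(batt, name, cell_id)
            for batt, name, cells in BATTERIES
            for cell_id, color in cells if color == "red"}

def colorless_cells():
    """Second UNION branch: all cells MINUS those that have any color."""
    return {(batt, name, cell_id)
            for batt, name, cells in BATTERIES
            for cell_id, color in cells if color is None}

# UNION of both branches, ordered by cellId like the generated query
rows = sorted(red_cells() | colorless_cells(), key=lambda row: row[2])
for row in rows:
    print(*row, sep="\t")
```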

Union on two data properties

For the sake of illustration, consider this query:

Find all batteries with the letter 'y' in the id or in the name.
Return the batteries' ids and names.

Such a query looks like this in SPARQLgraph:

and is built with the following steps:

Add the Battery node to the nodegroup
Select id:
- choose 'new union' from the menu
- apply the filter FILTER regex(?id, "[Yy]")
Select name:
- choose 'id' union from the menu
- apply the filter FILTER regex(?name, "[Yy]")

You now have a union query that will return names and ids of all batteries that have the letter 'y' in the name or id.

The query will look like this:

prefix ...
select distinct ?id ?name
		FROM <http://your/graph>
 where {
	?Battery a ?Battery_type .
	?Battery_type  rdfs:subClassOf* batterydemo:Battery.
	{
		?Battery batterydemo:id ?id .
			FILTER regex(?id, "[Yy]")   .
	}
	 UNION 
	{
		?Battery batterydemo:name ?name .
			FILTER regex(?name, "[Yy]") .
	}
}

and, given the data shown in the table above, will return the results:

id	name
	normal battery
battY
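The blank cells in these results follow from each UNION branch binding only its own variable. A minimal Python sketch of that behavior, using the ids and names from the battery table (the BATTERIES list and variable names are assumptions made for this illustration):

```python
import re

# (id, name) pairs from the battery table above
BATTERIES = [
    ("battAA", "normal battery"),
    ("battAB", "normal battery"),
    ("battAC", "no date"),
    ("battX", "no colors on cells"),
    ("battY", "two cells"),
]

pattern = re.compile("[Yy]")

# Each branch binds only its own variable; the other stays unbound (None).
id_branch = {(batt_id, None) for batt_id, _ in BATTERIES if pattern.search(batt_id)}
name_branch = {(None, name) for _, name in BATTERIES if pattern.search(name)}

# UNION of the branches; using sets mimics SELECT DISTINCT.
results = id_branch | name_branch
print(results)
```

The two "normal battery" rows collapse into one because only ?name is bound in that branch, so the rows are identical under DISTINCT.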

Union on two separate subgraphs

Now consider this query:

Find the id that belongs to any battery OR any blue cell

This query is the union of two disconnected subgraphs. It will look like this:

and is built with the following steps:

Add the Battery node to the nodegroup
- return the ?id
- open the class URI, select "new union", and de-select "return"
Drag a Cell node onto the canvas so that it is disconnected from the Battery
- return the cellId as ?id
- open the class URI, select the "?Battery" union, and de-select "return"
Add a Color to the Cell and constrain it to "blue", de-selecting "return"

This results in a query that is the union of the two subgraphs, each of which returns something for ?id.

The query looks like this:

prefix ...
select distinct ?id
		FROM <http://your/graph>
 where {
	{
		?Cell a batterydemo:Cell .
		?Cell batterydemo:cellId ?id_0 .
		BIND(?id_0 as ?id) .
		?Cell batterydemo:color ?Color .
			FILTER ( ?Color IN (<http://kdl.ge.com/batterydemo#blue> ) ) . 
	}
	 UNION 
	{
		?Battery a ?Battery_type .
		?Battery_type  rdfs:subClassOf* batterydemo:Battery.
		?Battery batterydemo:id ?id .
	}
}

and it returns the id of every battery and every blue cell:

id
F
B
J
R
Q
battAB
battAA
battX
battAC
battY

Combining UNION with MINUS

Consider the query "Cat named fluffy OR Cat does not have a kitty".

It may be tempting to create a Cat and do a UNION on FILTER regex(?name, "fluffy") and MINUS hasKitty. That is, a union of a data property and MINUS an object property.

SemTK would create SPARQL like this:

?Cat a namespace:Cat .
{
   ?Cat namespace:name ?name.
   FILTER regex (?name, "fluffy").
} UNION {
   MINUS { ?Cat namespace:hasKitty ?Kitty  }
}

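Under the W3C SPARQL recommendation, the group containing only the MINUS clause has nothing on its left-hand side, so the MINUS shares no variables with it and removes nothing. That branch always succeeds, and the query returns all cats.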
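The behavior of a MINUS with an empty left-hand side can be sketched in a few lines of Python. The functions below follow the SPARQL 1.1 definition of MINUS (a left solution is dropped only if some right solution is compatible with it and shares at least one variable); the cat data and function names are hypothetical.

```python
def compatible(a, b):
    """Two solutions are compatible if they agree on every shared variable."""
    return all(a[v] == b[v] for v in a.keys() & b.keys())

def minus(left, right):
    """Drop a left solution only if a compatible right solution shares a variable."""
    return [
        mu for mu in left
        if not any((mu.keys() & nu.keys()) and compatible(mu, nu) for nu in right)
    ]

def join(left, right):
    """Combine every compatible pair of solutions."""
    return [{**a, **b} for a in left for b in right if compatible(a, b)]

# Hypothetical data: three cats, one of which has a kitty.
cats = [{"Cat": "tom"}, {"Cat": "fluffy"}, {"Cat": "felix"}]
has_kitty = [{"Cat": "tom", "Kitty": "kit1"}]

# Branch "{ MINUS { ?Cat hasKitty ?Kitty } }": the left side is the empty
# group, which has exactly one solution with no variables bound.
bad_branch = minus([{}], has_kitty)   # no shared variables, nothing removed
bad_result = join(cats, bad_branch)   # every cat survives

# MINUS placed in the same group as ?Cat shares a variable, so it filters.
good_result = minus(cats, has_kitty)
print(bad_result, good_result)
```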

Instead, build the union on two separate subgraphs. Once both Cat nodes are added to the union, both nodes can be named ?Cat and both name properties can be named ?name. One branch is a lone ?Cat node whose ?name carries the FILTER regex(?name, "fluffy"); the other branch is a ?Cat with MINUS applied to its hasKitties edge.

This will generate SPARQL like this:

{
    ?Cat a AnimalSubProps:Cat .
    ?Cat AnimalSubProps:name ?name .
    minus {
        ?Cat AnimalSubProps:hasKitties ?Kitty .
    }
} UNION {
    ?Cat a AnimalSubProps:Cat .
    ?Cat AnimalSubProps:name ?name .
    FILTER regex(?name, "fluffy") .
}

And this will return all cats named "fluffy" plus all cats which do not have kitties.

Construct Queries

CONSTRUCT queries return results in graph form instead of as a table, thus taking full advantage of the semantic web stack. This type of query is accessed by setting the query dropdown (highlighted below in yellow) to construct.

Rules for building CONSTRUCT queries:

any node and edge shown on the canvas are constructed
any data properties selected for return are constructed
any constraints are applied in the query WHERE clause

graphical results

Hovering the mouse over a node will show:

the URI of any class node
the type of any data

JSON-LD results

A download link, "results.json", downloads the results as a file in JSON-LD format.

Note that different triplestores have been observed to interpret the JSON-LD format differently:

a link to another object may be of the form { "@id": "ID123" } or just the string "ID123"
data properties may be typed { @value: "35", @type: integer } or may be strings "35"
types and URIs may be prefixed in full "uri://my/prefix#uri123" or abbreviated based on query prefixes "prefix:uri123"

The SPARQLgraph interface attempts to resolve these differences and show a standard network format.
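To make the three variations concrete, here is a minimal Python sketch of how a client might normalize them before display. It is not SPARQLgraph's actual code; the function names and the sample prefix map are assumptions for illustration.

```python
def normalize_link(value):
    """Return the target id whether the link is {"@id": "ID123"} or "ID123"."""
    if isinstance(value, dict) and "@id" in value:
        return value["@id"]
    return value

def normalize_literal(value):
    """Return (lexical value, type or None) for typed or plain literals."""
    if isinstance(value, dict) and "@value" in value:
        return str(value["@value"]), value.get("@type")
    return str(value), None

def expand(name, prefixes):
    """Expand "prefix:local" to a full URI when the prefix is known."""
    prefix, sep, local = name.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + local
    return name

# Hypothetical prefix map matching the example strings above
PREFIXES = {"prefix": "uri://my/prefix#"}

print(normalize_link({"@id": "ID123"}), normalize_link("ID123"))
print(normalize_literal({"@value": "35", "@type": "integer"}), normalize_literal("35"))
print(expand("prefix:uri123", PREFIXES), expand("uri://my/prefix#uri123", PREFIXES))
```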

Delete Queries

DELETE queries work differently from CONSTRUCT queries, in that:

any node and edge shown on the canvas are added to the WHERE clause
any data properties selected for return are added to the WHERE clause
items to be deleted must be explicitly specified

Specifying items to delete

Data property and object property edge dialogs have "select for delete" check boxes.

Node dialogs (accessed by clicking on the class name) contain a menu with a choice of delete modes:

NO_DELETE
TYPE_INFO_ONLY - only delete type triples with this node's matching URIs as the subject
FULL_DELETE - delete all triples with this node's matching URIs in the subject or object
LIMITED_TO_MODEL - like FULL_DELETE, but limited to relationships specified in the model
LIMITED_TO_NODEGROUP - like FULL_DELETE but limited only to relationships in the nodegroup

FULL_DELETE on nodes is by far the most common type of delete query.
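The difference between the simpler delete modes can be sketched over a small set of triples. The Python below only covers NO_DELETE, TYPE_INFO_ONLY and FULL_DELETE as described in the list above; the two LIMITED_TO modes depend on the model and nodegroup and are left out. The triples and the function name are hypothetical, and this is not SemTK's implementation.

```python
RDF_TYPE = "rdf:type"

# Hypothetical triples, with prefixes abbreviated for readability.
TRIPLES = {
    ("batt:AA", RDF_TYPE, "batterydemo:Battery"),
    ("batt:AA", "batterydemo:cell", "cell:A"),
    ("cell:A", RDF_TYPE, "batterydemo:Cell"),
    ("cell:A", "batterydemo:color", "batterydemo:red"),
}

def triples_to_delete(triples, matched_uris, mode):
    """Select the triples a node's delete mode would remove."""
    if mode == "NO_DELETE":
        return set()
    if mode == "TYPE_INFO_ONLY":
        # only type triples with a matching URI as the subject
        return {t for t in triples if t[0] in matched_uris and t[1] == RDF_TYPE}
    if mode == "FULL_DELETE":
        # every triple with a matching URI as subject or object
        return {t for t in triples if t[0] in matched_uris or t[2] in matched_uris}
    raise ValueError(f"mode not covered by this sketch: {mode}")

type_only = triples_to_delete(TRIPLES, {"cell:A"}, "TYPE_INFO_ONLY")
full = triples_to_delete(TRIPLES, {"cell:A"}, "FULL_DELETE")
print(len(type_only), len(full))
```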

Optimizations Internal

SemTK attempts to optimize queries based on performance testing of different triplestores.

VALUES clauses vs FILTER IN

These clauses are used in ingestion URILookups, which can run several queries per row of ingestion data, so the choice between them can have a very large performance impact.

FILTER IN is preferred for AWS Neptune
other triple stores are more performant with a VALUES clause
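For readers unfamiliar with the two forms, the following Python sketch builds both clauses for the same list of URIs. The helper names and the prefer_filter_in flag are assumptions for illustration; how SemTK actually detects the triplestore and formats the clause is not shown here.

```python
def values_clause(var, uris):
    """Build a VALUES clause constraining ?var to the given URIs."""
    items = " ".join(f"<{u}>" for u in uris)
    return f"VALUES ?{var} {{ {items} }}"

def filter_in_clause(var, uris):
    """Build the equivalent FILTER ... IN constraint."""
    items = ", ".join(f"<{u}>" for u in uris)
    return f"FILTER ( ?{var} IN ( {items} ) )"

def uri_constraint(var, uris, prefer_filter_in=False):
    """Pick the form; a caller would set prefer_filter_in for AWS Neptune."""
    return filter_in_clause(var, uris) if prefer_filter_in else values_clause(var, uris)

COLORS = [
    "http://kdl.ge.com/batterydemo#red",
    "http://kdl.ge.com/batterydemo#blue",
]
print(uri_constraint("Color", COLORS))
print(uri_constraint("Color", COLORS, prefer_filter_in=True))
```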

rdfs:subClassOf*

This is a very common query clause since a node in a nodegroup typically matches all subclasses.

Blazegraph performs best with rdfs:subClassOf*
other triple stores are more performant with a list of classes in a VALUES clause
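The two alternatives for the type clause can be sketched side by side. The Python helpers below are hypothetical and only show the shape of each fragment; the LithiumBattery subclass URI is invented, and the list of classes would in practice come from the ontology.

```python
def type_clause_star(node, class_uri):
    """Type clause using a transitive subclass property path."""
    return (f"?{node} a ?{node}_type . "
            f"?{node}_type rdfs:subClassOf* <{class_uri}> .")

def type_clause_values(node, class_uris):
    """Type clause using an explicit list of the class and its subclasses."""
    items = " ".join(f"<{u}>" for u in class_uris)
    return f"?{node} a ?{node}_type . VALUES ?{node}_type {{ {items} }}"

BATTERY = "http://kdl.ge.com/batterydemo#Battery"
# Hypothetical subclass, invented for this sketch
LITHIUM = "http://kdl.ge.com/batterydemo#LithiumBattery"

print(type_clause_star("Battery", BATTERY))
print(type_clause_values("Battery", [BATTERY, LITHIUM]))
```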

Virtuoso RDF1.1 compatibility

For compatibility purposes, Virtuoso VALUES clauses will contain one typed and one untyped version of each value in the VALUES clause for string and numeric constants.

Circular Graphs

In some situations it is meaningful to create queries that have multiple connections to the same node. In instances where this would create circularity in the nodegroup, SPARQLgraph is not currently able to show this graphically. A work-around is available.

Consider the case where data has been incorrectly ingested such that a dog's puppy and its parent are the same. A query to find this bad data would logically seem to be two nodes where ?Dog_Parent hasPuppy ?Dog_Child and ?Dog_Child hasPuppy ?Dog_Parent, forming a circle that will not execute properly.

Instead build a three-node query, and set ?Dog_Child equal to ?Dog_Parent behind the scenes. Starting with this nodegroup:

click on the ?Dog_Child to get a dialog, and set ?Dog_Child equal to ?Dog_Parent like this:

The resulting query will now find instances where a dog (incorrectly) has the same child and parent.
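The workaround can be pictured as a three-node chain with an equality constraint. The sketch below, in plain Python over made-up hasPuppy pairs, assumes the chain is ?Dog_Parent hasPuppy ?Dog hasPuppy ?Dog_Child; the middle node name ?Dog and the sample data are assumptions, not taken from SemTK.

```python
# Hypothetical hasPuppy pairs: (parent, puppy)
HAS_PUPPY = [
    ("rex", "spot"),
    ("spot", "rex"),    # bad data: the two dogs are each other's parent and puppy
    ("spot", "dot"),
    ("lady", "scamp"),
]

def same_parent_and_puppy(edges):
    """Match ?Dog_Parent -> ?Dog -> ?Dog_Child, keeping only the rows
    where ?Dog_Child equals ?Dog_Parent, and return the middle ?Dog."""
    return sorted({
        dog
        for parent, dog in edges
        for dog2, child in edges
        if dog == dog2 and child == parent
    })

bad_dogs = same_parent_and_puppy(HAS_PUPPY)
print(bad_dogs)
```

Because the equality is applied as a constraint on a linear chain, no circular edge is needed to express the condition.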
