Paul Cuddihy GE Research edited this page Jul 19, 2021 · 46 revisions

SemTK: Semantics Toolkit

SemTK is an open source project intended to provide easy interactions with semantic triplestores (RDF stores). It is built on W3C Semantic Web standards.

It is composed of two main parts:

  • SemTK Java API / REST Services - code and services to facilitate interacting with semantic triplestore data (e.g. querying data, ingesting data)
  • SPARQLgraph - a JavaScript-based graphical web application providing drag-and-drop access to many SemTK features

SemTK was developed by the Knowledge Discovery Lab at GE Research. Contact: Paul Cuddihy

Demos are available at semtk.research.ge.com and in the "Hello World" demo.

SemTK is licensed under Apache 2. Please include our logo whenever possible.

Key Features

Below are some of the features that SemTK provides:

  • SPARQL query generation & execution (supports Virtuoso, Fuseki, Neptune, Jena triplestores, extensible to other SPARQL 1.1 stores)
  • Ingestion of tabular data
  • Storing queries by id
  • Utility functions (e.g. loading OWL/TTL files, clearing data)
  • Instance data browsing

The tool is designed for triplestores with an ontology-based model. We use SADL for ontology authoring.

A Quick Look

This section is intended to give a quick idea of the most basic functionality of SemTK and SPARQLgraph.

Load a Connection

To establish a connection to one or more data graphs, choose connection->load from the main menu.

The user first specifies which triplestore to connect to using the dialog below. The Server URL contains the location of the triplestore. Each connection may have one or more datasets (each a named graph within the given triplestore) containing the ontology, and one or more datasets containing instance data. The OWL Imports checkbox indicates that SemTK should recursively load datasets that are referenced as imports in the ontology.
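The pieces of a connection described above can be pictured as a small data structure. The following is only an illustrative sketch, not SemTK's actual connection format: the class names, field names, and example URLs are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Dataset:
    """One named graph inside a triplestore (hypothetical field names)."""
    server_url: str  # location of the triplestore
    graph: str       # name of the graph holding the triples


@dataclass
class Connection:
    """Ontology (model) datasets plus instance (data) datasets."""
    name: str
    model: List[Dataset] = field(default_factory=list)  # ontology graphs
    data: List[Dataset] = field(default_factory=list)   # instance-data graphs
    owl_imports: bool = False  # follow ontology imports recursively on load

    def validate(self) -> None:
        # The text calls for at least one dataset of each kind.
        if not self.model:
            raise ValueError("connection needs at least one ontology dataset")
        if not self.data:
            raise ValueError("connection needs at least one instance dataset")

    def all_graphs(self) -> List[str]:
        # Every graph a query would have to cover, in order, without duplicates.
        seen: List[str] = []
        for ds in self.model + self.data:
            if ds.graph not in seen:
                seen.append(ds.graph)
        return seen


# Hypothetical server and graph names, for illustration only.
conn = Connection(
    name="demo",
    model=[Dataset("http://localhost:3030/DEMO", "http://example.org/model")],
    data=[Dataset("http://localhost:3030/DEMO", "http://example.org/data")],
    owl_imports=True,
)
conn.validate()
```

Keeping ontology and instance datasets in separate lists mirrors the dialog's distinction between the two, while `all_graphs()` gives the combined set that a query over the whole connection would need.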

The clear cache checkbox forces the ontology to be re-read from the triplestore, bypassing the cache mechanism in the service layer.
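As a rough illustration of what bypassing a cache means here, the sketch below keeps loaded ontologies in a dictionary and re-reads them when asked. It is not SemTK's service-layer code; `load_ontology`, `fake_fetch`, and the graph name are hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical in-memory cache keyed by graph name.
_ontology_cache: Dict[str, dict] = {}


def load_ontology(graph: str,
                  fetch: Callable[[str], dict],
                  clear_cache: bool = False) -> dict:
    """Return the ontology for `graph`, re-reading it if clear_cache is set."""
    if clear_cache:
        # Drop any cached copy so the next step must hit the triplestore.
        _ontology_cache.pop(graph, None)
    if graph not in _ontology_cache:
        _ontology_cache[graph] = fetch(graph)
    return _ontology_cache[graph]


# Stand-in for a real triplestore read; records each call it receives.
calls: List[str] = []


def fake_fetch(graph: str) -> dict:
    calls.append(graph)
    return {"graph": graph, "classes": []}


load_ontology("http://example.org/model", fake_fetch)                    # reads the store
load_ontology("http://example.org/model", fake_fetch)                    # served from cache
load_ontology("http://example.org/model", fake_fetch, clear_cache=True)  # forced re-read
```

Only the first and third calls reach `fake_fetch`; the second is answered from the cache.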

After a connection is loaded, the main SPARQLgraph screen might look like this.

The Ontology Info Pane (top left) shows a subset of the ontology, including classes, subclass relationships, properties, property domain/range, and enumerations. Mousing over an item will display a tooltip with the item's full URI, and any aliases or notes.

The user can drag classes into the Nodegroup Pane (top right), and then select properties to return, delete, constrain, etc. The selected classes/properties/options are referred to in SemTK as a nodegroup. Using the nodegroup, as well as the corresponding connection and ontology info, the tool will generate a query that matches the semantic model and run it against all ontology/instance datasets in the connection.
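To make the idea of generating a query from a nodegroup concrete, here is a minimal sketch under simplifying assumptions. The `Node` class and `build_select` function are hypothetical and far simpler than SemTK's real nodegroup; they only show how selected classes, returned properties, and the connection's graphs could combine into one SPARQL SELECT.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """One class in the nodegroup (hypothetical, simplified)."""
    var: str        # SPARQL variable for the instance, e.g. "?person"
    class_uri: str  # ontology class dragged into the nodegroup
    # property URI -> variable to return for that property
    returns: Dict[str, str] = field(default_factory=dict)


def build_select(nodes: List[Node], graphs: List[str]) -> str:
    """Build a SELECT covering every graph in the connection."""
    select_vars = [v for n in nodes for v in n.returns.values()]
    if not select_vars:
        raise ValueError("nodegroup returns no properties")
    lines = ["SELECT DISTINCT " + " ".join(select_vars)]
    # One FROM clause per ontology/instance graph.
    lines += [f"FROM <{g}>" for g in graphs]
    lines.append("WHERE {")
    for n in nodes:
        lines.append(f"  {n.var} a <{n.class_uri}> .")
        for prop, var in n.returns.items():
            lines.append(f"  {n.var} <{prop}> {var} .")
    lines.append("}")
    return "\n".join(lines)


# Example URIs are placeholders, not part of any real ontology.
nodes = [Node("?person", "http://example.org/Person",
              {"http://example.org/name": "?name"})]
query = build_select(nodes, ["http://example.org/model",
                             "http://example.org/data"])
print(query)
```

The sketch omits constraints, deletes, and links between nodes, all of which a real nodegroup carries.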

The generated SPARQL query (e.g. SELECT, COUNT, DELETE) will be shown in the Query Pane (middle). This SPARQL will include subclass inference (subClassOf*) for any class that has subclasses in the ontology, and subproperty inference (subPropertyOf*) for any property that has subproperties in the ontology.
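The inference described above relies on SPARQL 1.1 property paths, where a trailing `*` means "zero or more steps". The sketch below shows one way such triple patterns could be emitted only when the ontology actually has subclasses or subproperties. The function names and the dictionary-based ontology lookup are assumptions for illustration, not SemTK's implementation.

```python
from typing import Dict, List

RDFS = "http://www.w3.org/2000/01/rdf-schema#"


def class_pattern(var: str, class_uri: str,
                  subclasses: Dict[str, List[str]]) -> str:
    """Type pattern for `var`, widened to subclasses only when some exist."""
    if subclasses.get(class_uri):
        type_var = f"{var}_type"
        return (f"{var} a {type_var} . "
                f"{type_var} <{RDFS}subClassOf>* <{class_uri}> .")
    return f"{var} a <{class_uri}> ."


def property_pattern(subj: str, prop_uri: str, obj: str,
                     subprops: Dict[str, List[str]]) -> str:
    """Property pattern, widened to subproperties only when some exist."""
    if subprops.get(prop_uri):
        prop_var = f"{obj}_prop"
        return (f"{subj} {prop_var} {obj} . "
                f"{prop_var} <{RDFS}subPropertyOf>* <{prop_uri}> .")
    return f"{subj} <{prop_uri}> {obj} ."


# Hypothetical ontology facts: Animal has a subclass, Rock has none.
subclasses = {"http://example.org/Animal": ["http://example.org/Dog"]}

with_inference = class_pattern("?a", "http://example.org/Animal", subclasses)
without_inference = class_pattern("?r", "http://example.org/Rock", subclasses)
print(with_inference)
print(without_inference)
```

Because the path allows zero steps, the widened pattern still matches instances typed directly as the named class.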

After the query is executed, results are shown in the Results Pane (bottom).

Latest Additions

2021

2020

  • support for UNION queries (see wiki page)
  • visJs display of CONSTRUCT query results in SPARQLgraph
  • moving EDC (external data connections) to open source
  • moving FDC (federated data connections) and FDCCache to open source
  • improved ingestion speed using Jena in-memory cache