Home
Built on the W3C
See the live demos at semtk.research.ge.com and check out the "Hello World" demo.
The Semantics Toolkit (SemTk) is a research project of the Knowledge Discovery Lab at GE Global Research in Niskayuna, NY.
It is composed of two main parts:
- SPARQLgraph / SPARQLform - a graphical web tool for ingesting data and auto-generating SPARQL queries. This includes JavaScript libraries that function as a work-alike for the Java ones.
- SemTk Java library - the Java code used to build the back-end services. It has its own API for other uses.
SemTk is licensed under Apache 2.
Please include our logo, found in documentationFiles, whenever possible.
SPARQLgraph is a drag-and-drop web application that supports:
- SPARQL query generation (SELECT DISTINCT)
- Execution of the queries (supports Virtuoso; other SPARQL 1.1 endpoints should be easy to add)
- Ingestion of .csv files
The tool is designed for triple-stores with an ontology-based model. We use SADL.
It uses a modified A* path-finding algorithm to make query-generation simple.
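To give a feel for why path-finding helps query generation, here is a minimal sketch that finds the shortest chain of properties linking two classes. It is not SemTk's modified algorithm: it is plain A* with a zero heuristic (which reduces to uniform-cost search), and the class names, property names, and the `find_path` helper are all hypothetical.

```python
import heapq
from typing import Dict, List, Optional, Tuple

# Hypothetical ontology graph: class -> list of (property, range class).
Edges = Dict[str, List[Tuple[str, str]]]
Step = Tuple[str, str, str]  # (domain class, property, range class)


def find_path(edges: Edges, start: str, goal: str) -> Optional[List[Step]]:
    """Return the shortest property chain from start to goal, or None.

    A* with heuristic h = 0 and unit edge cost; edges are followed
    in the domain -> range direction only.
    """
    counter = 0  # unique tie-breaker so the heap never compares paths
    frontier: List[Tuple[int, int, str, List[Step]]] = [(0, counter, start, [])]
    visited = set()
    while frontier:
        cost, _, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        if node in visited:
            continue
        visited.add(node)
        for prop, target in edges.get(node, []):
            if target not in visited:
                counter += 1
                step = (node, prop, target)
                heapq.heappush(frontier, (cost + 1, counter, target, path + [step]))
    return None


# Made-up example ontology.
edges: Edges = {
    "Battery": [("hasCell", "Cell")],
    "Cell": [("hasColor", "Color")],
}
path = find_path(edges, "Battery", "Color")
```

Each step in the returned path corresponds to one triple pattern that a generated query would need in order to join the two classes.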
The front end is JavaScript, while the ingestion and querying are performed via Java services.
SemTk was designed as a SPARQL generator, at first mainly for SELECT queries. It then evolved to include important features for ingesting data with auto-generated SPARQL INSERT queries. A cloud infrastructure was added to support the storage of nodegroups (subgraphs of interest along with their connection information) and stored-procedure-like capabilities that support application development.
Part of each session, and stored with each nodegroup, is a **connection**, as shown below.
Note that this connection lists a **server** and dataset for a model graph. The dataset is essentially a named graph within the given server. Each connection may have multiple model connections, where the ontology is stored, and multiple data connections.
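The shape of a connection described above can be sketched as a small data structure. This is only an illustration under assumptions: the class names, field names, and URLs below are hypothetical and do not reflect SemTk's actual connection format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class GraphEndpoint:
    """One model or data connection: a server plus a dataset (named graph)."""
    server_url: str  # hypothetical field name
    dataset: str     # named graph within that server


@dataclass
class Connection:
    """A named connection with any number of model and data endpoints."""
    name: str
    model: List[GraphEndpoint] = field(default_factory=list)
    data: List[GraphEndpoint] = field(default_factory=list)

    def graphs(self) -> List[str]:
        """Distinct named graphs, model graphs first, in insertion order."""
        seen: List[str] = []
        for endpoint in self.model + self.data:
            if endpoint.dataset not in seen:
                seen.append(endpoint.dataset)
        return seen


# Made-up example values.
conn = Connection(
    name="demo",
    model=[GraphEndpoint("http://localhost:8890/sparql", "http://example.org/model")],
    data=[GraphEndpoint("http://localhost:8890/sparql", "http://example.org/data")],
)
```

Keeping model and data endpoints in separate lists mirrors the distinction in the text between where the ontology is stored and where the instance data lives.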
After a connection is loaded, the main SPARQLgraph screen might look like this.
The top left represents a cached version of the ontology, with only the following relationships captured. This subset of OWL is most useful for generating SPARQL queries.
- Classes
- Sub-class relationships
- Properties
- Domain / Range of properties - note that complex ranges are not yet supported
- Enums (SADL "must be one of")
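The cached subset listed above could be held in a structure along these lines. This is a rough sketch, not SemTk's implementation: every name is hypothetical, and it assumes single inheritance and one simple range per property to keep the example short.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class OntologyProperty:
    name: str
    domain: str       # class the property belongs to
    range_class: str  # one simple range; complex ranges are not modeled


@dataclass
class OntologyCache:
    parents: Dict[str, Optional[str]] = field(default_factory=dict)  # class -> superclass
    properties: List[OntologyProperty] = field(default_factory=list)
    enums: Dict[str, Set[str]] = field(default_factory=dict)  # class -> allowed values

    def add_class(self, name: str, parent: Optional[str] = None) -> None:
        self.parents[name] = parent

    def superclasses(self, name: str) -> List[str]:
        """Walk the sub-class chain upward, nearest ancestor first."""
        chain: List[str] = []
        current = self.parents.get(name)
        while current is not None and current not in chain:
            chain.append(current)
            current = self.parents.get(current)
        return chain

    def properties_of(self, name: str) -> List[str]:
        """Names of properties declared on the class or inherited from ancestors."""
        owners = [name] + self.superclasses(name)
        return [p.name for p in self.properties if p.domain in owners]


# Made-up example ontology.
cache = OntologyCache()
cache.add_class("Animal")
cache.add_class("Dog", parent="Animal")
cache.properties.append(OntologyProperty("name", "Animal", "string"))
cache.properties.append(OntologyProperty("breed", "Dog", "Breed"))
cache.enums["Breed"] = {"Beagle", "Poodle"}
```

With this example, `cache.properties_of("Dog")` includes the inherited `name` property as well as `breed`, which is the kind of lookup a query builder needs when offering properties for a dropped class.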
In this section, classes can be dragged and dropped, and properties chosen for returning, deleting, constraining, etc. Queries are then generated from these nodegroups.
The ontology info is critical in building a query that matches the model.
The connection is used to include the proper FROM or USING clauses in the SPARQL query, so that the query runs against the entire ontology and collection of instance data.
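As an illustration of how a connection's graphs might end up in a query, the sketch below assembles a SELECT DISTINCT query with one FROM clause per named graph. The function name, graph URIs, and triple patterns are hypothetical, and the actual SemTk generator may structure its output differently. USING clauses, which play the same role in SPARQL update requests, are not shown.

```python
from typing import Iterable, List


def build_select_distinct(
    variables: List[str],
    triple_patterns: List[str],
    graphs: Iterable[str],
) -> str:
    """Assemble a SELECT DISTINCT query with one FROM clause per graph."""
    lines = ["SELECT DISTINCT " + " ".join(f"?{v}" for v in variables)]
    # Dataset clauses sit between the SELECT clause and WHERE.
    lines += [f"FROM <{g}>" for g in graphs]
    lines.append("WHERE {")
    lines += [f"  {pattern} ." for pattern in triple_patterns]
    lines.append("}")
    return "\n".join(lines)


# Made-up graphs and patterns.
query = build_select_distinct(
    ["cell", "color"],
    [
        "?cell a <http://example.org/model#Cell>",
        "?cell <http://example.org/model#color> ?color",
    ],
    ["http://example.org/model", "http://example.org/data"],
)
print(query)
```

Listing both the model graph and the data graph means sub-class information and instance triples are visible to the same query.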
This is the query generated by SemTk. Besides SELECT, it may be an INSERT, COUNT, or DELETE query.
Results of the most recent query