A void-generator (For VoID) is to be figured out before a human-readable schema by ShEx is conducted. #31

Closed
candlecao opened this issue Dec 10, 2024 · 4 comments
Labels: priority: high

Comments

candlecao commented Dec 10, 2024

This follows #27 and the sparql-llm package (https://pypi.org/project/sparql-llm/), whose documentation has a "SPARQL endpoint schema loader" section with this code:

from sparql_llm import SparqlVoidShapesLoader

loader = SparqlVoidShapesLoader("https://sparql.uniprot.org/sparql/")
docs = loader.load()
print(len(docs))
print(docs[0].metadata)

I tried it, but it failed with this message: Could not retrieve VoID description for endpoint https://sparql.uniprot.org/sparql/: No VoID description found in the endpoint

The same page (https://pypi.org/project/sparql-llm/) also gives this tip: "Checkout the void-generator project to automatically generate VoID description for your endpoint."
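
Before running the loader, it may help to check whether the endpoint exposes any VoID description at all. The following is a minimal sketch using only the Python standard library. The ASK query is my own assumption about what "a VoID description is present" means (any resource typed void:Dataset, in the default graph or a named graph); it is not necessarily the check that sparql-llm performs, and the function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumption: "has a VoID description" means at least one void:Dataset is visible.
VOID_ASK_QUERY = """
PREFIX void: <http://rdfs.org/ns/void#>
ASK {
  { ?dataset a void:Dataset }
  UNION
  { GRAPH ?g { ?dataset a void:Dataset } }
}
"""


def build_request(endpoint: str, query: str = VOID_ASK_QUERY) -> urllib.request.Request:
    """Build a SPARQL-protocol GET request that asks for a JSON result."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def parse_ask_result(body: str) -> bool:
    """Read the boolean from a SPARQL 1.1 JSON response to an ASK query."""
    return bool(json.loads(body)["boolean"])


def endpoint_has_void(endpoint: str, timeout: float = 30.0) -> bool:
    """Send the ASK query to the endpoint (requires network access)."""
    with urllib.request.urlopen(build_request(endpoint), timeout=timeout) as response:
        return parse_ask_result(response.read().decode("utf-8"))
```

Calling `endpoint_has_void("https://sparql.uniprot.org/sparql/")` sends the query over the network. A `False` result would suggest that a VoID description still has to be generated and loaded into the endpoint.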

candlecao added the priority: high label Dec 10, 2024
candlecao self-assigned this Dec 10, 2024
candlecao (Contributor, Author) commented

On the use of VoID, see this excerpt from the paper "LLM-based SPARQL Query Generation from Natural Language over Federated Knowledge Graphs":

The VoID generator is available open source and can be used to generate statistics over any SPARQL endpoint. This information allows us to generate simple, human-readable Shape Expressions (ShEx) for each class.
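
To make the step from per-class statistics to a ShEx shape concrete, here is a rough sketch. It is not the implementation used by the paper or by sparql-llm. It assumes the predicates observed for a class and the kind of value each one points to are already known (in VoID this information is typically carried by class and property partitions), and it renders them as a permissive ShEx shape. The function name, the input dictionary, and the `ex:` names are hypothetical.

```python
def shape_for_class(class_iri: str, predicates: dict[str, str]) -> str:
    """Render a simple ShEx shape for one class.

    `predicates` maps each predicate to a ShEx node constraint such as
    "xsd:string" or "IRI". Every predicate gets the cardinality `*`
    (zero or more), which keeps the shape permissive.
    """
    lines = [f"  a [ {class_iri} ]"]
    lines += [
        f"  {predicate} {constraint} *"
        for predicate, constraint in predicates.items()
    ]
    return f"{class_iri} {{\n" + " ;\n".join(lines) + "\n}"


# Hypothetical per-class information, as it might be derived from VoID statistics.
print(shape_for_class("ex:Apple", {"ex:color": "xsd:string", "ex:grows_on": "IRI"}))
```

The point of the sketch is only that the shape lists what a class instance can be expected to have, which is the kind of human-readable schema the excerpt describes.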

candlecao changed the title from "A void-generator is to be figured out before a human-readable schema by ShEx is conducted." to "A void-generator (For VoID) is to be figured out before a human-readable schema by ShEx is conducted." Dec 11, 2024
candlecao (Contributor, Author) commented

First, I read up on VoID: https://www.w3.org/TR/void/.
To illustrate it, consider an example SPARQL endpoint that contains two graphs:

  1. Fruits Graph <https://example.org/void-generator/test/fruits>
prefix ex:<https://example.org/void-generator/test/>


ex:red_apple a ex:Fruit, ex:Apple ;
    ex:color "red" ;
    ex:grows_on ex:apple_tree .

ex:green_apple a ex:Fruit, ex:Apple ;
    ex:color "green" ;
    ex:grows_on ex:apple_tree .

ex:rotton_apple a ex:Fruit, ex:Apple ;
    ex:color "brownish" .

ex:green_pear a ex:Fruit, ex:Pear ;
    ex:color "green" ;
    ex:growns_on ex:pear_tree .
  2. Fruit Trees Graph <https://example.org/void-generator/test/trees>
prefix ex:<https://example.org/void-generator/test/>


ex:apple_tree a ex:FruitTree .

ex:pear_tree a ex:FruitTree .

ex:lemon_tree a ex:FruitTree .

The corresponding VoID description is:

@prefix void: <http://rdfs.org/ns/void#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix ex: <https://example.org/void-generator/test/> .

# Fruits Graph
<https://example.org/void-generator/test/fruits>
    a void:Dataset ;
    dcterms:title "Fruits Graph" ;
    dcterms:description "This graph contains information about different types of fruits and their properties." ;
    void:uriSpace "https://example.org/void-generator/test/" ;
    void:exampleResource ex:red_apple ;
    void:triples 15 ;
    void:classes 3 ;  # Fruit, Apple, Pear
    void:properties 4 .  # rdf:type, color, grows_on, growns_on

# Fruit Trees Graph
<https://example.org/void-generator/test/trees>
    a void:Dataset ;
    dcterms:title "Fruit Trees Graph" ;
    dcterms:description "This graph contains information about different types of fruit trees." ;
    void:uriSpace "https://example.org/void-generator/test/" ;
    void:exampleResource ex:apple_tree ;
    void:triples 3 ;
    void:classes 1 ;  # FruitTree
    void:properties 1 .  # rdf:type is the only predicate in this graph

# Linkset: Relationship between Fruits and Fruit Trees
<https://example.org/void-generator/test/fruits-trees-linkset>
    a void:Linkset ;
    dcterms:title "Linkset between Fruits and Fruit Trees" ;
    dcterms:description "This linkset describes the relationship between fruits and the trees they grow on." ;
    void:subjectsTarget <https://example.org/void-generator/test/fruits> ;
    void:objectsTarget <https://example.org/void-generator/test/trees> ;
    void:linkPredicate ex:grows_on ;
    void:triples 2 .  # red_apple -> apple_tree, green_apple -> apple_tree
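
The counts in a VoID description like the one above can be checked by hand. Below is a small sketch in plain Python (no RDF library) that recomputes them from the two example graphs. It follows the VoID definitions: void:classes is the number of distinct objects of rdf:type triples, and void:properties is the number of distinct predicates, which includes rdf:type itself. The tuple representation and the function names are my own, not part of the void-generator.

```python
RDF_TYPE = "rdf:type"

# The two example graphs, written as (subject, predicate, object) tuples.
FRUITS = [
    ("ex:red_apple", RDF_TYPE, "ex:Fruit"),
    ("ex:red_apple", RDF_TYPE, "ex:Apple"),
    ("ex:red_apple", "ex:color", '"red"'),
    ("ex:red_apple", "ex:grows_on", "ex:apple_tree"),
    ("ex:green_apple", RDF_TYPE, "ex:Fruit"),
    ("ex:green_apple", RDF_TYPE, "ex:Apple"),
    ("ex:green_apple", "ex:color", '"green"'),
    ("ex:green_apple", "ex:grows_on", "ex:apple_tree"),
    ("ex:rotton_apple", RDF_TYPE, "ex:Fruit"),
    ("ex:rotton_apple", RDF_TYPE, "ex:Apple"),
    ("ex:rotton_apple", "ex:color", '"brownish"'),
    ("ex:green_pear", RDF_TYPE, "ex:Fruit"),
    ("ex:green_pear", RDF_TYPE, "ex:Pear"),
    ("ex:green_pear", "ex:color", '"green"'),
    ("ex:green_pear", "ex:growns_on", "ex:pear_tree"),
]
TREES = [
    ("ex:apple_tree", RDF_TYPE, "ex:FruitTree"),
    ("ex:pear_tree", RDF_TYPE, "ex:FruitTree"),
    ("ex:lemon_tree", RDF_TYPE, "ex:FruitTree"),
]


def void_statistics(triples):
    """Compute void:triples, void:classes and void:properties for one graph."""
    unique = set(triples)
    return {
        "triples": len(unique),
        "classes": len({o for _, p, o in unique if p == RDF_TYPE}),
        "properties": len({p for _, p, _ in unique}),
    }


def linkset_size(source, target, predicate):
    """Count source triples with `predicate` whose object is a subject in target."""
    target_subjects = {s for s, _, _ in target}
    return sum(1 for _, p, o in set(source) if p == predicate and o in target_subjects)


print(void_statistics(FRUITS))
print(void_statistics(TREES))
print(linkset_size(FRUITS, TREES, "ex:grows_on"))
```

Because ex:green_pear uses the predicate ex:growns_on rather than ex:grows_on, it does not contribute to the linkset, which is why the linkset holds only the two apple triples.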

candlecao (Contributor, Author) commented

This would be useful for federated queries.

candlecao (Contributor, Author) commented

The README.md (https://github.com/DDMAL/void-generator/blob/main/README.md) clearly demonstrates how to use the void-generator.
Note: after generating a VoID file for a graph in a SPARQL endpoint, make sure the file is uploaded to that same endpoint. Only once the VoID file is in the endpoint can the schema (shapes) for the graph be extracted (see https://pypi.org/project/sparql-llm/).
