Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat/sparql star #160

Merged
merged 41 commits into from
Jun 12, 2023
Merged

feat/sparql star #160

merged 41 commits into from
Jun 12, 2023

Conversation

jeswr
Copy link
Contributor

@jeswr jeswr commented Nov 15, 2022

Enables SPARQL-star annotations syntax. Should theoretically work with sparqlalgebrajs off-the-bat since

SELECT * WHERE {
  ?s1 ?p1 ?o1 {| ?p2 ?o2 |}
}

gets expanded to

{
  "queryType": "SELECT",
  "variables": [
    {
      "termType": "Wildcard",
      "value": "*"
    }
  ],
  "where": [
    {
      "type": "bgp",
      "triples": [
        {
          "subject": {
            "termType": "Variable",
            "value": "s1"
          },
          "predicate": {
            "termType": "Variable",
            "value": "p1"
          },
          "object": {
            "termType": "Variable",
            "value": "o1"
          }
        },
        {
          "subject": {
            "termType": "Quad",
            "subject": {
              "termType": "Variable",
              "value": "s1"
            },
            "predicate": {
              "termType": "Variable",
              "value": "p1"
            },
            "object": {
              "termType": "Variable",
              "value": "o1"
            }
          },
          "predicate": {
            "termType": "Variable",
            "value": "p2"
          },
          "object": {
            "termType": "Variable",
            "value": "o2"
          }
        }
      ]
    }
  ],
  "type": "query",
  "prefixes": {
  }
}

This is a draft as there are a couple of edge cases to resolve (such as dangling ","s) as seen by the one currently failing test below

  1) A SPARQL parser
       in SPARQL-star mode
         should correctly parse query "sparql-star-annotated-object-list":
     Error: Parse error on line 4:
...o3 {| ?p32 ?o32 |},}
----------------------^
Expecting 'IRIREF', 'PNAME_NS', '(', 'INTEGER', 'VAR', 'NIL', '[', 'DECIMAL', 'DOUBLE', 'INTEGER_POSITIVE', 'DECIMAL_POSITIVE', 'DOUBLE_POSITIVE', 'INTEGER_NEGATIVE', 'DECIMAL_NEGATIVE', 'DOUBLE_NEGATIVE', 'BOOLEAN', 'STRING_LITERAL1', 'STRING_LITERAL2', 'STRING_LITERAL_LONG1', 'STRING_LITERAL_LONG2', 'PNAME_LN', 'BLANK_NODE_LABEL', 'ANON', '<<', got '}'
      at Parser.parseError (lib/SparqlParser.js:672:21)
      at Parser.parse (lib/SparqlParser.js:739:22)
      at Parser.parser.parse (sparql.js:36:37)
      at Context.<anonymous> (test/SparqlParser-test.js:460:31)
      at process.processImmediate (node:internal/timers:471:21)

@jeswr
Copy link
Contributor Author

jeswr commented Nov 15, 2022

A note on the tests:

I've handwritten all new tests except those in the sparqlstar-spec folder. The ones in that folder I have generated with the following script

const { ManifestLoader } = require('rdf-test-suite');
const fs = require('fs');
const path = require('path');
const { Parser } = require('./sparql');

const url = "https://w3c.github.io/rdf-star/tests/sparql/syntax/manifest.jsonld";

const loader = new ManifestLoader();
loader.from(url)
  .then(d => {
    d.testEntries.forEach(entry => {
      const elems = entry.baseIRI.split('/');
      
      if (!elems[elems.length - 1].includes('bad')) {
        fs.writeFileSync(
          path.join(__dirname, 'queries', 'sparqlstar-spec', elems[elems.length - 1].replace(/r(q|u)$/, 'sparql')),
          entry.queryString
        )

        const parser = new Parser({ sparqlStar: true });
        parser._resetBlanks();

        const res = parser.parse(entry.queryString);

        if (res.queryType === "SELECT") {
          res["variables"] = [
            {
              "termType": "Wildcard",
              "value": "*"
            }
          ]
        }

        

        fs.writeFileSync(
          path.join(__dirname, 'test', 'parsedQueries', 'sparqlstar-spec', elems[elems.length - 1].replace(/r(q|u)$/, 'json')),
          JSON.stringify(res, null, 2)
        )
      }
    })
  })

and I am verifying have verified the generated algebras by hand.

@rubensworks
Copy link
Collaborator

Now that SPARQL-star is being worked on in a WG to target SPARQL 1.2, I'm wondering if we should remove the sparqlStar flag, and make it default behaviour.

@jeswr
Copy link
Contributor Author

jeswr commented Nov 15, 2022

Now that SPARQL-star is being worked on in a WG to target SPARQL 1.2, I'm wondering if we should remove the sparqlStar flag, and make it default behaviour.

Since technically merging this PR is breaking change (since it now errors on quoted graphs such as the example below because they are not valid in the SPARQL-star spec); I think that would be fine - however, I would be inclined to keep the flag and just set it to true by default for now since I don't think the sparqlStar checks are really that much of a performance hit.

PREFIX ex: <http://example.com/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

SELECT * WHERE {
    << GRAPH ex:graph {<< GRAPH ex:graph {ex:Moscow a ex:City} >> a ex:City} >> ex:author ?person .
}

@jeswr
Copy link
Contributor Author

jeswr commented Nov 15, 2022

In doing this I think I've found what to me is a shortcoming in the SPARQL-star spec. In particular, lists are not allowed as subjects/objects in quoted triples as made explicit by SPARQL-star - bad - collection list in quoted triple. However, there is no associated good or bad test for the following.

PREFIX : <http://example.com/ns#>

SELECT * {
   ?s :p ("abc") {| :q 123 |} .
}

I think this is valid since you can explicitly state the list annotations and then apply the annotation to the IRI referring to the list. Secondly it appears to be valid according to the grammar.

@afs Would you accept a PR with the above as a positive syntax test for sparql-star in https://github.com/w3c/rdf-star/tree/main/tests/sparql/syntax?

@afs
Copy link

afs commented Nov 15, 2022

It's not for me to decide. I was just one of the members of the RDF-star community group. The test are a shared product.

The CG is not not active because the new working group is starting up. It will have to address this issue -- I don't think the WG issues list is setup yet.

In general, lists (RDF collections) are difficult to deal with in all sorts of places because they are not first-class concepts in the data model. (Tails of lists can be shared. List can be ill-formed. Other foo.)

If legal, the output might be a bnode inside the <<>> for the list head and the list cons cells outside so they are shared by the data triple and list. Not what you want. In other words, it does not appear to be a simple change.

Annotation syntax is only a syntactic convenience form (as is list form (123 456)) for writing:

?s :p "x"
<< ?s :p "x" >>  :q 123 .

It does not exist as a concept in the data model.

@jeswr
Copy link
Contributor Author

jeswr commented Nov 16, 2022

Just to keep a record. I think it would also be useful to have a failing spec test for quoted empty lists (reasoning: I just discovered that this is an edge case that this implementation currently misses).

PREFIX : <http://example.com/ns#>

SELECT * {
   <<?s :p ()>>  :q 123  .
}

@jeswr
Copy link
Contributor Author

jeswr commented Nov 17, 2022

Note: The spec tests here are only failing as they depend on https://github.com/rubensworks/rdf-test-suite.js/pull/78/files to be able to properly dereference the rdf-star spec. They do pass when you use that branch :)

Note: If anyone is ever doing a major re-write / refactor they may wish to refer to bfcf8b7 where I have some existing concerns around conformance listed as TODOs in the .jison file.

@jeswr jeswr marked this pull request as ready for review November 17, 2022 13:33
@jeswr
Copy link
Contributor Author

jeswr commented Nov 21, 2022

All tests are now passing and this is ready to review @RubenVerborgh . To summarise what I have done here:

  • Updated the jison with references corresponding to the sparql-star spec
  • Added support for sparql-star
  • Added a bunch of (positive and negative) tests for sparql star
  • Added spec tests for sparql star parsing (and bumped the version of rdf-test-suite).
  • Run npm audit fix --force
  • Added a newline character to any files missing one (I ran a script to catch those that were missed in the files that I had added; it seems there were several already missing one)

Copy link
Owner

@RubenVerborgh RubenVerborgh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing work, @jeswr!
Just some concerns on generated file size; even though I love the increased clarity of course with the references to the spec. But I'm afraid that's how I started, and then inlined because it had a considerable impact on (an already large) generated file size.

lib/sparql.jison Show resolved Hide resolved
lib/sparql.jison Outdated Show resolved Hide resolved
lib/sparql.jison Outdated Show resolved Hide resolved
@jeswr
Copy link
Contributor Author

jeswr commented Jan 4, 2023

Currently increases the size of the generated file by 8.7%.

Before: 114016 b
After: 123966 b

I'll have a quick look to see how much of this is caused by inlining vs. rdf-star additions.

@jeswr
Copy link
Contributor Author

jeswr commented Jan 4, 2023

After that commit it is now: 119298 b which is 4.6 % larger than the original

@jeswr
Copy link
Contributor Author

jeswr commented Jan 4, 2023

Moreover these get much smaller if minified (which I tested with rollup):

Main branch: 80251 b
This branch (on 5a4014c): 84104 b

This means that it is now only a 4kb difference in cases where we care about bundle size. Is this sufficient?

@jeswr
Copy link
Contributor Author

jeswr commented Jan 5, 2023

@RubenVerborgh I also just wanted to bump this comment suggesting that we now make sparqlStar parsing the default behavior #160 (comment), #160 (comment). What do you think?

Copy link
Collaborator

@rubensworks rubensworks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just had a closer look at this, and it looks like there are some missing elements to this after all.

The new built-in functions TRIPLE, SUBJECT, PREDICATE, OBJECT, isTRIPLE, and sameTerm currently cause a parsing error when used in a query. (I'm surprised these aren't part of the spec tests)

For this, the grammar needs the following updates: https://w3c.github.io/rdf-star/cg-spec/editors_draft.html#sparql-star-grammar

@jeswr
Copy link
Contributor Author

jeswr commented Jan 27, 2023

he new built-in functions TRIPLE, SUBJECT, PREDICATE, OBJECT, isTRIPLE, and sameTerm currently cause a parsing error when used in a query. (I'm surprised these aren't part of the spec tests)

@rubensworks Could you give an example? There are passing tests using these built-ins such as https://github.com/RubenVerborgh/SPARQL.js/pull/160/files#diff-4df929f46b74d81489675446f8cd5066a78eb48a34218ab67ffa53b74fa43e8f and https://github.com/RubenVerborgh/SPARQL.js/pull/160/files#diff-ff3de9e97b2eb71b7fcc96727c7d23dea0204fb69d3fce1b016ec5c708f53b7c

@rubensworks
Copy link
Collaborator

@rubensworks Could you give an example? There are passing tests using these built-ins such as https://github.com/RubenVerborgh/SPARQL.js/pull/160/files#diff-4df929f46b74d81489675446f8cd5066a78eb48a34218ab67ffa53b74fa43e8f and https://github.com/RubenVerborgh/SPARQL.js/pull/160/files#diff-ff3de9e97b2eb71b7fcc96727c7d23dea0204fb69d3fce1b016ec5c708f53b7c

Ok, never mind, I it works correctly as intended.
(I assumed the jison would be compiled upon install, but it requires an explicit build command)

@rubensworks
Copy link
Collaborator

@RubenVerborgh @jeswr Now that the bundle size issues seem to be resolved, is there anything else blocking this from getting merged? (this is blocking us from implementing SPARQL-star support in Comunica)

@jeswr
Copy link
Contributor Author

jeswr commented Jun 9, 2023

This is ready to merge & release (ping @RubenVerborgh)

@RubenVerborgh RubenVerborgh merged commit f3fd546 into RubenVerborgh:main Jun 12, 2023
@RubenVerborgh
Copy link
Owner

Thanks, @jeswr, this is amazing work!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants