Allow arbitrary strings as predicates #106
Comments
Interesting idea. I think there are two ways this could be interpreted: as globally scoped predicates that have minimal semantic commitment; or as some kind of locally scoped predicates. As shown, it looks like you are treating them as globally scoped. If they are globally scoped, then inference rules that use them must qualify their intended scope, such as by indicating the class of the subject, as you describe. If they are locally scoped, then we'll need ways to manipulate scopes, such as we have in programming languages. For example, when a library is imported it pulls a set of identifiers into the current scope, or allows an identifier from a foreign scope to be bound to an identifier in the current scope. I wonder what other pros and cons there might be of treating them as globally scoped vs locally scoped. |
In my experience 'predicates as strings' is more approachable, and less interoperable. |
I think the issue is perhaps tied up with the question of literals as subjects too. |
Yes there is some overlap with #21. For many of the customers that we see, the concept of Linked Data is not relevant as they only operate on controlled enterprise graphs. Also I believe even from a Linked Data perspective, having simpler property names shouldn't be a problem because it is far more relevant that the subjects and objects are URIs than the predicate. |
I don't understand how typing one less character ( |
Did you consider a mini-rdf on top of what RDF can be implemented? |
@namedgraph: I believe one goal of an EasierRDF project is to align better with what most software people are used to. Backward compatibility is desirable but by definition difficult to achieve forever. In the case of allowing strings as predicates, there is at least one simple approach, namely to convert them into special URIs, allowing existing infrastructure to be re-used without issues - urn:rdfpredicate:firstName With this approach, the only software changes would be to the Turtle and SPARQL parsers, to convert these strings into special URIs. |
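A minimal sketch of the parser-side conversion described above, assuming the `urn:rdfpredicate:` prefix from the example. The function names, the percent-encoding of unusual characters, and the "contains a colon" test for existing IRIs or prefixed names are illustrative assumptions, not part of any proposal in this thread.

```python
from urllib.parse import quote, unquote

# Prefix taken from the example above; everything else here is hypothetical.
PREDICATE_URN_PREFIX = "urn:rdfpredicate:"


def predicate_to_iri(name: str) -> str:
    """Map a bare predicate string to a special URN.

    Crude assumption: anything containing a colon is already an IRI or a
    prefixed name and is passed through unchanged.
    """
    if ":" in name:
        return name
    # Percent-encode everything that is not an unreserved character.
    return PREDICATE_URN_PREFIX + quote(name, safe="")


def iri_to_predicate(iri: str) -> str:
    """Reverse the mapping so that tools can display the short name again."""
    if iri.startswith(PREDICATE_URN_PREFIX):
        return unquote(iri[len(PREDICATE_URN_PREFIX):])
    return iri
```

With a mapping like this, a triple store would only ever see IRIs in predicate position, while the Turtle or SPARQL surface syntax could show the short name.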
Another way of implementing this would be to automatically wrap every
subject (or predicate) string into a blank node with that exact value. I
guess this will be very much hated by most here, but it would have a valid
RDF 1.1 interpretation.
On Sat, 25 Nov 2023 at 11:18, Holger Knublauch <
***@***.***> wrote:
… @namedgraph <https://github.com/namedgraph>: I believe one goal of an
EasierRDF project is to align better with what most software people are
used to. Backward compatibility is desirable but by definition difficult to
achieve forever.
In the case of allowing strings as predicates, there is at least one
simple approach, namely to convert them into special URIs, allowing
existing infrastructure to be re-used without issues -
urn:rdfpredicate:firstName
With this approach, the only software changes would be to the Turtle and
SPARQL parsers, to convert these strings into special URIs.
|
On Sat, 25 Nov 2023 at 12:53, Christian Chiarcos <
***@***.***> wrote:
Another way of implementing this would be to automatically wrap every
subject (or predicate) string into a blank node with that exact value.
I meant "rdf:value"
… |
@HolgerKnublauch I disagree. IMO we should make RDF-based software so flexible and powerful (in ways that would be impossible with RDF) that we can empower the non-software people to work with data in new ways. That is a much broader audience than "software people". Trying to bring RDF to the general "software people" always ends up in attempts to dumb down RDF, because part of the RDF community seems to think that it's its job to accommodate while the "software people" can't be bothered to put in the effort and learn anything new. |
@namedgraph For how many years has the RDF community already tried to convert everyone else, with little success? 20 years now? It remains a niche technology. Maybe success is just around the corner, maybe not. People coming from other communities just find it alien and too complex. One of the particularly alien concepts is that properties have a global identity. This is basically unknown in any other language. Combine this with the unusual semantics of rdfs:range and rdfs:domain and you can understand why few people want to invest in understanding this stack. You are saying it would dumb down RDF, but what is actually the value of global identifiers for properties, leaving aside what RDF Schema tried to do (using property definitions to infer the types of subjects and objects without explicitly requiring type triples)? What else is there apart from that use case? |
Hypothetical syntax that only uses strings in predicate position, while bringing in existing namespace-based predicates:
|
A real aside, perhaps:... I find it gently amusing that the example predicates being used are exactly the ones I would really need to look up (using Linked Data?) to see if I can find out what the author might mean. |
@HughGlaser not aside at all but important. Like in OWL and SHACL, a property would have its meaning in the context of a class or shape. So in this case, you would look the properties up by following the rdf:type, here ex:Person. It's still linked, self-describing data. This is exactly like you would look up the meaning of fields in a (Java) object or the parameters of a function - you start at the surrounding entity. |
Would all bare predicates then be implicitly scoped to the class of the subject on which they are used? If so, which class if the subject is in multiple classes? |
@dbooth-boston The problem of type clashes already exists, for example if you have two rdf:types with two owl:Classes that carry owl:Restrictions with different owl:allValuesFrom on the same property. In SHACL this would mean that all constraints apply. |
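To make the class-scoped lookup discussed above concrete, here is a rough sketch. The `SHAPES` table is a hypothetical stand-in for SHACL shapes or OWL class definitions, and its descriptions are invented for illustration. When a subject has several types, every matching definition is returned, mirroring the remark that in SHACL all constraints apply.

```python
# Hypothetical per-class property descriptions, standing in for SHACL shapes.
SHAPES = {
    "ex:Person": {
        "firstName": "given name of a person",
        "role": "role a person plays",
    },
    "ex:Organization": {
        "role": "role an organization plays",
    },
}


def describe(predicate, subject_types):
    """Return every class-scoped description of `predicate`.

    The lookup starts at the subject's rdf:type values, the way one would
    look up a field starting from its surrounding class.
    """
    return {
        cls: SHAPES[cls][predicate]
        for cls in subject_types
        if predicate in SHAPES.get(cls, {})
    }
```

For a subject typed as both `ex:Person` and `ex:Organization`, `describe("role", ...)` yields two entries, so the ambiguity is visible to the caller rather than silently resolved.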
I find myself wondering if it should not be better as: |
If the RDF community were thriving and growing you might have a valid point in blaming developer laziness. But given that RDF is clearly losing out to easier-to-use competitors, I don't buy that argument. I think developers are rationally deciding that the effort required to "learn something new" with RDF is not worth the payoff, given the availability of easier "good enough" alternatives, even if the RDF approach may seem more appealing in a theoretical sense. The goal here is to make RDF -- or a successor built on RDF -- significantly easier to use, while retaining RDF's benefits and as much of the tooling and standards as possible. |
Requiring some things (e.g., RDF Subjects and Predicates) to always be HTTP/S URIs means that those HTTP/S URIs can be treated as superkeys, which reach across DBMS schema, because they always denote the same thing. This is what delivers the Linked Data magic, and comprises the Giant Global Graph of our Semantic Webs (yes, intentionally plural). (Concerns like temporality do mean that Named Graphs or similar must be brought to bear, but this is handled with another batch of URIs, not arbitrary strings.)

Letting RDF Subjects and Predicates be arbitrary strings would turn RDF into yet another semantically unjoinable mishmash of schemata, and, if merged without great care, could render the current bunch of Semantic Webs a giant global mudpuddle of incoherency.

As to RDF's "failure" because it hasn't replaced tabular relational DBMS (a/k/a SQL) nor labeled property graphs: "horses for courses" comes to mind. RDF is VERY well suited to data where the overall data structure is not known at project start, where the "schema" will evolve over time (e.g., "schema last"), and where data is sparse, i.e., where the values of some predicates/attributes may not be known for any given subject/entity but you still want to collect all those values that are known. Tabular relational DBMS, with their relational integrity and other restrictions, are VERY well suited to dense data, i.e., where the values of all predicates/attributes for any given subject/entity are known, and you only want to collect the values of any given predicate/attribute when they are known for all subjects/entities.

Changing a tabular relational DBMS schema once deployed can be a HUGE undertaking, and may require updates to all tools in use against that schema. On the other hand, adding a property/attribute to an RDF graph or data set is typically a trivial undertaking, and tools which operate against that data do not typically require updates specific to the new attribute/property.
The idea of the "special treatment of arbitrary strings in subject or predicate position", coercing them into URIs, has some potential for implementability, though it doesn't solve the problem of "local only definition". I cannot dereference your freshly minted URI, so I cannot confirm whether your intended meaning matches mine. This is, to me, a non-starter, overall. |
I think you just want more mature tools, that will show you labels instead of raw URIs, while the URIs are in place behind the screen (a/k/a under the covers). "Billy Bob Brockali" "was born in" "Ballingdon Bottom" . |
@dbooth-boston why do you see it as a competition? RDF will always lose out in the marketing sense because Neo4J alone has received $500M+ in VC funding. Just stop trying to convert developers to RDF or using mainstream adoption as the success criterion. Many of their problems that do not require data interchange might simply be solved with JSON, or with property graphs for that matter. The premise of the Semantic Web was to deliver a new generation of the Web that is smarter, more automated etc. We haven't really seen that yet, and that's not the fault of the RDF model but of software development still using legacy architectures. Why not focus our efforts on software that exploits RDF to the max and delivers something previously impossible? We have barely scratched the surface yet. |
Domain-specific languages would help to make data writing easier in the sense of being more natural to the domain (SHACL-driven?). They could "compile" to Turtle/N-Triples/JSON-LD with little more than guided text processing. |
I like this idea of being able to register additional shortcuts in addition to “a” (rdf:type), or even to import them from another file (like a JSON-LD context). This way, I can write my Turtle file faster and still be compliant with the current namespace-based predicates.
As a general comment, I think the ability to describe properties (i.e., properties as first-class citizens) is one of RDF's main distinguishing features, and I would prefer to keep it that way. However, having the option of adding syntactic sugar as Holger proposed would be really nice, and could hopefully attract more people to RDF.
On 25.11.2023, at 14:47, Holger Knublauch ***@***.***> wrote:
Hypothetical syntax that only uses strings in predicate position, while bringing in existing namespace-based predicates:
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex: <http://example.org/ns#> .
@alias a: rdf:type .
@alias label: rdfs:label .
ex:JohnDoe
    a ex:Person ;
    label "John Doe" ;
    firstName "John" ;
    lastName "Doe" ;
    age 42 .
|
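The alias idea could, in principle, be handled by a small preprocessor that runs before an ordinary Turtle parser. The sketch below is only that: `@alias` is the hypothetical directive from the example above, and the line-based regular expressions assume one predicate per indented line, which real Turtle does not guarantee.

```python
import re

# Matches the hypothetical directive, e.g. "@alias label: rdfs:label ."
ALIAS_RE = re.compile(r"^@alias\s+(\w+):\s+(\S+)\s*\.\s*$")
# Matches an indented line that starts with a bare word, e.g. '    label "x" ;'
PREDICATE_RE = re.compile(r"^(\s+)(\w+)(\s+.*)$")


def expand_aliases(text: str) -> str:
    """Rewrite aliased bare predicates into the prefixed names they stand for."""
    aliases = {}
    out = []
    for line in text.splitlines():
        m = ALIAS_RE.match(line)
        if m:
            aliases[m.group(1)] = m.group(2)
            continue  # alias directives are not standard Turtle, so drop them
        m = PREDICATE_RE.match(line)
        if m and m.group(2) in aliases:
            line = m.group(1) + aliases[m.group(2)] + m.group(3)
        out.append(line)
    return "\n".join(out)
```

Bare names without a registered alias (such as `firstName` in the example) are left untouched here, so the output is standard Turtle only if every bare predicate has an alias.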
Some comments:
1)
I point out again that
firstName "John" ;
is not a string as predicate; it is some sort of RDF symbol as predicate.
To be a string as predicate, it would be more sensibly rendered as
"firstName" "John" ;
It seems to me that the discussion centres around the first of those (which means that the thread title is misleading me).
2)
If you think (à la Semantic Web) that the purpose of a URI is as an identifier, then the differences between the various forms that have been mentioned (:firstName, firstName, "firstName" and blank nodes etc.) become syntactic sugar that can easily be handled by preprocessors or equivalent making a URI.
However, if you think (à la Linked Data) that the use of a URI means that the consumer expects to be able to resolve it using http(s), then the difference between those forms becomes quite stark - the publisher needs to have access to an appropriate system etc., if the notation implies a URI.
So I suspect that SemWeb people see little point in the discussion, but LD people think it worth engaging with.
3)
With respect to the sub-discussion of literals in the subject position.
I find the asymmetry of RDF annoying: it makes me represent things in unnatural ways for specific applications, and it is embarrassing to explain to newcomers.
And I understand that it is unnecessary.
Cheers
Hugh
… On 26 Nov 2023, at 06:07, Ted Thibodeau Jr ***@***.***> wrote:
I would love to have:
"Billy Bob Brockali" "was born in" "Ballingdon Bottom".
"Ballingdon Bottom" "is located in" "Britain".
I think you just want more mature tools, that will show you labels instead of raw URIs, while the URIs are in place behind the screen (a/k/a under the covers).
"Billy Bob Brockali" "was born in" "Ballingdon Bottom" .
"Ballingdon Bottom" "is located in" "Britain" .
|
On firstName vs "firstName", note that JavaScript allows both forms equivalently, assuming the string is a valid identifier.
On the general topic, it is rather obvious that the W3C processes will not allow making such changes because by now there are too many established users and vendors who will expect predicates to continue to be (potentially resolvable) URIs. So any discussion here is rather academic, as input for a future WG that is independent of RDF as we know it. Maybe if we frame these topics accordingly, it will raise fewer concerns by those who will want to preserve the status quo. |
@dbooth-boston you should enable Discussions :) |
Done: #107 |
Please be aware that GitHub's "Discussions" are more of a Q&A that appears to have been modeled after the StackOverflow family of sites, than they are a discussion space which calls for threading message trees along the lines of what was once NetNews/Usenet/NNTP ... so what you intended to do may not be doable there, @namedgraph. |
Note that quasi aliasing can already be done with prefixes, albeit uncommon and not whilst re-using the prefix. JSON-LD of course also allows mapping JSON keys (aliases) to other URLs. Re-using @fekaputra's example:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ex: <http://example.org/ns#>
# aliases as prefixes
PREFIX label: <http://www.w3.org/2000/01/rdf-schema#label>
PREFIX firstName: <http://example.org/ns#firstName>
PREFIX lastName: <http://example.org/ns#lastName>
PREFIX age: <http://example.org/ns#age>
ex:JohnDoe
    a ex:Person ;
    label: "John Doe" ;
    firstName: "John" ;
    lastName: "Doe" ;
    age: 42 .

  ex:JohnDoe
    a ex:Person ;
-   label "John Doe" ;
+   label: "John Doe" ;
-   firstName "John" ;
+   firstName: "John" ;
-   lastName "Doe" ;
+   lastName: "Doe" ;
-   age 42 .
+   age: 42 . |
Oh, sorry, I misread the topic. It would be best to update the topic to mention byte strings. Otherwise, we will talk past each other. |
THIS IS ME BRAINSTORMING ONLY, so don't kill me.
Currently, RDF requires predicates of a triple to be IRIs. I guess that choice was made so that
But:
ad 1: Global property axioms are not necessary and do not play a role in SHACL where everything is scoped by shapes and classes. And rdfs:range and rdfs:domain are typically horribly misunderstood.
ad 2: Even with unique identifiers people from other graphs may reference your predicate in unexpected ways and your queries still need to filter by subjects.
Even within a single namespace it is quite common that the same URI is used for different purposes. For example, an ex:role property could point from an ex:Agent to an ex:Role or from an ex:Organization to an ex:Role, and both could have different local meanings depending on the context.
So the benefits of URIs as predicates are IMHO overrated.
Proposal: Moving forward, RDF could also allow predicates to be arbitrary strings.
a) That is how most map-based data structures like JSON objects or Python dictionaries operate, meaning that the mapping between RDF and other languages becomes easier. I think property graphs too.
b) Allowing strings would make the syntax more compact. For example one could write
c) People don't need to invent artificially "unique" names - their application logic and queries are most likely already checking for the context anyway, e.g.
is already scoping the use of firstName to instances of Person, making the property uniquely identified at query time. And when mapped to languages like GraphQL or JavaScript, any access to predicates is already scoped to the context object.
As this would be an incremental generalization, existing RDF graphs would not be affected. People are not forced to use strings as predicates.
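Points a) and c) can be illustrated together with a rough sketch. It assumes triples are plain Python tuples and that `rdf:type` is written as a string; the helper names and the sample data are hypothetical and only meant to show that map keys can serve directly as predicates and that a lookup can be scoped by class at query time.

```python
RDF_TYPE = "rdf:type"


def dict_to_triples(subject, record):
    """Flatten a JSON-like object into triples, using its keys verbatim as predicates."""
    return [(subject, key, value) for key, value in record.items()]


def values_for(triples, predicate, cls):
    """Values of `predicate`, restricted to subjects typed as `cls`.

    The class filter is what disambiguates a short name such as "firstName".
    """
    typed = {s for s, p, o in triples if p == RDF_TYPE and o == cls}
    return [o for s, p, o in triples if p == predicate and s in typed]


# Hypothetical sample data: two subjects sharing the bare predicate "firstName".
triples = (
    dict_to_triples("ex:JohnDoe", {RDF_TYPE: "ex:Person", "firstName": "John", "age": 42})
    + dict_to_triples("ex:Rex", {RDF_TYPE: "ex:Dog", "firstName": "Rex"})
)
```

Calling `values_for(triples, "firstName", "ex:Person")` returns only John's value, even though the same bare string is also used on a subject of another class.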
To minimize the overhead for existing triple stores, string-based predicates could be internally converted to URIs such as
after parsing in Turtle or SPARQL. But in the far future, there could also be an RDF that uses no URIs as predicates, with all frequently used predicates mapped to shorter names. Turtle and SPARQL have already started going down this route by introducing 'a' as abbreviation for rdf:type. They could also add 'label' as alias for rdfs:label or 'superClass' as alias for rdfs:subClassOf.
Also note that schema.org and wikidata use the same namespace for all predicates, so basically it's the same as if no namespace exists in their worlds.