Trellis Architecture

Aaron Coburn edited this page Oct 29, 2017 · 11 revisions

The Trellis API unites the concepts of a key-value store with the interaction models of an LDP server. This makes it possible for a particular implementation to scale horizontally while supporting a standards-based model for managing resources. The relevant methods in the Trellis API are:

    Optional<Resource> get(IRI);
    Optional<Resource> get(IRI, Instant);
    Boolean put(IRI, Dataset);

That is, get and put are the means by which resources are retrieved and manipulated. Both operations are idempotent and scoped to a single resource. Even the non-idempotent HTTP methods (POST and PATCH) are decomposed in the HTTP layer into calls to the much simpler put method of the resource service.
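The decomposition described above can be sketched roughly as follows. This is a minimal illustration and not the Trellis HTTP layer: `InMemoryResourceService` and `HttpLayerSketch` are hypothetical names, resources are plain strings rather than `Resource` or `Dataset` objects, and containment bookkeeping is omitted.

```java
import java.util.Map;
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.UnaryOperator;

// Hypothetical stand-in for a resource service: a key-value store with get and put.
final class InMemoryResourceService {
    private final Map<String, String> store = new ConcurrentHashMap<>();

    Optional<String> get(String iri) {
        return Optional.ofNullable(store.get(iri));
    }

    // Writing the same value to the same key twice leaves the same state (idempotent).
    boolean put(String iri, String body) {
        store.put(iri, body);
        return true;
    }
}

// Hypothetical HTTP-layer sketch: POST and PATCH each end in a single put.
final class HttpLayerSketch {
    private final InMemoryResourceService service;

    HttpLayerSketch(InMemoryResourceService service) {
        this.service = service;
    }

    // POST: mint a fresh child IRI (the non-idempotent part), then put it.
    String post(String containerIri, String body) {
        String base = containerIri.endsWith("/") ? containerIri : containerIri + "/";
        String childIri = base + UUID.randomUUID();
        service.put(childIri, body);
        return childIri;
    }

    // PATCH: read the current state, apply the change, then put the result.
    // The read-then-write here is not atomic; this sketch ignores concurrent writers.
    boolean patch(String iri, UnaryOperator<String> change) {
        Optional<String> current = service.get(iri);
        if (!current.isPresent()) {
            return false;
        }
        return service.put(iri, change.apply(current.get()));
    }
}
```

In this sketch the non-idempotent work (minting an identifier, computing the patched state) happens before the service is called, so the service itself only ever sees an idempotent write to one key.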

By relying on put for the manipulation of all resources, an implementation can treat a resource IRI as an opaque key, independent of any implied hierarchy. This means that, in a distributed context, the data of some resource /foo may be stored on one set of servers while /foo/bar is stored on an entirely different set of servers.
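One way to picture the opaque-key idea is a router that hashes the whole IRI string to choose a storage node. This is only a sketch under assumptions: `OpaqueKeyRouter` and the node names are hypothetical, and a real deployment would more likely rely on the partitioning scheme of its datastore than on a simple modulo hash.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: choose a storage node for a resource by hashing its IRI.
final class OpaqueKeyRouter {
    private final List<String> nodes;

    OpaqueKeyRouter(List<String> nodes) {
        if (nodes.isEmpty()) {
            throw new IllegalArgumentException("at least one node is required");
        }
        this.nodes = new ArrayList<>(nodes);
    }

    // The IRI is hashed as a whole string; its path segments are never inspected,
    // so /foo and /foo/bar are placed independently of each other.
    String nodeFor(String iri) {
        int index = Math.floorMod(iri.hashCode(), nodes.size());
        return nodes.get(index);
    }
}
```

Because the router never parses the path, nothing ties the placement of /foo/bar to the placement of /foo; the two may or may not land on the same node.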

Making this architectural choice is the cornerstone of Trellis' ability to scale. It also introduces some restrictions on the behaviors a client can expect.

Recursion

There is no support for recursion in Trellis. In the context of a hierarchical datastore, that means recursive PUT and recursive DELETE are not available. For instance, given an empty Container at the server root, if a client were to create a resource at /foo/bar/baz, none of the intermediate containers would be created. If /foo/bar is subsequently created as a Container, one would find a </foo/bar> ldp:contains </foo/bar/baz> . triple, but a client will need to explicitly create such a container.

Similarly, given a hierarchy of resources starting at /foo, a DELETE request issued for /foo will only affect /foo. It will not trigger the deletion of any child or other descendant resources; a client will need to explicitly delete those resources.
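A client that wants the effect of a recursive DELETE therefore has to walk the hierarchy itself. The sketch below computes a bottom-up deletion order from a containment map; `ClientSideDelete` is a hypothetical helper, the map is assumed to have been built by the client from the ldp:contains triples it has read, and the actual HTTP DELETE requests are left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical client-side helper: decide the order in which to issue DELETE requests.
final class ClientSideDelete {

    // Returns descendants before their ancestors, ending with the root itself,
    // so that no container is deleted while the client still needs to list it.
    // Assumes the containment map describes a tree (no cycles).
    static List<String> deletionOrder(String root, Map<String, List<String>> contains) {
        List<String> order = new ArrayList<>();
        collect(root, contains, order);
        return order;
    }

    private static void collect(String iri, Map<String, List<String>> contains, List<String> order) {
        for (String child : contains.getOrDefault(iri, Collections.<String>emptyList())) {
            collect(child, contains, order);
        }
        order.add(iri);
    }
}
```

The client would then issue one DELETE per IRI in the returned order. Nothing here is atomic: another client may add a child between the listing and the deletes, which is the coordination problem a server-side recursive DELETE would otherwise have to solve with locks.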

In order to properly implement a recursive PUT or DELETE, a server would require strong consistency and atomicity guarantees from the underlying datastore. This is typically not a problem in a single-node context (especially one backed by an RDBMS), but it is very problematic for distributed systems, where consistency cannot be taken for granted. Supporting the general case of recursive PUT or DELETE over a distributed datastore would therefore require an extensive locking regime for every such operation. Operations that would ordinarily be very efficient would then become considerable bottlenecks for systems expecting even modest levels of concurrency.

Therefore, Trellis does not support recursion of any sort.

This decision is also in line with the principles of REST, where idempotent operations on one resource are constrained to that one resource. Any deviation Trellis makes from that principle relates to existing LDP requirements on the behavior of various container types, and those operations are typically handled in an asynchronous fashion anyway.

Another implication of this is that a client can easily create a disconnected graph of resources. If that is a concern, then a client should only use POST to create resources, since POST creates each new resource inside an existing container. DELETE operations will require some level of coordination by the client, especially if it is running in a multi-threaded or multi-processor environment.

Asynchrony

The Rosid implementation of Trellis makes extensive use of asynchrony when persisting resources, particularly LDP Containment and Membership triples. Depending on the configuration of the async processor, this could mean that such triples are not visible to clients until a later time, though typically that lag will be measured in only a few seconds. This makes writes much faster, and it allows the persistence layer some level of flexibility in terms of eventual consistency.
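The visibility lag described above can be illustrated with a small model in which the resource itself is written synchronously while its containment entry is handed to an executor. This is not the Rosid code: `AsyncContainmentSketch` and its methods are hypothetical, and the in-memory maps stand in for whatever persistence layer and async processor an implementation actually uses.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;

// Hypothetical model of asynchronous containment updates.
final class AsyncContainmentSketch {
    private final Map<String, String> resources = new ConcurrentHashMap<>();
    private final Map<String, Set<String>> containment = new ConcurrentHashMap<>();
    private final Executor executor;

    AsyncContainmentSketch(Executor executor) {
        this.executor = executor;
    }

    // The resource is stored before this method returns; the parent's containment
    // entry is added later, whenever the executor runs the queued task.
    CompletableFuture<Void> put(String parentIri, String childIri, String body) {
        resources.put(childIri, body);
        return CompletableFuture.runAsync(
            () -> containment
                .computeIfAbsent(parentIri, key -> ConcurrentHashMap.<String>newKeySet())
                .add(childIri),
            executor);
    }

    boolean exists(String iri) {
        return resources.containsKey(iri);
    }

    boolean contains(String parentIri, String childIri) {
        Set<String> children = containment.get(parentIri);
        return children != null && children.contains(childIri);
    }
}
```

Between the return of `put` and the execution of the queued task, `exists` reports the child while `contains` does not yet report it under the parent. A client that needs the containment entry can wait on the returned future in this model; over HTTP it would instead have to re-read the container after the lag has passed.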