AuthZ: Collection-based access #413

Open
rosiel opened this issue Nov 9, 2016 · 30 comments
Labels
Subject: Access Control (related to managing roles and permissions/information security)
Type: use case (proposes a new feature or function for the software using user-first language)

Comments

@rosiel
Member

rosiel commented Nov 9, 2016

Title (Goal) Collection-based access
Primary Actor Repository admin
Scope access
Level High
Story [from Kelsey]: All collections have [XACML] policies limiting access to Admins and either Usergroup A or Usergroup B.

Remarks:

  • This would be useful if access and management were both configurable at a collection level.
@rosiel
Member Author

rosiel commented Nov 9, 2016

Use case

@rosiel rosiel closed this as completed Nov 9, 2016
@rosiel rosiel reopened this Nov 9, 2016
@ruebot ruebot added the use case label Nov 9, 2016
@dannylamb
Contributor

👍 This seems like a very common use case. Thanks @rosiel and Kelsey.

@bseeger
Member

bseeger commented Apr 27, 2020

We at JHU are interested in something like this. For example, all people can see all content, but only a user in a specific role can edit a specific collection and its content. But collections and the content in them are fairly decoupled. If I restrict edit access to a specific collection, I'd like that to apply to the objects in the collection without having to tag each item.

@bseeger
Member

bseeger commented Apr 27, 2020

Although tagging could be part of the ingest. Haven't thought through that yet, though.

@seth-shaw-unlv
Contributor

seth-shaw-unlv commented Apr 27, 2020 via email

@mjordan
Contributor

mjordan commented Apr 27, 2020

I think the ASU team were thinking about implementing collections as taxonomies (@wgilling and others, that correct?). Anyway, being able to enforce access to items in the collection via tags would be a very good reason to do this.

@wgilling
Contributor

@mjordan, we are actually implementing collections as a content type rather than a taxonomy now. That said, I believe other use cases would really need an access restrictor based on taxonomy terms. In particular, I can think of a special collection that we will potentially implement this way, and it could also be helpful for some IR content.

@mjordan
Contributor

mjordan commented Apr 27, 2020

OK, thanks. The two approaches are not mutually exclusive. You could use Islandora collections as they currently work, but also add a tag to each member to control access. That might be the best approach as long as it's workable. We might even be able to create a Context to automatically add a term, with the condition being collection membership.....

@mjordan
Contributor

mjordan commented Apr 27, 2020

Or better idea, apply the term after the fact using Views Bulk Operations.

@rosiel
Member Author

rosiel commented Apr 29, 2020

Though this wasn't originally my use case (it was passed to me by someone else), I want to re-underline what @bseeger said. Managing the permissions at a collection level ought to be a built in feature that is at least somewhat intuitive to manage. I'm coming from a documentation perspective. What we have is:

Use case:
My site is organized into several collections ("collection" nodes containing, via member_of, "resource" nodes), and I want different people to be able to manage each collection.

Solution:

  • Create a vocabulary containing taxonomy terms that mirror your collection structure.
  • Create a Drupal Role for this collection. It needs (what permissions? permission to manage objects with that term, yes, but what else? Access gets complicated as there are a lot of different permission checks on the way to doing various tasks. What does someone need in order to be a successful (limited) islandora object manager? This should be documented).
  • If you don't have one already, create an administrative view of nodes that allows you to filter on member_of and perform Views Bulk Operations. Make sure that it doesn't only show 'published' items, since you're probably interested in managing unpublished ones as well. If you're feeling fancy, have this appear on the "manage" page of each collection.
  • Filter to members of the first collection, select all items in the view (not just the first page), and using VBO, apply the appropriate taxonomy term.
  • Repeat this for all your collections.
  • Now you have tagged your items according to their collection. Nobody touch them! Hahaha.
  • Edit the documentation for managers. Tell them that when they're creating an object in their collection they must tag it with "their" term and only their term. Also, when they edit an item in their collection, they must leave "their" term in place. Failure to do so will cause them to lose access to that content. Also, if they add any other terms, they will be violating whatever permission structure the site has.
  • Edit the documentation for site admins. Tell them that when moving items between collections, they not only have to change the member_of field but also the term in the taxonomy field.
  • Probably create a view, or cron job, to find the objects where their member_of and taxonomy term don't match, because I'm sorry, it's going to happen.
  • Explain to your users what a taxonomy term page is, so that when they land on /term/x and click "Edit" they know what they're doing.

Oh... I forgot to mention.
Media permissions are entirely independent of Node permissions. You've so far done absolutely nothing to affect who can manage the binaries themselves. So now, you have to repeat all of the above with Media.

Your user documentation will now require users to accurately set, and never touch, the appropriate taxonomy terms on every node and media they create. This is an administrative field that confers permissions that I think most users reasonably expect would happen when making "a child of a collection", or "a media of a node". The fact that you can have children of collections that aren't, permission-wise, in that collection, and media of nodes that aren't, permission-wise, related to that node, because we're using taxonomy terms - a front-end feature that is exposed in ways this kind of structure shouldn't be - is kinda bonkers.
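The consistency check suggested in the list above (finding objects whose member_of and taxonomy term don't match) could be sketched roughly as follows. This is a minimal, language-neutral model in Python, not Drupal code: `Item`, `find_mismatches`, and the `collection_term` mapping are hypothetical names invented for the illustration, and it assumes each permissioned collection expects exactly one term on its members.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Item:
    """Hypothetical stand-in for a node or media; not a Drupal API."""
    item_id: int
    member_of: Optional[int]  # ID of the parent collection, if any
    access_terms: set = field(default_factory=set)


def find_mismatches(items, collection_term):
    """Return IDs of items whose access terms differ from their collection's term.

    collection_term maps a collection ID to the single term its members are
    expected to carry (an assumption of this sketch).
    """
    mismatched = []
    for item in items:
        expected = collection_term.get(item.member_of)
        if expected is None:
            continue  # not in a permissioned collection; nothing to check
        if item.access_terms != {expected}:
            mismatched.append(item.item_id)
    return mismatched


collection_term = {10: "team-a", 20: "team-b"}
items = [
    Item(1, member_of=10, access_terms={"team-a"}),            # consistent
    Item(2, member_of=10, access_terms=set()),                 # term removed
    Item(3, member_of=20, access_terms={"team-a"}),            # moved, term not updated
    Item(4, member_of=20, access_terms={"team-a", "team-b"}),  # extra term added
]
print(find_mismatches(items, collection_term))  # prints [2, 3, 4]
```

In a real site the same comparison would have to run separately for nodes and for media, since the two are permissioned independently.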

@mjordan
Contributor

mjordan commented Apr 30, 2020

@Rosie Since we can use the same taxonomy, and therefore the same access rules, on both nodes and media (the default Vagrant defines the "Access terms" field, which references the "Islandora Access" taxonomy, on both Repository Item and the various media types), I wonder if we could automate, on node or media creation, the recursive assignment of a term to all collection children, children of children, and media. This could probably be done with a custom Context Reaction.

@rosiel
Member Author

rosiel commented Apr 30, 2020

Would this be a fair description of the proposed solution?

  • Edit permission will be controlled by what taxonomy term(s) are on a node or media.
  • Edit permission will not be controlled by what collection (or node) a node (or media) belongs to.

However,

  • it will be possible, using a View or other method, to batch-apply terms to the children of a collection.
  • it will be possible, if we implement some recursion code on 'member_of', for this batch action to get applied also to all the descendants of a collection (the children's children, and children's children's children, etc).
  • it will be possible, if we implement recursion code on 'media_of', for this batch action to get applied also to the media of the affected nodes as well.
  • it will be possible, if we implement some code such as a Context Reaction, for new children of a permissioned collection to have that same permission term automatically applied.
  • it will be possible, if we implement some code such as a Context Reaction, for new media of a permissioned node to have that same permission term automatically applied.
  • It will still be possible to manage individual nodes independently of the collection they're in (if, say, a particular item needs to be more restricted or less restricted)
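To make the recursion in the bullets above concrete, here is a minimal sketch of a batch action that follows 'member_of' downwards and 'media_of' sideways. It is a Python model, not Drupal code: the function name and the three dictionaries are hypothetical stand-ins for the real fields, and it assumes the batch adds a term without removing terms that are already present.

```python
from collections import defaultdict


def apply_term_to_descendants(root, term, member_of, media_of, terms):
    """Add `term` to `root`, every descendant node, and the media of those nodes.

    member_of: child node ID -> parent node ID
    media_of:  media ID -> node ID
    terms:     entity key such as ("node", 3) -> set of access terms (mutated)
    All three structures are hypothetical stand-ins for Drupal fields.
    """
    children = defaultdict(list)
    for child, parent in member_of.items():
        children[parent].append(child)
    media = defaultdict(list)
    for media_id, node_id in media_of.items():
        media[node_id].append(media_id)

    seen = set()
    stack = [root]
    while stack:
        node_id = stack.pop()
        if node_id in seen:
            continue  # guard against accidental cycles in member_of
        seen.add(node_id)
        terms.setdefault(("node", node_id), set()).add(term)
        for media_id in media[node_id]:
            terms.setdefault(("media", media_id), set()).add(term)
        stack.extend(children[node_id])
    return seen


member_of = {2: 1, 3: 2, 4: 9}       # node 4 lives in an unrelated collection
media_of = {100: 3, 101: 4}
terms = {("node", 3): {"embargo"}}   # node 3 already carries another term
touched = apply_term_to_descendants(1, "team-a", member_of, media_of, terms)
print(sorted(touched))               # prints [1, 2, 3]
```

Because existing terms are kept, an item that was deliberately tagged with something more restrictive stays that way; whether that is the desired behaviour is one of the open questions in this thread.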

Questions (some of them are about using Permissions By Term with Islandora in general)

  1. Can I edit an individual node's permission terms, and cause the changes I make to be automatically populated to the media of that node?
  2. Can I make some media of a node (e.g. Preservation Master) more locked down than others (e.g. thumbnail)?
  3. How will doing batch actions, such as applying a term, affect items that already have permission terms (this one or other ones) on them?
  4. Will it be possible to remove terms by batch as well?
  5. What happens when I move an object to a different collection by editing the member_of field, and the target collection has a permission term?
  6. Am I able to add a node to a collection that I'm allowed to see, but not allowed to edit?
  7. If I want someone to be able to edit the metadata on some nodes, but not change the permission terms on those nodes, how will that work?
  8. What exact set of permissions is required to batch-edit permission terms over a collection?
  9. What exact set of permissions is required to edit which roles can access things tagged with a given term?
  10. Can I lock myself out of being able to access a node?
  11. Can permissions by term be used to control view access, as well as edit access?
  12. As I understand it, terms grant access, which is different from XACML, which taketh away. Does that mean untagged items can't be managed by anyone (except Administrators), even if users have 'edit islandora objects'-type role-based permissions?
  13. What else do I need to know about how Permissions By Term works?

@mjordan
Contributor

mjordan commented May 1, 2020

@rosiel I know the answers to some of your excellent questions, but not all. SFU uses tags to control read/write permissions on collections in its Drupal 6 IR. Our use cases are very much simpler than the ones you provide however. (Yes, you read that right, "6" - our IR will be migrated to Islandora 8 over the summer.)

I think if we want to pursue using taxonomies to control access in the ways we're describing, we need to confirm they can meet that need (I started to do this back in #823 (comment) with some success), and then figure out what we need to do to make the UX for the user assigning permissions using tags as painless as possible. I would be happy to help test this approach to see if it is viable, especially using all of the criteria you provide above.

An alternative strategy for having members inherit permissions from their parents would be to walk up the family tree like we do for breadcrumbs (code), but I have trouble imagining how we can make exceptions to permissions in the ways you describe using that method. The advantage to that method would be no/very little configuration (just like breadcrumbs), no adding of terms, etc., in other words, much simpler but less flexible. I also wonder how it would scale. It's possible this could be mitigated with some smart caching.

Another aspect of controlling access to media (both by using terms and by the upwards inheritance method, for that matter) is that the media must live in Drupal's private file system. Any file in Drupal's public file system can be viewed by anyone who knows its URL, since "public" in this case is outside of Drupal's control. My concern here is performance, since requests for files in Drupal's private file system invoke Drupal to check permissions, which can be expensive.

I volunteer to help test using taxonomies for this purpose, but maybe we should be thinking about a mini-sprint or a small group of testers instead of having a single person, with only two eyeballs, focus on the testing. I'm happy either way.

@bseeger
Member

bseeger commented May 1, 2020

There is another module named Group which seems pretty handy and my initial testing of it is that it mostly meets my needs. But then it doesn't manage media - though there is a patch for doing that that I wasn't able to get to work. I'll add some more of my notes here on Monday about Groups.

@rosiel
Member Author

rosiel commented May 1, 2020

@mjordan cool, I am impressed with your migration plan, that's very ambitious (and we too have some 6 sites left).

Any "permissioning" module should be looked at very carefully, because there are so many ways that information can seep out in a Drupal site. Controlling access to /node/nid and node/nid/edit is one thing, but with additional modules and play, behaviours can get very surprising. For instance, if you're allowed to View All Revisions, then you're allowed to view the revisions of nodes that you're not allowed to see. There are also, as Diego pointed out, the numerous ways that content can be rendered on other routes (views come to mind in particular) and I think it's due diligence to demonstrate that whatever system we use will respect the permissions we set up regardless of where the content is being potentially displayed. Permission inheritance is interesting. Fine grained permission control also seems important for a generalized digital object management system. Whatever we end up using, I think it's important that we make the mechanism as transparent to administrators as possible. We got used to XACML, I think in large part thanks to the amazing work done in creating the XACML editor. Otherwise, even though it's robust, it's unreadable and requires Wizards who know Black Magic to use.

requests for files in Drupal's private file system invoke Drupal to check permissions

Yes! Checking file permissions before displaying them, ie not using Drupal public file system, is a hugely important feature for sites where information can't be public. The RDM project was an example of this, and we encountered a number of very interesting glitches that we worked through for the most part.

@bseeger Group is a fantastic module for controlling visibility and permissions - if you can figure it out. 😳 As far as I know it is very robust. I am very interested to hear your feedback on it.

@kayakr
Contributor

kayakr commented May 1, 2020

I'll add some more of my notes here on Monday about Groups.

@bseeger I'm interested in your experience with Groups. I'm applying it to a new project and I've already applied several patches

"drupal/group": {
    "#2774827 Get token of node's parent group; #62": "https://www.drupal.org/files/issues/2020-02-21/group-gnode_tokens-2774827-62.patch",
    "#3071489 Incorrect Access Check on Media Library": "https://www.drupal.org/files/issues/2020-02-27/incorrect-access-check-on-media-library-3071489-9.patch",
    "#3103884 Class should be using EntityTypeManagerInterface": "https://www.drupal.org/files/issues/2020-01-25/group-entity-type-manager-interface-3103884-6.patch"
},

It took me a while to wrap my head around "Group content" being the relationships-as-entities to the things I want to manage in a group e.g. nodes (Repository Items).
Some issues right now; no support for group members to publish/unpublish, and Group breadcrumb competing with Islandora breadcrumb. Migrating into a group is working ok, as is pathauto using group.

@mjordan
Contributor

mjordan commented May 2, 2020

I've got an idea I'd like to throw against the wall for implementing a single checkbox that applies permissions to all descendents.

"Member of" can be indexed (and in fact is by default) such that the node IDs of the parents, all the way up the collection/compound parent/book, etc. hierarchy are available for a node in Solr. For example, here is the Solr entry for "Member of" for a node (node ID 3) that is a direct member of a subcollection (node ID 2), that itself is a member of a top-level collection (node ID 1):

"itm_field_member_of":[1,2]

Visualized another way:

Top-most collection (node ID 1)
  - Subcollection (node ID 2)
    - Image node (node ID 3) <- Is a descendent of both 1 and 2.

This indexing provides us with an out-of-the-box way of determining collection membership all the way up to the top-most level of an Islandora instance. We could use this, when a user views a node, to get a node's pedigree, and if there are any forebears that have a permission-enforcing tag on them, apply that permission to the node being viewed.

The UI to apply this to all descendants could be an indexed checkbox field on a collection, compound, newspaper issue, book, etc. node (named, for purposes of this proposal, "Enforce access permissions on descendents"). When checked on a node that has permissions applied to it, the box signals to the enforcing code that this node defines the permissions (can the user view this or not?) to apply to its descendents.

The Solr query that some custom code somewhere (maybe in a small Islandora submodule) would perform on viewing a node would get all the nodes that 1) are in the node's "Member of" field and that 2) have the "Enforce access permissions on descendents" box checked. If the query finds a node that meets these two conditions, it gets the permission (can the user view this or not?) and applies it to the child node being viewed; if it gets more than one node, it uses the first one in the list (which should correspond to the closest parent, although we'd need to confirm that) and applies its permission.

The code could implement the access control by issuing an HTTP 403 "Access denied" response. An example of applying the 403 response is in https://github.com/mjordan/ip_range_access/blob/master/src/Plugin/ContextReaction/DenyAccessReaction.php (although the small module I am describing need not implement it in a Context Reaction, it could be in a hook_entity_view_mode_alter()).

What I'm describing here would apply access control on viewing a child node. I'm not sure at the moment how this would apply to adding items to a collection, since adding an item to a collection is basically updating the "Member of" field on a child node, or to files. This wouldn't meet all of the exception use cases described above either.
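The decision logic described above can be sketched independently of Solr or Drupal. The following Python model is only an illustration under stated assumptions: `resolve_view_access` and its parameters are hypothetical names, the ancestor list is assumed to arrive ordered with the direct parent first (the ordering of the indexed field is explicitly unconfirmed above), and a node with no enforcing ancestor is simply allowed.

```python
def resolve_view_access(ancestors_closest_first, enforcing, viewable):
    """Decide whether a node may be viewed, based on its nearest enforcing ancestor.

    ancestors_closest_first: ancestor node IDs, direct parent first (assumed order)
    enforcing: IDs of nodes with the "enforce on descendents" box checked
    viewable:  IDs of nodes the current user is allowed to view
    Returns an HTTP-style status code: 200 (allow) or 403 (deny).
    """
    for ancestor in ancestors_closest_first:
        if ancestor in enforcing:
            return 200 if ancestor in viewable else 403
    return 200  # no enforcing ancestor: this sketch does not restrict the node


# Node 3 is in subcollection 2, which is in top-level collection 1.
print(resolve_view_access([2, 1], enforcing={1}, viewable={2}))     # prints 403
print(resolve_view_access([2, 1], enforcing={1, 2}, viewable={2}))  # prints 200
```

The second call shows why the ordering matters: when two ancestors enforce permissions, the result depends entirely on which one is treated as closest.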

@rosiel
Member Author

rosiel commented May 5, 2020

That's a cool idea, @mjordan. The method you describe sounds feasible enough (and it's good to know that we already have the ancestral chain indexed somewhere). Two questions:

  1. Do we already have a method for "get me some info about the solr doc for the entity I'm on"? We used to do this in 7 ( back in my day we used Islandora Solr Query Handlers... [inhaled yup] good times </geezer voice>) but I don't know how to do this - if we do this - in 8, now that we're just using Drupal's solr mechanisms. Do you already know what code to call to get the full Lucene Query Syntax (or at least q, fq, fl, facets... and not just whatever dismax the "Search" bar uses)?

  2. Fundamentally, is solr the right place to do access control when it's not access to the solr doc, but to the contents of the drupal database (i.e. node and field tables) that's at stake? As I understand it, solr is async AF and has no guarantee of being up to date in any length of time. Also, what method will you use for validating that this access control hook we may implement doesn't break down in unexpected situations?

@mjordan
Contributor

mjordan commented May 5, 2020

  • A to Q 1: 'http://localhost:8983/solr/ISLANDORA/select?q=ss_search_api_id:%22entity:node/' . $nid . ':en%22' will do it but the proper way to do it would be to use Drupal's Search API
  • A to Q 2: Good point. Maybe not. I mentioned Solr since it already indexes up the family tree, but if we can get the family tree from the Drupal db, I agree that would be a more reliable source. OTOH, unless the db query is very fast, Solr is probably going to be faster, and we do need to be cognizant of speed since we'd be issuing this query whenever someone views an object.

As for testing/validating access to the node, I am assuming that taxonomy access control will act on simple node load operations, such that code could load the parent/collection node in question in the background and if that FALSEs, that signals the user isn't allowed to view that node and triggers the 403 on the child. Haven't tested that. The actual location that we issue the 403 is in a node view alter hook. That would cover viewing the node, I'm not sure about title/teaser/etc but we'd need to test that. I have no idea how this would work in unexpected situations since I can't predict them 😄
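For anyone who wants to poke at the Solr document outside Drupal, the URL in the answer to Q 1 can be built with proper escaping as sketched below. This is a Python illustration, not the Search API route recommended above; `solr_doc_url` is a hypothetical helper, and the host, core name, and `:en` language suffix are simply copied from that URL.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

SOLR_SELECT = "http://localhost:8983/solr/ISLANDORA/select"  # taken from the answer above


def solr_doc_url(nid, langcode="en", base=SOLR_SELECT):
    """Build the select URL for the Solr document of a single node."""
    # The item ID has the shape entity:node/<nid>:<langcode>, as in the URL above.
    query = f'ss_search_api_id:"entity:node/{nid}:{langcode}"'
    params = urlencode({"q": query}, quote_via=quote)
    return f"{base}?{params}"


url = solr_doc_url(3)
print(url)
# Decoding the query string recovers the original Lucene query.
print(parse_qs(urlsplit(url).query)["q"][0])
```

This version percent-encodes the colons and slash as well as the quotes, which decodes to the same query as the hand-written URL.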

@mjordan
Contributor

mjordan commented May 5, 2020

@Rosie an alternative to using Solr would be to walk up the hierarchy on demand like we do to produce breadcrumbs. That might be a more reliable, if possibly slower, approach.
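Walking up the hierarchy on demand amounts to following member_of until it runs out. A minimal Python sketch of that walk follows; it is not the breadcrumb code, `ancestor_chain` is a hypothetical name, and it assumes each node has at most one parent, which real member_of fields do not guarantee.

```python
def ancestor_chain(node_id, member_of):
    """Return the ancestors of node_id, direct parent first, by following member_of.

    member_of maps a node ID to its parent node ID (single parent assumed).
    """
    chain = []
    seen = {node_id}
    current = member_of.get(node_id)
    while current is not None and current not in seen:  # stop on cycles
        chain.append(current)
        seen.add(current)
        current = member_of.get(current)
    return chain


member_of = {3: 2, 2: 1}
print(ancestor_chain(3, member_of))  # prints [2, 1]
print(ancestor_chain(1, member_of))  # prints []
```

Each step is one lookup, so the cost grows with the depth of the hierarchy rather than the size of the repository, and the result comes straight from current data instead of an index that may lag behind.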

@bseeger
Member

bseeger commented May 6, 2020

Perhaps it's worth meeting about this to discuss ideas and the use case overall?

My thinking is that the "Groups" module could be what we need/want, and it might be worth contributing to that module versus coming up with something new. It needs work and is a little rough around the edges, also a little tricky to understand. But it's kind of neat once I got it working. You don't need to walk up any hierarchy to determine access via membership, just make sure the collection and the items in it are in the same group. There's a big gap when it comes to media: there is a patch to include media in groups, though I wasn't able to get it to work (but that was most likely my fault and not the patch's). Sounds like we've had similar experiences, @kayakr - I'm thinking it might work, but it's not quite solid yet.

This video was very helpful in looking at Groups: https://www.youtube.com/watch?v=GkiCLJk5n0s&app=desktop I think the module definitely needs a bit more work to ensure there aren't areas where security violations could happen.

In comparison to Permissions By Term - Groups was a bit easier on the Collection Level Admin user experience - they log in, add things and they don't have to tag them as it's automatically put in the right group with the right access for folks. However, when logged in, all they can see is the content in their group, but perhaps that's okay.

The Permissions By Term module required that every new thing the Collection Level Admin adds be tagged, an extra step that could be easy to forget. I'm not clear on how well media is protected with Permissions By Term. I had one test user able to access media that belonged to another user's node. The node was protected, but not the media. Honestly, I'm not comfortable with any method I've come across yet, but I'm also crash coursing perms in Drupal as I go, so it's a learning curve.

I have not played with Permissions By Entity yet (it is a submodule of Permissions By Term, but I'm not clear on how it works).

@mjordan
Contributor

mjordan commented May 6, 2020

Maybe put it on today's Tech Call agenda?

@mjordan
Contributor

mjordan commented May 6, 2020

Something we need to keep in mind is that any module that requires a site admin or background process to "rebuild permissions", like Group does, will suffer the same issue @rosiel pointed out about using Solr as the source of truth about permissions:

[...] solr is async AF and has no guarantee of being up to date in any length of time.

I'm not advocating against modules that require this, I'm just pointing it out.

@seth-shaw-unlv
Contributor

I should mention that installing any module that requires you to rebuild permissions after you've ingested thousands of objects is a serious pain to deal with. Drupal will shut down access to everything until it finishes the rebuild, and it has a difficult time rebuilding sites with very large node counts. The Node Access Rebuild Progressive module modifies that default behavior by keeping your nodes' existing permissions while it progressively rebuilds permissions for existing nodes with the new access regime; just enable it before you enable any module that will cause you to rebuild permissions.

Also, I agree that Solr would be faster than walking the database every time (though I also don't trust Solr to be up to date), but that is a straw man. The query hit happens only when the permissions table (node_access) is updated for that node during node insert/update, not on every access. Also, the permissions would be built before Solr gets a chance to index the hierarchy anyway.
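The write-time versus read-time distinction made above can be shown with a toy model. The Python sketch below is not how Drupal's node_access table is implemented; `GrantTable`, its methods, and the role names are hypothetical, and it assumes that a node with no stored grant is unrestricted.

```python
class GrantTable:
    """Toy model of a grants table that is written on save and read on access."""

    def __init__(self, member_of, collection_roles):
        self.member_of = member_of                # node ID -> parent node ID
        self.collection_roles = collection_roles  # collection ID -> roles allowed to view
        self.grants = {}                          # node ID -> roles allowed to view
        self.hierarchy_walks = 0                  # counts the expensive operation

    def on_save(self, node_id):
        """Walk up the hierarchy once and store the resulting grant."""
        self.hierarchy_walks += 1
        seen = set()
        current = self.member_of.get(node_id)
        while current is not None and current not in seen:
            if current in self.collection_roles:
                self.grants[node_id] = set(self.collection_roles[current])
                return
            seen.add(current)
            current = self.member_of.get(current)
        self.grants.pop(node_id, None)  # no permissioned ancestor: no stored grant

    def can_view(self, node_id, roles):
        """Cheap lookup that never touches the hierarchy."""
        allowed = self.grants.get(node_id)
        return allowed is None or bool(allowed & set(roles))


table = GrantTable(member_of={3: 2, 2: 1}, collection_roles={1: {"team-a"}})
table.on_save(3)
print(table.can_view(3, ["team-a"]), table.can_view(3, ["team-b"]))  # prints True False
print(table.hierarchy_walks)                                          # prints 1
```

The trade-off is the one already raised in this thread: stored grants stay as they were until the node is saved again or permissions are rebuilt, so a change to the collection's roles does not reach existing members on its own.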

@mjordan
Contributor

mjordan commented May 7, 2020

Doing some more poking around to see what the Drupal contrib world has to offer that might be relevant to our needs. I found https://www.drupal.org/project/private_files_download_permission, which looked interesting, since it defines role-and-user-based access to files. However, it does this by putting all files for a role/user in a separate directory:

private_file_download

This would be awkward for Islandora, I think, unless we could figure out how to move all the thumbnails, service files, etc. for a given restricted collection into a specific subdirectory of Drupal's private filesystem. Drupal's File field types, which media use, can be configured at the Media type-level to use the private file system, but I'm not sure how we'd ensure that all the files associated with nodes in an Islandora collection got written to the correct subdirectory so that the Private Files Download Permission module would manage them.

Now I'm leaning back to a more dynamic approach, where media inherit who can view the files from their parent node. We need something that requires as little configuration as possible, since the more configuration a site admin needs to do, the higher the chance that something will get missed or go wrong, and leak media to unauthorized users.

I agree with our conversation today - we should review our use cases and go from there.

@mjordan
Contributor

mjordan commented May 7, 2020

To eat my own dogfood, I've created a small proof-of-concept module that infers "view" authorization on media and their files from the parent node. Since Drupal can only secure files that are stored in its private file system, you will need to set that up as a prerequisite, but full instructions for testing this module are provided in its README. It's at https://github.com/mjordan/islandora_view_perms. I'll point out that using the method demonstrated in this module is not limited to using taxonomy terms to control access, it also works with standard Drupal "view content" permissions.

I wrote this up last night to satisfy my own curiosity about the inference part (not the Solr part) more than anything else, not to advocate against using existing contrib modules to lock down Islandora content.
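The general idea of inferring "view" access on media from the parent node can be illustrated as follows. This Python sketch is not the code of the module linked above and makes no claim about how that module works internally; `can_view_media` and both dictionaries are hypothetical, and denying media that have no parent node is a cautious default chosen for the example.

```python
def can_view_media(media_id, user_roles, media_of, node_view_roles):
    """Infer view access to a media item from its parent node.

    media_of:        media ID -> parent node ID
    node_view_roles: node ID -> roles allowed to view it (missing = unrestricted)
    Media without a parent node are denied in this sketch (an assumption).
    """
    node_id = media_of.get(media_id)
    if node_id is None:
        return False
    allowed = node_view_roles.get(node_id)
    return allowed is None or bool(set(user_roles) & allowed)


media_of = {100: 3, 101: 4}
node_view_roles = {3: {"team-a"}}  # node 3 is restricted, node 4 is not
print(can_view_media(100, ["team-b"], media_of, node_view_roles))  # prints False
print(can_view_media(101, ["team-b"], media_of, node_view_roles))  # prints True
```

The appeal of this shape is that there is nothing extra to configure per media item: restricting the node is the only step an admin can forget.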

@mjordan
Contributor

mjordan commented May 8, 2020

Hi all, what's the best way forward on defining/refining use cases for collection-based permissions? This issue is getting long. Would people be interested in setting up a google doc, distributing its URL in Slack to reduce doc bombing, and then once we've got some specific use cases/issues, for example describing CRUD actions, bring them back here for wider perusal? Maybe set up a Slack channel for this topic too?

@bseeger
Member

bseeger commented May 8, 2020

That sounds good to me, @mjordan.

@mjordan
Contributor

mjordan commented May 13, 2020

Channel and use case doc created.

@mjordan
Contributor

mjordan commented Jul 21, 2020

Dropping this here for future reference: https://www.prometsource.com/blog/how-manage-hook-entity-access-with-drupal.

@kstapelfeldt kstapelfeldt added the "Type: use case" and "Subject: Access Control" labels and removed the "use case" label Sep 25, 2021

9 participants