Shibboleth: Support "Federated Login Mode" (i.e. feed of Identity Providers from InCommon) #2937

Closed
4 tasks
pdurbin opened this issue Feb 9, 2016 · 24 comments

Comments

@pdurbin
Member

pdurbin commented Feb 9, 2016

http://guides.dataverse.org/en/4.2.3/installation/shibboleth.html#dataverse-idp-metadata-xml explains how to configure /etc/shibboleth/dataverse-idp-metadata.xml to specify one or more Identity Providers (IdPs) whose users you would like to allow to log in to your installation of Dataverse via Shibboleth. Many Dataverse installations will choose this mode, only allowing users from their own institution to log in.

The Harvard Dataverse plans to run in a mode that I'll call "Federated Login Mode" for lack of a better term. This means that rather than configuring an XML file by hand to include a list of approved Identity Providers (IdPs), we will configure Shibboleth to periodically download a list of IdPs approved by InCommon. As of this writing there are 426 InCommon-approved IdPs: https://incommon.org/federation/info/all-entities.html#IdPs
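A minimal sketch of how the IdPs in a federation metadata aggregate could be counted, assuming the feed is a standard SAML `EntitiesDescriptor` document in which each IdP is an `EntityDescriptor` with an `IDPSSODescriptor` child. The `SAMPLE` document, the example entityIDs, and the `idp_entity_ids` helper are made up for illustration; this is not how shibd itself processes the feed.

```python
import xml.etree.ElementTree as ET

# SAML 2.0 metadata namespace.
MD_NS = "urn:oasis:names:tc:SAML:2.0:metadata"

# Tiny made-up aggregate standing in for the real InCommon feed (assumption).
SAMPLE = f"""<EntitiesDescriptor xmlns="{MD_NS}">
  <EntityDescriptor entityID="https://idp.example.edu/idp/shibboleth">
    <IDPSSODescriptor protocolSupportEnumeration="urn:oasis:names:tc:SAML:2.0:protocol"/>
  </EntityDescriptor>
  <EntityDescriptor entityID="https://sp.example.org/sp">
    <SPSSODescriptor protocolSupportEnumeration="urn:oasis:names:tc:SAML:2.0:protocol"/>
  </EntityDescriptor>
</EntitiesDescriptor>"""


def idp_entity_ids(metadata_xml):
    """Return the entityID of every entity that declares an IdP role."""
    root = ET.fromstring(metadata_xml)
    ids = []
    for entity in root.iter(f"{{{MD_NS}}}EntityDescriptor"):
        # An entity acts as an IdP if it has an IDPSSODescriptor child.
        if entity.find(f"{{{MD_NS}}}IDPSSODescriptor") is not None:
            ids.append(entity.get("entityID"))
    return ids


if __name__ == "__main__":
    idps = idp_entity_ids(SAMPLE)
    print(len(idps), idps)
```

Run against a downloaded copy of the real feed, the length of the returned list would correspond to the IdP count quoted above, provided the assumption about the feed's structure holds.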

Once the list of hundreds of InCommon Identity Providers (IdPs) is in place I imagine the login page will look something like the login page at https://www.hathitrust.org in the screenshot below:

screen shot 2016-02-29 at 2 27 54 pm

To do for this issue:

After release:

@pdurbin
Member Author

pdurbin commented Mar 7, 2016

Here's how https://demo.dataverse.org looks now that I just did a quick one-off download of http://md.incommon.org/InCommon/InCommon-metadata.xml while I was working on documenting how to configure Dataverse for use with an identity federation:

screen shot 2016-03-07 at 3 54 03 pm

pdurbin added a commit that referenced this issue Mar 8, 2016
- Document API to migrate Shib user to local #2915.
- Add Debugging section for #2916.
- Document identity federation stuff #2937.
- Reference :AllowSignup as part of "remote only" #2838.
@pdurbin
Member Author

pdurbin commented Mar 9, 2016

In 21bd0e8 I documented a bit about how to set this up but I'm not sure what the best practices are. See http://guides.dataverse.org/en/2939-shib/installation/shibboleth.html#identity-federation for a preview of what I wrote.

@pdurbin
Member Author

pdurbin commented Mar 21, 2016

I'm sending this to QA for feedback on what I wrote at http://guides.dataverse.org/en/2939-shib/installation/shibboleth.html#identity-federation

Once some of our test servers have been registered with InCommon as part of #2104 we'll reconfigure shib to use a feed as documented in the link above. I'm still not sure what the best practices are but I hope what I've written provides enough guidance.

@pdurbin pdurbin assigned kcondon and unassigned pdurbin Mar 21, 2016
@kcondon
Contributor

kcondon commented Apr 7, 2016

We've discussed this since and discovered that shibd supports metadata refresh intervals and validation of metadata against a cert as recommended by InCommon. The sample config to achieve updates and validation that should be placed in shibboleth2.xml can be found here: https://spaces.internet2.edu/display/InCFederation/Shibboleth+Metadata+Config#ShibbolethMetadataConfig-ConfiguretheShibbolethSP

Also, the refresh interval in the sample shows 7200 milliseconds rather than the default. It might be good to restore the default value as an example.

@pdurbin
Member Author

pdurbin commented Apr 19, 2016

@kcondon in 5edf6a3 I rewrote the section on identity federations. Thanks for all the feedback. I hope you like this better. As before you can preview the docs at http://guides.dataverse.org/en/2939-shib/installation/shibboleth.html#identity-federation

Regarding reloadInterval it turns out that the value is expressed in seconds as explained at https://wiki.shibboleth.net/confluence/display/SHIB2/NativeSPReloadableXMLFile so I didn't change anything since 7200 (2 hours) is the default (as seen in /etc/shibboleth/shibboleth2.xml.dist) and seems reasonable.
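To make the unit question concrete, here is a small sketch that reads the same value, 7200, first as seconds and then as milliseconds. The helper names are hypothetical; only the number and the "seconds" interpretation come from the discussion above.

```python
from datetime import timedelta

# Value of reloadInterval from the sample config discussed above.
RELOAD_INTERVAL = 7200


def as_seconds(value):
    """Interpret the setting as seconds (what the Shibboleth wiki describes)."""
    return timedelta(seconds=value)


def as_milliseconds(value):
    """Interpret the setting as milliseconds (the earlier misreading)."""
    return timedelta(milliseconds=value)


if __name__ == "__main__":
    print(as_seconds(RELOAD_INTERVAL))       # 2:00:00
    print(as_milliseconds(RELOAD_INTERVAL))  # 0:00:07.200000
```

Read as seconds the value is a two-hour interval, while the millisecond reading would mean refreshing roughly every seven seconds.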

Passing to QA.

@pdurbin pdurbin assigned kcondon and unassigned pdurbin Apr 19, 2016
@kcondon
Contributor

kcondon commented Apr 19, 2016

Doc is good. Waiting for InCommon to test federation and config.

@pdurbin
Member Author

pdurbin commented Jun 8, 2016

Woo hoo! https://dataverse.harvard.edu is now listed as an InCommon Service Provider (SP)! The following screenshot is what you see if you click "Harvard Dataverse" at https://incommon.org/federation/info/all-entities.html#SPs

incommon

Here's what you see at https://incommon.org/federation/info/org.html?orgName=Harvard%20College

screen shot 2016-06-10 at 2 43 37 pm

The next step will be to get https://dataverse.harvard.edu listed under the "Research & Scholarship" category at https://incommon.org/federation/info/all-entity-categories.html because, as I mentioned in my last comment, we recently learned that it is highly unlikely that the 400+ InCommon Identity Providers (IdPs) will release the attributes Dataverse requires (name and email, basically).

@pdurbin
Member Author

pdurbin commented Jun 10, 2016

https://dataverse-test.irss.unc.edu is now listed under "research-and-scholarship" at https://incommon.org/federation/info/all-entity-categories.html#SPs ...

incommon_federation_info_entity_categories_-_2016-06-10_16 21 56

... which resulted in MIT and UIC being able to log in!

Unfortunately, Harvard and Emory users can't log in because neither institution is part of the Research & Scholarship category at https://www.incommon.org/federation/info/all-entity-categories.html#IdPs (only 57 of the 430 InCommon IdPs are part of that category).

I'm keeping track of who can log in at https://docs.google.com/spreadsheets/d/1fWHGamXetTQw3cpf6Tc4efmTvoC97lpEeTLvN7Mxz3g/edit?usp=sharing

@pdurbin
Member Author

pdurbin commented Jun 13, 2016

Unfortunately, Harvard and Emory users can't log in because neither institution is part of the Research & Scholarship category at https://www.incommon.org/federation/info/all-entity-categories.html#IdPs (only 57 of the 430 InCommon IdPs are part of that category).

@donsizemore figured out how to limit the number of institutions that can log in to https://dataverse-test.irss.unc.edu to just the 57 that are part of the Research & Scholarship category by following https://spaces.internet2.edu/display/InCFederation/Migrating+an+SP+to+Global+Research+and+Scholarship . Of course, that leaves Harvard users and many others out in the cold but it makes no sense to offer a login that won't work due to required attributes not being released.
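The filtering idea can be sketched outside of shibd as follows. This is only an illustration, not the configuration from the linked InCommon page: it assumes that an IdP signals Research & Scholarship support through an `http://macedir.org/entity-category-support` entity attribute, and that the category value is one of the two URIs in `RS_VALUES`. The sample metadata, entityIDs, and helper names are made up.

```python
import xml.etree.ElementTree as ET

NS = {
    "md": "urn:oasis:names:tc:SAML:2.0:metadata",
    "mdattr": "urn:oasis:names:tc:SAML:metadata:attribute",
    "saml": "urn:oasis:names:tc:SAML:2.0:assertion",
}

# Assumptions: attribute name and category URIs used to mark R&S support.
SUPPORT_ATTR = "http://macedir.org/entity-category-support"
RS_VALUES = {
    "http://refeds.org/category/research-and-scholarship",
    "http://id.incommon.org/category/research-and-scholarship",
}

# Made-up aggregate: one IdP that supports R&S and one that does not.
SAMPLE = """<md:EntitiesDescriptor
    xmlns:md="urn:oasis:names:tc:SAML:2.0:metadata"
    xmlns:mdattr="urn:oasis:names:tc:SAML:metadata:attribute"
    xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion">
  <md:EntityDescriptor entityID="https://idp.rs.example.edu/idp">
    <md:Extensions>
      <mdattr:EntityAttributes>
        <saml:Attribute Name="http://macedir.org/entity-category-support">
          <saml:AttributeValue>http://refeds.org/category/research-and-scholarship</saml:AttributeValue>
        </saml:Attribute>
      </mdattr:EntityAttributes>
    </md:Extensions>
    <md:IDPSSODescriptor protocolSupportEnumeration="urn:oasis:names:tc:SAML:2.0:protocol"/>
  </md:EntityDescriptor>
  <md:EntityDescriptor entityID="https://idp.plain.example.edu/idp">
    <md:IDPSSODescriptor protocolSupportEnumeration="urn:oasis:names:tc:SAML:2.0:protocol"/>
  </md:EntityDescriptor>
</md:EntitiesDescriptor>"""


def supports_rs(entity):
    """True if the entity lists an R&S value under entity-category-support."""
    path = "md:Extensions/mdattr:EntityAttributes/saml:Attribute"
    for attr in entity.findall(path, NS):
        if attr.get("Name") != SUPPORT_ATTR:
            continue
        values = {(v.text or "").strip()
                  for v in attr.findall("saml:AttributeValue", NS)}
        if values & RS_VALUES:
            return True
    return False


def rs_idps(metadata_xml):
    """Return entityIDs of IdPs that pass the R&S whitelist check."""
    root = ET.fromstring(metadata_xml)
    tag = "{%s}EntityDescriptor" % NS["md"]
    return [e.get("entityID") for e in root.iter(tag)
            if e.find("md:IDPSSODescriptor", NS) is not None and supports_rs(e)]


if __name__ == "__main__":
    print(rs_idps(SAMPLE))
```

Everything not returned by `rs_idps` is what a whitelist-style filter would hide from the login page, which is the trade-off described above.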

@pdurbin
Member Author

pdurbin commented Jun 13, 2016

https://wiki.shibboleth.net/confluence/display/SHIB2/NativeSPMetadataProvider#NativeSPMetadataProvider-ChainingMetadataProvider is interesting. The example shows a URL from a federation and a local XML file:

<MetadataProvider type="Chaining">
    <MetadataProvider type="XML" path="partners.xml"/>
    <MetadataProvider type="XML" url="https://federation.org/metadata.xml" backingFilePath="fedmetadata.xml"/>
</MetadataProvider>

It also says "With V2.4 and above, this is implied by any configuration with multiple elements, so is no longer explicitly needed unless one of its optional settings is required." This means we might not even need to use the "Chaining" stuff. It might just work. Something to investigate since I think we'll need both a feed and a local file in production.

Thank you @donsizemore for figuring out you can use multiple MetadataProvider elements.

@pdurbin
Member Author

pdurbin commented Jun 22, 2016

@donsizemore and I have been talking a lot about how to handle Identity Providers (IdPs) that do not release attributes that Dataverse requires such as "eppn" (a unique identifier for a user).

He pointed out https://spaces.internet2.edu/display/InCFederation/Error+Handling+Service which is a service InCommon members can use to display a somewhat friendlier error message. For example, if a researcher from the University of Texas at Austin tried to log into https://dataverse.harvard.edu we could put the more descriptive error message at https://ds.incommon.org/FEH/sp-error.html?sp_entityID=https%3A%2F%2Fdataverse.harvard.edu%2Fsp&idp_entityID=https%3A%2F%2Fidp.its.utexas.edu%2Fidp%2Fshibboleth in an iframe or something. Here's how that page looks:

incommon_federated_error_handling_-_2016-06-22_08 58 07
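The error page URL above is just the base page plus two URL-encoded entityIDs, so it could be built per IdP along these lines. The base URL and the `sp_entityID` / `idp_entityID` parameter names are taken from the example URL above; the `error_page_url` helper is hypothetical.

```python
from urllib.parse import urlencode

# Base page and parameter names copied from the example URL above.
FEH_BASE = "https://ds.incommon.org/FEH/sp-error.html"


def error_page_url(sp_entity_id, idp_entity_id):
    """Build the Federated Error Handling URL for one SP/IdP pair."""
    query = urlencode({
        "sp_entityID": sp_entity_id,
        "idp_entityID": idp_entity_id,
    })
    return f"{FEH_BASE}?{query}"


if __name__ == "__main__":
    print(error_page_url(
        "https://dataverse.harvard.edu/sp",
        "https://idp.its.utexas.edu/idp/shibboleth",
    ))
```

`urlencode` percent-encodes the `:` and `/` characters in each entityID, which is why the result matches the encoded form shown in the comment.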

In practice, for now anyway, we plan to prevent researchers from the University of Texas at Austin and many other InCommon institutions from even attempting to log in to https://dataverse.harvard.edu with their institutional credentials. We will do this by filtering out the 373 institutions that are not part of the Research & Scholarship category at https://incommon.org/federation/info/all-entity-categories.html#IdPs (only 60 of 433 IdPs are part of it as of this writing).

@pdurbin
Member Author

pdurbin commented Jul 29, 2016

Woo-hoo! https://demo.dataverse.org and https://beta.dataverse.org were just added as InCommon Service Providers! That means we can work on switching from "Specific Identity Provider(s)" mode to "Identity Federation" mode as documented at http://guides.dataverse.org/en/4.4/installation/shibboleth.html#specific-identity-provider-s-vs-identity-federation

Part of this will be setting up periodic metadata refresh (and probably documenting it on the page above via a pull request), which has been a blocker for @djbrooke getting the Harvard Dataverse servers (production, demo, and beta) added to the Research & Scholarship category at https://incommon.org/federation/info/all-entity-categories.html#SPs

murphy:tmp pdurbin$ curl -s http://md.incommon.org/InCommon/InCommon-metadata.xml | grep dataverse | grep entityID
<EntityDescriptor entityID="https://dataverse-test.irss.unc.edu/shibboleth">
<EntityDescriptor entityID="https://dataverse.unc.edu/shibboleth">
<EntityDescriptor entityID="https://beta.dataverse.org/sp">
<EntityDescriptor entityID="https://dataverse.harvard.edu/sp">
<EntityDescriptor entityID="https://demo.dataverse.org/sp">

I'm using the curl command above because it seems like https://incommon.org/federation/info/all-entities.html#SPs isn't working properly. None of those 5 Dataverse servers appear there, as @donsizemore and I have been discussing at http://irclog.iq.harvard.edu/dataverse/2016-07-26#i_38772
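For anyone who wants the same check without the shell pipeline, here is a rough Python equivalent of `grep dataverse | grep entityID`. It works line by line on already-downloaded metadata text; the `matching_entity_ids` helper and the extra `example.edu` line in the sample are made up, while the other sample lines are the output shown above.

```python
import re

# Captures the value of an entityID="..." attribute on a line.
ENTITY_ID = re.compile(r'entityID="([^"]+)"')

# Lines from the output above plus one unrelated, made-up entity.
SAMPLE = """\
<EntityDescriptor entityID="https://dataverse-test.irss.unc.edu/shibboleth">
<EntityDescriptor entityID="https://idp.example.edu/idp/shibboleth">
<EntityDescriptor entityID="https://dataverse.unc.edu/shibboleth">
<EntityDescriptor entityID="https://beta.dataverse.org/sp">
<EntityDescriptor entityID="https://dataverse.harvard.edu/sp">
<EntityDescriptor entityID="https://demo.dataverse.org/sp">
"""


def matching_entity_ids(metadata_text, needle):
    """Return entityIDs found on lines that contain the given substring."""
    ids = []
    for line in metadata_text.splitlines():
        if needle in line:
            ids.extend(ENTITY_ID.findall(line))
    return ids


if __name__ == "__main__":
    for entity_id in matching_entity_ids(SAMPLE, "dataverse"):
        print(entity_id)
```

Like the grep version, this is a text match rather than a real XML parse, so it assumes each `EntityDescriptor` opening tag sits on a single line.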

pdurbin added a commit that referenced this issue Aug 1, 2016
Also remove "experimental" since #2117 has been closed.
@pdurbin
Member Author

pdurbin commented Aug 1, 2016

I just reconfigured https://demo.dataverse.org for "Federated Login Mode" and captured the necessary changes in f0f5ac9 (maxRefreshDelay="3600", which is in seconds, is the key).

Next @djbrooke will be picking up where we left off on the R&S form: https://spaces.internet2.edu/display/InCFederation/Research+and+Scholarship+Application+Form . The demo site is ready to go. We can now check the box next to "My service refreshes and verifies metadata at least daily" for the demo site.

Eventually, the beta and production sites need to be reconfigured as well.

@pdurbin
Member Author

pdurbin commented Aug 2, 2016

@djbrooke and I discussed this issue. I'll take a swing at filling out https://spaces.internet2.edu/display/InCFederation/Research+and+Scholarship+Application+Form myself.

@pdurbin
Member Author

pdurbin commented Aug 2, 2016

For entityID "https://demo.dataverse.org/sp" I filled out https://spaces.internet2.edu/display/InCFederation/Research+and+Scholarship+Application+Form and we are tracking this at https://help.hmdc.harvard.edu/Ticket/Display.html?id=239200

For "How my service supports research and scholarship" I put "Dataverse is open source research data repository software. The Harvard Dataverse is open for all researchers worldwide from all disciplines to deposit data."

I checked the boxes agreeing to https://refeds.org/category/research-and-scholarship and http://www.incommon.org/docs/policies/participationagreement.pdf but someone with more authority such as @djbrooke or @mcrosas should review these documents as well, especially when we fill out this form for our production service.

That's the demo site. The beta site is in use for usability testing today but @kcondon and I plan to reconfigure it soon so that in the R&S form above we can check the box that says "My service refreshes and verifies metadata at least daily." I'll co-assign this issue to him so some knowledge transfer can take place.

Some coordination will be necessary to enable this in production and meet the refresh requirement without confusing users. InCommon members won't be able to log in until production is part of the R&S category.

pdurbin added a commit that referenced this issue Aug 5, 2016
@kcondon
Contributor

kcondon commented Aug 5, 2016

Works, closing.
