TwoRavens is not working with dataverse, even after running the TwoRavens instance #5859

Closed
HasanKhatib opened this issue May 18, 2019 · 9 comments

@HasanKhatib

HasanKhatib commented May 18, 2019

I am using a Dataverse instance from the dataverse-docker repository, but I updated the Dataverse version to 4.14.
Now I am working on adding TwoRavens integration to Dataverse. I followed the documentation, got TwoRavens running on an Apache server, and then exposed the port so that I can use it from my host machine.
Screenshot from 2019-05-18 20-02-16

The problem is that even after configuring Dataverse to add an external tool using the twoRavens.json file, I'm still missing the Explore button next to .dat files.

Here are the steps I followed up to this moment:

  • Installed TwoRavens and its dependencies on the Docker image
  • Exposed port 80 to be used on the host machine
  • Tested dataexplore (TwoRavens) on the host machine and it's working!
  • Updated the twoRavens.json file as follows:
{
  "displayName": "TwoRavens",
  "description": "A system of interlocking statistical tools for data exploration, analysis, and meta-analysis.",
  "type": "explore",
  "toolUrl": "http://localhost:85/dataexplore/gui.html",
  "toolParameters": {
    "queryParameters": [
      {
        "dfId": "{fileId}"
      },
      {
        "key": "{apiToken}"
      }
    ]
  }
}
  • Executed this command to add the external tool:
    curl -X POST -H 'Content-type: application/json' --upload-file twoRavens.json http://localhost:8085/api/admin/externalTools

Finally, the response shows that the external tool was added, but nothing appears on the Dataverse side.
Screenshot from 2019-05-18 20-20-20
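
To make the manifest above concrete, here is a minimal Python sketch of how the `toolUrl` and `queryParameters` entries could be combined into the link behind an Explore button. It only illustrates the placeholder substitution; it is not Dataverse's actual implementation, and the function name `build_launch_url`, the file id `42`, and the token `abc` are made up for the example.

```python
import json
from urllib.parse import urlencode

# The manifest from the issue, trimmed to the fields this sketch uses.
MANIFEST = json.loads("""
{
  "displayName": "TwoRavens",
  "type": "explore",
  "toolUrl": "http://localhost:85/dataexplore/gui.html",
  "toolParameters": {
    "queryParameters": [
      {"dfId": "{fileId}"},
      {"key": "{apiToken}"}
    ]
  }
}
""")


def build_launch_url(manifest, file_id, api_token):
    """Substitute the reserved words and append them as a query string.

    Hypothetical helper: it mimics the idea of the manifest, not the
    server-side code that Dataverse runs.
    """
    values = {"{fileId}": str(file_id), "{apiToken}": api_token}
    params = []
    for entry in manifest["toolParameters"]["queryParameters"]:
        for name, placeholder in entry.items():
            # Unknown placeholders are passed through unchanged.
            params.append((name, values.get(placeholder, placeholder)))
    return manifest["toolUrl"] + "?" + urlencode(params)


if __name__ == "__main__":
    # 42 and "abc" are placeholder values for illustration only.
    print(build_launch_url(MANIFEST, 42, "abc"))
```

Opening the printed URL by hand is a quick way to check whether the tool itself is reachable at the address Dataverse would link to, independent of whether the Explore button is shown.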

@pdurbin
Member

pdurbin commented May 20, 2019

@HasanKhatib hi! First I wanted to say that I noticed samihabakri/D2S-Dataverse#3 about TwoRavens and all the other very interesting issues in that "Design to Software - Dataverse Project" with @TomMiksa and others. If you could talk a little here about the class and how we can help, I'm interested. 😄

Let me try to explain why the Explore button is missing on your file. The problem is that your dta file was not successfully ingested. When a file goes through a successful ingest, it shows observations and looks like this:

Screen Shot 2019-05-20 at 8 03 51 AM

I would suggest trying a different file format. From https://dataverse.harvard.edu/file.xhtml?persistentId=doi:10.7910/DVN/TJCLKP/3VSTKY you can download a tab separated file that should ingest properly (I know because it's mine 😄 ) and look like this:

Screen Shot 2019-05-20 at 8 19 56 AM

I hope this makes sense. Please try that tab separated file and keep the questions coming!
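
As a rough illustration of the ingest check described above: after a successful tabular ingest, Dataverse stores the file in tab-delimited form, so its metadata normally shows a `.tab` filename and a tab-separated content type. The sketch below applies that heuristic to a file record. The record shape (`filename` and `contentType` keys) and the helper name `looks_ingested` are assumptions for the example, so compare them with the metadata your own installation returns.

```python
# Content type that a successfully ingested tabular file is expected to have.
TABULAR_CONTENT_TYPE = "text/tab-separated-values"


def looks_ingested(file_record):
    """Heuristic: True if the record looks like an ingested tabular file.

    file_record is a plain dict with assumed keys "filename" and
    "contentType"; it is a stand-in for real file metadata.
    """
    name = file_record.get("filename", "")
    content_type = file_record.get("contentType", "")
    return content_type == TABULAR_CONTENT_TYPE and name.endswith(".tab")


if __name__ == "__main__":
    # Hypothetical records: one that failed ingest, one that succeeded.
    failed = {"filename": "survey.dta", "contentType": "application/x-stata"}
    ingested = {"filename": "survey.tab", "contentType": TABULAR_CONTENT_TYPE}
    for record in (failed, ingested):
        print(record["filename"], looks_ingested(record))
```

A file that fails this check would not be expected to show the Explore button for a tabular tool such as TwoRavens, which matches the symptom in this issue.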

@TomMiksa

Hi @pdurbin !
Thanks for offering your help!
We offer a project related to the Data Stewardship course at the Vienna University of Technology, Austria [1]. We want our students to get hands-on experience in configuring and running repository systems, so that the DS course is not purely theoretical. Additionally, we are evaluating which of the existing repository systems would be the best fit for a data repository that we're thinking of establishing at our university [2]. The primary goal is to go beyond a pure publication repository and to focus on data. Hence our interest in the TwoRavens integration.

I guess there will be some more students testing Dataverse soon, because we also have other groups working on extending the Dataverse ingest workflow to incorporate information from machine-actionable DMPs [3] - a new standard developed within the Research Data Alliance.

Are you by chance attending the open repositories conference in Hamburg [4]?

[1] https://tiss.tuwien.ac.at/course/courseDetails.xhtml?dswid=1123&dsrid=916&courseNr=194044&semester=2019S
[2] https://www.tuwien.at/en/research/rti-support/research-data/overview/
[3] https://github.com/RDA-DMP-Common/RDA-DMP-Common-Standard
[4] https://or2019.blogs.uni-hamburg.de

@djbrooke
Contributor

Hey @TomMiksa, thanks for all the details. To respond to one of the points, OR2019 unfortunately wasn't in the budget this year. :( We sent some folks last year. It's a fantastic conference. There are usually some folks from the larger Dataverse Community there.

@skasberger
Contributor

Hi @TomMiksa

I am the DevOps engineer of AUSSDA - The Austrian Social Science Data Archive. We use Dataverse, and are very happy with it. For non-production usage I can recommend the DataverseEU docker container 2. After 15min it runs on your machine, and then you can start to configure it yourself.
If you have any further questions, simply get in touch with me.

@pdurbin
Member

pdurbin commented May 22, 2019

@TomMiksa your course sounds great and I love that the project is so hands on!

With regard to an installation of Dataverse at TU Wien, you and others are of course welcome to open additional issues or ask at https://groups.google.com/forum/#!forum/dataverse-community . This morning @skasberger , the author of https://github.com/AUSSDA/pyDataverse (just released on PyPI!) said that you and he seem to know a lot of people in common: http://irclog.iq.harvard.edu/dataverse/2019-05-22#i_94617 and you and others are welcome to join that conversation at http://chat.dataverse.org . (Whoops, while I was writing this I see Stefan commented here already. 😄 ) The Dataverse team also offers installation assistance privately if you'd like to email support@dataverse.org to create a ticket.

As you may know, Dataverse is not intended to be used primarily for publications but I'm glad to hear you plan to focus on data. This issue is about TwoRavens but I'd encourage you to also check out Data Explorer, which is one of the new "external tools" listed at http://guides.dataverse.org/en/4.14/installation/external-tools.html . Just to give you a quick visual, here are some screenshots of Data Explorer in action by clicking "Explore" over at https://dataverse.scholarsportal.info/file.xhtml?fileId=8988

Screen Shot 2019-05-22 at 5 58 56 AM

Screen Shot 2019-05-22 at 5 59 34 AM

To be clear, you can have both Two Ravens and Data Explorer at the same time. The "Explore" button becomes a drop down. And there are other external tools listed on that page that you might want to try out. The author of the File Previewers tools ( @qqmyers ) just got a shout out during our community call yesterday. Oh, you and others are welcome to call in to those too. They happen every other Tuesday: https://dataverse.org/community-calls

I don't know a lot about DMP but thanks for the pointer to that new standard. People in the Dataverse community talk about DMPTool now and then and you can find a thread on this topic at https://groups.google.com/d/msg/dataverse-community/JJB33tqykrI/qLysUwI6BgAJ

I had the choice between Berlin and Hamburg and ended up giving a talk at Open Science Days. 😄 You can find my slides at https://osd.mpdl.mpg.de/?page_id=1250 or https://dataverse.org/presentations/research-software-and-dataverse . I was very glad to meet @poikilotherm there and catch up with @RightInTwo . As @djbrooke said, I suspect that some Dataverse people might be in Hamburg. You are welcome to post a thread about this at https://groups.google.com/forum/#!forum/dataverse-community to see if there are others you can meet up with!

I hope this helps. Please keep the questions coming!

@TomMiksa

Thank you all (@skasberger, @pdurbin, @djbrooke) for the replies and the interesting links!

@skasberger I think the handshake theorem works very well and we’re just one handshake away. I think we even have a meeting scheduled with the AUSSDA team in September – ask Veronika :) Some of the students were also testing your OAI-PMH endpoint recently - check your logs :) We were basically discussing whether we need OpenAIRE compliance or whether we only connect to DataCite. The decision for now is that DataCite should be enough. Scholix should make our data visible to OpenAIRE anyway.

@pdurbin Maybe I was not super clear. In a perfect world, we want to have no publications in the Dataverse, but I am afraid this cannot be avoided. If we create a Dataverse per project, then people will likely put project deliverables, PowerPoint slides, etc. there. In fact, this aligns with the vision of research objects, which store data together with the other artefacts that are necessary to interpret the data.
Our long-term goal would be to build infrastructure around Dataverse in such a way that we could host databases that people use during their projects (insert, delete, update, select). They could cite their content at a specific moment in time - when they used it for computation. To do that, we would rely on the RDA Data Citation recommendations.
As a result, we would do away with the division into the phases “managing active data during the project” and “repository deposit”. Instead we would have one system for all phases.

@pdurbin
Member

pdurbin commented May 23, 2019

@TomMiksa In Dataverse 4.14 OpenAIRE-compliant exports via OAI-PMH were added (in pull request #4664 for issue #4257 ... thank you to @fcadili for finishing it!) and if you'd like to play around with these, you can check which version of Dataverse is installed by using this "version" endpoint: https://data.aussda.at/api/info/version . https://dataverse.harvard.edu has already been upgraded to Dataverse 4.14 and if you'd like to get a list of other installations that are happy to let you harvest from them, you can visit this "List of Dataverse installation OAI-PMH (Harvesting) URLs and sets" spreadsheet: https://docs.google.com/spreadsheets/d/12cxymvXCqP_kCsLKXQD32go79HBWZ1vU_tdG4kvP5S8/edit?usp=sharing
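
For the version check mentioned above, here is a small Python sketch that parses a response from the "version" endpoint and tests whether the installation is at least 4.14. The sample payloads are assumed examples of what the endpoint returns (a JSON object with the version under `data.version`), and `parse_version` and `supports_openaire_export` are hypothetical helper names; real installations may report version strings in other forms.

```python
import json


def parse_version(payload):
    """Turn the endpoint's JSON body into a tuple such as (4, 14)."""
    body = json.loads(payload)
    version = body["data"]["version"]
    # Keep only purely numeric dot-separated parts.
    return tuple(int(part) for part in version.split(".") if part.isdigit())


def supports_openaire_export(payload):
    """True if the reported version is 4.14 or later."""
    return parse_version(payload) >= (4, 14)


if __name__ == "__main__":
    # Assumed sample responses, not captured from a live server.
    new = '{"status": "OK", "data": {"version": "4.14", "build": "example"}}'
    old = '{"status": "OK", "data": {"version": "4.9.4", "build": "example"}}'
    print(supports_openaire_export(new))
    print(supports_openaire_export(old))
```

The payload string would come from fetching the `/api/info/version` URL of the installation you want to harvest from; the sketch leaves the HTTP call out so that it runs without network access.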

When I search my email I find a reference to http://www.researchobject.org from 2014 but this website wasn't really on my radar so thanks for linking us to it. When I hear of code and data together I think of what Dataverse calls "replication datasets" and you might be interested in my 5 minute lightning talk on this subject (slides: https://www.slideshare.net/philipdurbin/reproducibility-and-dataverse , video: https://www.youtube.com/watch?v=SuyQTsOGugc , transcript: http://wiki.greptilian.com/talks/2018/wholetale-reproducibility-and-dataverse/ ). Recently, we were awarded a grant to integrate Dataverse with Code Ocean and you can read more about the progress in #5028. There are related community efforts to integrate with Binder from the Jupyter project at #4714 and integration with Whole Tale has been kicked off and documented in #5097 (some nice screenshots).

Yes, being able to cite your content at various moments in time is critical. That's why Dataverse supports versioning, which you can read about at http://guides.dataverse.org/en/4.14/user/dataset-management.html#dataset-versions . We are also quite happy for a lot of the pre-publication work to happen outside Dataverse in systems such as Dropbox, Open Science Framework (OSF), RSpace. We highlight these specifically at http://guides.dataverse.org/en/4.14/admin/integrations.html#getting-data-in but these integrations are built on Dataverse APIs and anyone can integrate however they see fit.

@djbrooke
Contributor

djbrooke commented Oct 2, 2019

Hi all - I'm going to close out this specific issue since it seems to be resolved, but please feel free to reach out generally on the Google Group or through another communication channel for the discussion of any other topics!

@djbrooke djbrooke closed this as completed Oct 2, 2019
@pdurbin
Member

pdurbin commented Oct 2, 2019

@djbrooke I should mention that when @landreev and I met with @raprasad the other day and I showed him how TwoRavens is broken on demo, he said he's fine with simply removing it. Maybe we should have an issue tracker for demo? As I've said I have BIG PLANS for demo some day, if we can get DCM and other goodies rolled in to dataverse-kubernetes: gdcc/dataverse-kubernetes#68 . I feel like all this is related to #5688.
