Need help in DCM installation #36

Open
Thanh-Thanh opened this issue Nov 28, 2019 · 6 comments

Comments

@Thanh-Thanh

Hi all,

We're trying to install DCM on our Test environment (Dataverse 4.10.1).

We've installed DCM as described in the installation documentation and configured Dataverse accordingly (with :DataCaptureModuleUrl and :UploadMethods).
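
For reference, Dataverse database settings like these two are normally written through the admin settings API. The sketch below only builds the two PUT requests and prints them; the host names, the DCM URL, and the `dcm/rsync+ssh` value are assumptions that should be checked against the Dataverse guide for the installed version.

```python
import urllib.request


def build_setting_request(dataverse_url, name, value):
    """Build (but do not send) a PUT request for one Dataverse database setting."""
    url = f"{dataverse_url.rstrip('/')}/api/admin/settings/{name}"
    return urllib.request.Request(url, data=value.encode("utf-8"), method="PUT")


# Hypothetical values: replace with the real Dataverse and DCM addresses.
requests_to_send = [
    build_setting_request("http://localhost:8080", ":DataCaptureModuleUrl",
                          "http://dcm.example.org"),
    build_setting_request("http://localhost:8080", ":UploadMethods",
                          "dcm/rsync+ssh"),
]

for req in requests_to_send:
    # Printing only; to apply a setting, pass the request to urllib.request.urlopen(req).
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

The admin API is usually reachable only from localhost, so a script like this would be run on the Dataverse server itself.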

However, when we call the Dataverse API to get the upload script (api/datasets/:persistentId/dataCaptureModule/rsync?persistentId=$PERSISTENTID), we get this error message in the log:

There was a problem getting the script for XGRMMT . DCM returned status code: 404]]
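
To make the failing call reproducible, here is a minimal sketch of the same request using only the Python standard library. The base URL, the DOI prefix in the example, and the API token are placeholders (assumptions, not values from this installation); only the endpoint path and the `XGRMMT` identifier come from the report above. The `X-Dataverse-key` header is the usual way to pass a Dataverse API token.

```python
import urllib.error
import urllib.parse
import urllib.request


def build_rsync_script_url(dataverse_url, persistent_id):
    """Return the URL of the DCM rsync-script endpoint for one dataset."""
    query = urllib.parse.urlencode({"persistentId": persistent_id})
    base = dataverse_url.rstrip("/")
    return f"{base}/api/datasets/:persistentId/dataCaptureModule/rsync?{query}"


def fetch_upload_script(dataverse_url, persistent_id, api_token):
    """Request the upload script; return (HTTP status, response body)."""
    req = urllib.request.Request(
        build_rsync_script_url(dataverse_url, persistent_id),
        headers={"X-Dataverse-key": api_token},
    )
    try:
        with urllib.request.urlopen(req, timeout=30) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # Error responses still carry a body that may explain the failure.
        return err.code, err.read().decode("utf-8", errors="replace")


# Hypothetical dataset identifier; no request is sent here.
print(build_rsync_script_url("http://localhost:8080", "doi:10.5072/FK2/XGRMMT"))
```

Calling `fetch_upload_script(...)` with a real token returns the status code together with the response body, which is sometimes more informative than the one-line message in server.log.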

We have no idea what the actual problem is, or whether the DCM is installed correctly.

Can someone please tell us how to check the DCM installation?

Thanks in advance,

Thanh Thanh

@Thanh-Thanh
Author

Hi all,
It's me again :)
We investigated this problem a little further.
Here is what we have found so far:

  • The file /deposit/gen/upload-.bash is not created, which leads to the 404 error we saw in the Dataverse server.log.

However, we're stuck on investigating further: nothing is written to /var/log/lighttpd/breakage.log or /var/log/lighttpd/error.log.
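
A quick way to capture this state in one place is sketched below. It only reports whether each path exists and how large it is; the three paths are the ones mentioned above, and the script assumes it runs on the DCM host with permission to read them.

```python
import os


def describe_path(path):
    """Return a one-line summary of a file or directory."""
    if not os.path.exists(path):
        return f"{path}: missing"
    try:
        if os.path.isdir(path):
            return f"{path}: directory, {len(os.listdir(path))} entries"
        return f"{path}: file, {os.path.getsize(path)} bytes"
    except OSError as err:
        # Typically a permissions problem.
        return f"{path}: unreadable ({err})"


PATHS = [
    "/deposit/gen",                    # where the generated upload scripts are expected
    "/var/log/lighttpd/breakage.log",
    "/var/log/lighttpd/error.log",
]

for path in PATHS:
    print(describe_path(path))
```

A "missing" log file and an empty log file point to different causes (lighttpd not configured to log there versus no errors being raised), so the distinction is worth recording.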

With this new info, can someone tell us how to proceed to find the real problem?

Thanks in advance,

Thanh Thanh

@pameyer
Member

pameyer commented Nov 30, 2019

Hi Thanh-Thanh,

Assuming that lighttpd is working correctly (lighttpd running, calls showing up in /var/log/lighttpd/access.log, etc.), the next stage that might have a problem would be rq. Is the rq service running, and is there anything informative in the rq logfile (/var/log/rq.log, if I'm remembering correctly)?
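
Those checks could be scripted roughly as follows. This is only a sketch under several assumptions: that the host uses systemd, that the units are named `lighttpd` and `rq`, and that the log paths are the ones given above (the rq path is from memory and may differ).

```python
import subprocess
from collections import deque


def service_is_active(name):
    """Return True if `systemctl is-active` reports the unit as active."""
    try:
        result = subprocess.run(
            ["systemctl", "is-active", "--quiet", name], check=False
        )
    except FileNotFoundError:
        return False  # systemctl is not available on this host
    return result.returncode == 0


def tail(path, n=20):
    """Return the last n lines of a text file, or a one-line error message."""
    try:
        with open(path, encoding="utf-8", errors="replace") as fh:
            return list(deque(fh, maxlen=n))
    except OSError as err:
        return [f"cannot read {path}: {err}\n"]


def report(services=("lighttpd", "rq"),
           logs=("/var/log/lighttpd/access.log", "/var/log/rq.log")):
    """Print the service state and the end of each log file."""
    for name in services:
        print(f"{name}: {'active' if service_is_active(name) else 'not active'}")
    for path in logs:
        print(f"--- {path}")
        print("".join(tail(path)), end="")
```

Running `report()` on the DCM host right after a failed script request should show whether the request reached lighttpd and whether rq picked up a job.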

To test whether a DCM is set up correctly, you could use https://github.com/sbgrid/data-capture-module/blob/master/scripts/test_client.sh (which is what the Jenkins pipelines use to test in isolation).

@pkiraly

pkiraly commented Jan 27, 2020

I have some questions regarding this software:

  • Which Python version does it need?
  • Is it planned to release a Debian package as well?
  • The Dataverse manual says: "Note that shared storage (posix or AWS S3) between Dataverse and your DCM is required. You cannot use a DCM with Swift at this point in time." -- is it possible to install it on the Dataverse server? If yes: is lighttpd necessary, or would it be possible to use Apache HTTP Server instead?

Thanks,
Péter

@pameyer
Member

pameyer commented Jan 27, 2020

Hi @pkiraly

  • Python 2.7; a Python 3 upgrade should be fairly straightforward, but I haven't investigated whether there will be dependency issues.
  • There aren't any current plans for a Debian package, but I'm open to pull requests if someone wants to add packaging / testing scripts.
  • For security, the DCM should be run on an isolated "disposable" system (with the caveat that the DCM's precursor was designed with the same assumption of disposability, has been in production for several years, and shows no signs of needing to be disposed of and reset). Technically speaking, the Python API could be adapted to Apache in addition to lighttpd, and there are no intrinsic limitations preventing Dataverse and DCM from running on the same server. However, this is a really bad idea for security and performance reasons, which is why they were designed to be isolated.

@pkiraly

pkiraly commented May 19, 2021

@pameyer

Some suggestions for the installation documents:

  • Please specify that it should run on a distinct system, not on the Dataverse server.
  • I found during installation that it depends on the lighttpd server. It would be great to create a "Prerequisites" section that lists it and any other required components.

@pameyer
Member

pameyer commented May 19, 2021

@pkiraly - Thanks for the feedback!

For distinct systems, do you have suggestions for how to improve the wording "install it on a stand-alone, disposable system" in https://github.com/sbgrid/data-capture-module/blob/master/doc/installation.md to make this clearer?

For dependencies, those are currently specified in the RPM (and in my testing, installing the RPM also installed the dependencies); but it sounds like a good idea to mention them explicitly.
