questions about optimizing image downloads - currently rate stuck at 1 image per 5 seconds #69

Closed
jhpoelen opened this issue Feb 3, 2023 · 13 comments


jhpoelen commented Feb 3, 2023

Hi @amilworks -

Thanks for your work on developing BisQue.

I am working with an instance of BisQue hosted by CyVerse, and I'd like to understand more efficient ways of accessing large sets of images through BisQue.

Currently, I am using URLs like:

https://bisque.cyverse.org/image_service/image/00-D9CTqQSfzuC553PxBwbt4d/resize:4000/format:jpeg

and

https://bisque.cyverse.org/image_service/image/00-D9CTqQSfzuC553PxBwbt4d

and it appears that, regardless of image size, the transfer rate is limited to about 1 image per 5 seconds.

Is this expected?

Do you have suggestions on how to make accessing hundreds of thousands of images through your service a little faster?

Thanks for your time in reviewing my questions.

For context, see bio-guoda/preston-brit-2022#3.

fyi @jbest
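Editorial sketch: the snippet below shows how the two URL forms quoted above could be generated and what the reported rate implies for a large batch. It is a minimal illustration, not BisQue's documented client API. The helper names (`build_image_url`, `estimate_hours`, `download_all`), the worker count, and the output layout are assumptions. Whether parallel requests actually raise throughput is unknown, since the limit may be enforced on the server side.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from urllib.request import urlopen

# Base of the image_service URLs quoted in the question.
BASE_URL = "https://bisque.cyverse.org/image_service/image"


def build_image_url(resource_id, resize=None, fmt=None):
    """Build an image_service URL following the pattern quoted above."""
    parts = [BASE_URL, resource_id]
    if resize is not None:
        parts.append(f"resize:{resize}")
    if fmt is not None:
        parts.append(f"format:{fmt}")
    return "/".join(parts)


def estimate_hours(image_count, seconds_per_image=5.0, workers=1):
    """Rough wall-clock estimate; assumes the delay applies per connection."""
    return image_count * seconds_per_image / workers / 3600


def download(resource_id, out_dir, timeout=60):
    """Fetch one original image and write it to out_dir/<resource_id>."""
    target = Path(out_dir) / resource_id
    with urlopen(build_image_url(resource_id), timeout=timeout) as response:
        target.write_bytes(response.read())
    return target


def download_all(resource_ids, out_dir, workers=4):
    """Download several images with a small worker pool (hypothetical size)."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda rid: download(rid, out_dir), resource_ids))
```

At 5 seconds per image, `estimate_hours(100_000)` comes to roughly 139 hours over a single connection, which is why a bulk transfer route matters at this scale.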


jhpoelen commented Feb 3, 2023


edwins commented Feb 6, 2023

@jhpoelen I suggest transferring large data sets directly through the CyVerse Data Store, for example with the command-line tools (iCommands), rather than through BisQue: https://learning.cyverse.org/ds/move_data/


jhpoelen commented Feb 6, 2023

@edwins thanks for your pointer and glad to hear there's another method to access CyVerse resources.

I tried to install the iCommands CLI on the target system (Linux larus 5.15.0-58-generic #64-Ubuntu, i.e. Ubuntu 22.04.1 LTS) and found:

$ sudo apt install irods-icommands
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
 irods-icommands : Depends: irods-runtime (= 4.3.0)
                   Depends: libssl1.1 but it is not installable
E: Unable to correct problems, you have held broken packages.

and

$ sudo apt install irods-runtime
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
 irods-runtime : Depends: libssl1.1 but it is not installable
E: Unable to correct problems, you have held broken packages.

Do you happen to know who to contact about this installation issue?


iychoi commented Feb 13, 2023

Hi @jhpoelen

If you have dependency issues while installing iCommands, Gocommands is another option. It is highly portable.

https://github.com/cyverse/gocommands

@jhpoelen
Author

@iychoi - thanks! I was able to download Gocommands, and run it -

$ ./gocmd 
Usage:
  gocmd [subcommand] [flags]
  gocmd [command]

Available Commands:
  bput         Bundle-upload files or directories
  bun          Extract iRODS data-objects in a structured file format to target collection
  cat          Display the content of an iRODS data-object
  cd           Change current working iRODS collection
  completion   Generate the autocompletion script for the specified shell
  copy-sftp-id Copy SSH public key to iRODS for SFTP access
  cp           Copy iRODS data-objects or collections to target collection
  env          Print current irods environment
  get          Download iRODS data-objects or collections
  help         Help about any command
  init         Initialize gocommands
  ls           List entries in iRODS collections
  mkdir        Make iRODS collections
  mv           Move iRODS data-objects or collections to target collection, or rename data-object or collection
  passwd       Change iRODS user password
  ps           List processes
  put          Upload files or directories
  pwd          Print current working iRODS collection
  rm           Remove iRODS data-objects or collections
  rmdir        Remove iRODS collections
  svrinfo      Display server information
  sync         Sync local directory with iRODS collection

Flags:
  -c, --config string     Set config file or dir (default is $HOME/.irods)
  -d, --debug             Enable debug mode
  -e, --envconfig         Read config from environmental variables
  -h, --help              Print help
  -R, --resource string   Set resource server
  -s, --session int32     Set session ID (default -1)
  -v, --version           Print version

Use "gocmd [command] --help" for more information about a command.

Now, I'll try and figure out how to configure the tool to get the images associated with endpoints like -

curl -I https://bisque.cyverse.org/image_service/image/00-D9CTqQSfzuC553PxBwbt4d

see also bio-guoda/preston-brit-2022#4 and associated repo https://github.com/bio-guoda/preston-brit--2022 .

If you have any tips or tricks, please do share!

@jhpoelen
Author

Does CyVerse support anonymous read access via gocmd?


iychoi commented Feb 13, 2023

You can find the physical path of an image in its description/attributes when you click the image in BisQue.

Yes, CyVerse allows anonymous read access to most datasets in /iplant/home/shared, but not to data under users' home directories.
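Editorial sketch: once an image's physical path is known, a quick prefix check can flag whether it sits under the shared tree mentioned above. This is only a heuristic, because the reply says "most" shared datasets are readable anonymously, not all. The function name and the example paths are hypothetical.

```python
import posixpath

# Shared tree named in the reply above.
SHARED_ROOT = "/iplant/home/shared"


def is_under_shared(irods_path, shared_root=SHARED_ROOT):
    """Return True if irods_path resolves to a location inside shared_root."""
    # normpath collapses "..", so a path that escapes the tree is rejected.
    normalized = posixpath.normpath(irods_path)
    return normalized == shared_root or normalized.startswith(shared_root + "/")


# Hypothetical paths, for illustration only.
print(is_under_shared("/iplant/home/shared/some_dataset/image.tif"))
print(is_under_shared("/iplant/home/some_user/image.tif"))
```

A path that passes this check may still be unreadable for the anonymous user; the check only rules out paths that are clearly in a user's home.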

@jhpoelen
Author

Yay for anonymous access!

Following your tip and https://learning.cyverse.org/ds/gocommands/#anonymous-access-to-the-cyverse-data-store,

I tried:

$ ./gocmd init
iRODS Host [data.cyverse.org]: 
iRODS Port [1247]: 
iRODS Zone [iplant]: 
iRODS Username: anonymous
iRODS Password: 
Please provide password

iRODS Password: 

Apologies for asking these silly questions; this is my first time trying to use the gocmd tool.


iychoi commented Feb 13, 2023

The bug was fixed in gocommands v0.4.5. Please check the version.

@jhpoelen
Author

$ ./gocmd --version
{
  "clientVersion": "v0.4.2",
  "gitCommit": "d2a4bae8ae8f2a8cc885ffcbd9949506245e0a46",
  "buildDate": "2023-01-19T20:27:50Z",
  "goVersion": "go1.18.6",
  "compiler": "gc",
  "platform": "linux/amd64"
}

So, I have an older version. I'll upgrade in a minute.
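Editorial sketch: the comparison made by eye above (client v0.4.2 versus the fix in v0.4.5) can be automated by parsing the JSON that `gocmd --version` prints. The function names are assumptions, and the sample string is a shortened copy of the output shown above.

```python
import json

# Version named in the maintainer's reply as containing the fix.
FIXED_IN = "v0.4.5"


def parse_version(tag):
    """Turn 'v0.4.2' into (0, 4, 2) so versions compare numerically."""
    return tuple(int(part) for part in tag.lstrip("v").split("."))


def needs_upgrade(version_json, fixed_in=FIXED_IN):
    """Return True if the reported clientVersion predates fixed_in."""
    client = json.loads(version_json)["clientVersion"]
    return parse_version(client) < parse_version(fixed_in)


# Shortened copy of the --version output quoted above.
reported = '{"clientVersion": "v0.4.2", "platform": "linux/amd64"}'
print(needs_upgrade(reported))
```

Comparing integer tuples rather than raw strings avoids mis-ordering versions such as v0.10.0 and v0.9.9.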

By the way -

I was also able to connect via sftp -

sftp anonymous@data.cyverse.org 

Thanks again for taking the time to reply.


iychoi commented Feb 13, 2023

Yes, SFTP is another convenient way to access your data and it also supports anonymous access.
Great to hear that you found it. 👍

@amilworks
Member

Huge thanks to the Cyverse folks at UofA for helping with this issue!

@jhpoelen Hopefully we answered your iRODS latency issue questions. For now, I will close this issue, but feel free to reopen or reach out if you have further questions.


jhpoelen commented Mar 9, 2023

@amilworks thanks for the follow-up on this issue. Unfortunately, I was unable to work out how to retrieve the BisQue CyVerse images via SFTP or iRODS. I am sure it is possible, but after spending some time on it, I gave up.

However, I was able to use some URL endpoint to retrieve the (original) images via the BisQue web stack.

See, e.g., bio-guoda/preston-brit-2022#3 (comment).

Again, I appreciate your replies and hope that some day I'll have more time / reason to better understand your sophisticated BisQue tool to manage and analyze images and image volumes.
