Terra Optional Param Choices #107

Closed
samuel-marsh opened this issue Jul 16, 2021 · 3 comments

Comments

@samuel-marsh

Hi Stephen & Team,

Just wondering if you could provide any guidance on a couple of the optional parameters when running the workflow on Terra for optimal performance (mainly speed).

Given that the workflow runs on a GPU, I'm wondering about the possible values of hardware_gpu_type: have you seen performance differences between the various compatible GPUs available via Terra/GCP?

Also wondering if you have a recommendation for hardware_memory_GB (I assume this means GPU memory?) or for any of the other optional parameters that improve run time.

Thanks very much and love the Terra workflow setup!
Sam

@sjfleming
Member

Hi @samuel-marsh ,

Yes! My favorite GPU for this is the Tesla T4. It has a lot of memory, and it's cheaper (and marginally faster) than the Tesla K80, so "nvidia-tesla-t4" is currently the default. That being said, I've been focused on cost, so I've only tried the K80 and the T4, which have a lower cost per hour than other GPUs. In theory, some of the more expensive and beefier GPUs might be faster, in which case the overall price difference might not be what I'd expect. I'm not sure about that, and I have not tested other GPUs yet.

The parameter hardware_memory_GB is actually for CPU memory. I should make that clearer. The GPU memory is completely fixed and dictated by the GPU type. It is not adjustable.
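
As a rough illustration, here is a minimal Python sketch that writes these optional inputs out as JSON. The bare keys are the parameter names used in this thread; Terra normally prefixes input names with the workflow and task name, which is left out here. The value 32 for hardware_memory_GB is purely a placeholder, not a recommendation.

```python
import json

# Optional inputs discussed above. Key names come from this thread;
# the workflow/task prefix that Terra adds to input names is omitted.
optional_inputs = {
    "hardware_gpu_type": "nvidia-tesla-t4",  # current default per this thread
    "hardware_memory_GB": 32,                # CPU memory, NOT GPU memory (placeholder value)
}

# Serialize to a JSON string that could be adapted into an inputs file.
inputs_json = json.dumps(optional_inputs, indent=2)
print(inputs_json)
```

GPU memory does not appear as an input because it is determined entirely by the chosen GPU type.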

Currently, the only other way to get a speedup would be to try to minimize total_droplets_included. But be a little cautious: if this number is too small, you might end up missing some real cells. The idea is for this number to include all the droplets that you think might possibly contain a cell. That is to say, on the UMI curve, all droplets past this point should be "surely empty". But you don't have to make this number too large. Runtime scales linearly with total_droplets_included.
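
A minimal Python sketch of that reasoning, under stated assumptions: the helper names, the toy UMI counts, and the cutoff of 30 are all hypothetical, and the cutoff is something you would choose yourself by looking at the UMI curve. This is not how the tool picks anything internally.

```python
def droplets_to_include(umi_counts, surely_empty_below):
    """Count droplets with at least `surely_empty_below` UMIs.

    Everything under the cutoff is treated as "surely empty", so this
    count is a candidate value for total_droplets_included.
    """
    return sum(1 for c in umi_counts if c >= surely_empty_below)


def scaled_runtime(baseline_minutes, baseline_droplets, new_droplets):
    """Extrapolate runtime, assuming it scales linearly with the
    number of droplets included (as stated above)."""
    return baseline_minutes * new_droplets / baseline_droplets


# Toy per-droplet UMI totals (hypothetical data).
umi_counts = [5000, 4200, 3900, 800, 350, 120, 40, 35, 30, 28, 5, 3]

n_included = droplets_to_include(umi_counts, surely_empty_below=30)
print(n_included)

# Halving the droplets included should roughly halve the runtime.
print(scaled_runtime(baseline_minutes=60, baseline_droplets=30000, new_droplets=15000))
```

Setting the cutoff too high shrinks the count and risks dropping real cells, so it is safer to err on the side of a lower cutoff.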

Soon there will be some code updates coming that will reduce the runtime. I am hoping to get those up within the next month or so.

@sjfleming
Member

@sjfleming make this more clear in the documentation

@sjfleming sjfleming self-assigned this Apr 1, 2022
@sjfleming sjfleming added the enhancement New feature or improvement label Apr 1, 2022
@sjfleming sjfleming added this to the v0.3.0 milestone Apr 1, 2022
@sjfleming sjfleming mentioned this issue Mar 28, 2023
@sjfleming sjfleming mentioned this issue Aug 6, 2023
@sjfleming
Member

Closed by #238
