Keras DataGenerators #2

Open
Lswhiteh opened this issue Aug 19, 2020 · 4 comments

Comments

@Lswhiteh

Hi there, I was looking through your codebase and paper, and after your mention of the data being too large to load into memory, I was wondering whether you had heard of or tried Keras DataGenerators. They stream data on the fly, and you can customize them quite a bit.

I have some custom generators built for some other genetics deep learning networks I'm working on. Would you mind if I submitted a pull request in the next few days, after testing whether a generator works? I might also functionalize the script while I'm at it so it can be more modular for the actual generation process.
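
For anyone following along, here is a minimal sketch of the kind of generator I mean, built on `keras.utils.Sequence`. It is not code from this repo: the class name `SnpBatchGenerator`, the array layout, and the default batch size are all placeholders I made up for illustration. The `features` array could be a memory-mapped file (e.g. from `np.load(..., mmap_mode="r")`), so only the rows for the current batch are read from disk.

```python
import numpy as np

try:
    from tensorflow.keras.utils import Sequence
except ImportError:  # fallback so the sketch still runs without TensorFlow
    Sequence = object


class SnpBatchGenerator(Sequence):
    """Hypothetical generator: serves (features, targets) one batch at a time."""

    def __init__(self, features, targets, batch_size=32, shuffle=True, seed=0):
        super().__init__()
        self.features = features    # array-like, may be a np.memmap on disk
        self.targets = targets
        self.batch_size = batch_size
        self.shuffle = shuffle
        self.rng = np.random.default_rng(seed)
        self.indices = np.arange(len(features))
        self.on_epoch_end()

    def __len__(self):
        # Number of batches per epoch (the last batch may be smaller).
        return int(np.ceil(len(self.indices) / self.batch_size))

    def __getitem__(self, batch_idx):
        start = batch_idx * self.batch_size
        # Sorted indices keep reads from a memory-mapped file roughly sequential.
        idx = np.sort(self.indices[start:start + self.batch_size])
        return np.asarray(self.features[idx]), np.asarray(self.targets[idx])

    def on_epoch_end(self):
        # Reshuffle the sample order between epochs.
        if self.shuffle:
            self.rng.shuffle(self.indices)
```

An instance of this can be passed straight to `model.fit(...)` in place of in-memory arrays. Note that it only limits how much data sits in host RAM at once; each batch still has to fit on the GPU.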

@andrewkern
Member

Hey @Lswhiteh. Yeah we have used DataGenerators for other projects and would definitely welcome PRs here!

In this particular project I think the problem with memory is not the batch size per se, but instead the size of the individual tensors being passed to the GPU. Is that correct @cjbattey?

@cjbattey
Collaborator

Yeah, the memory issue I ran into (i.e. when running more than a million SNPs) was in loading the tensors for a single minibatch into GPU memory, so I'm not sure a generator would help there. Definitely open to any PRs testing it, though!
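
As a rough back-of-the-envelope illustration of why a single minibatch can be the bottleneck, the snippet below estimates the size of one dense input tensor. The shape `(batch_size, n_snps, n_samples)`, the sample count of 100, and the `float32` dtype are assumptions chosen for illustration, not this project's actual input layout, and the helper name `minibatch_gib` is made up.

```python
import numpy as np


def minibatch_gib(batch_size, n_snps, n_samples, dtype="float32"):
    """Size in GiB of one dense (batch_size, n_snps, n_samples) tensor."""
    n_bytes = batch_size * n_snps * n_samples * np.dtype(dtype).itemsize
    return n_bytes / 2**30


# Hypothetical numbers: 32 examples, one million SNPs, 100 samples each.
print(f"{minibatch_gib(32, 1_000_000, 100):.1f} GiB")  # prints "11.9 GiB"
```

Under those assumptions the inputs for one batch alone come to about 12 GiB before any activations or gradients are counted, and a generator does not shrink that; only a smaller batch, fewer SNPs per tensor, or a narrower dtype would.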

@Lswhiteh
Author

Sounds good. I'm waiting for another model to finish training so I might fiddle around with it in my free time. I'm also not sure if generators would help in that case, but worth a shot since it's already written.

I'll let you know how it goes!

@andrewkern
Member

right on. thanks!
