Is there a specific reason not to let the trainer run in memory? #16

Closed
FraFabbri opened this issue Mar 22, 2024 · 4 comments
Labels
question Further information is requested

Comments

@FraFabbri

if not DataDreamer.initialized() or DataDreamer.is_running_in_memory():

@AjayP13
Collaborator

AjayP13 commented Mar 22, 2024

Only because we can then assume there is a directory available to store checkpoints & store the final model weights. The training code relies on a directory being available to make things easier.

Do you have a use case for in-memory training? If you want to share, you can email me at ajayp@seas.upenn.edu and I can try to support it.

@FraFabbri
Author

Makes sense, thanks for the explanation :)

My intuition was that letting the trainer run in memory would help with fast experimentation, e.g. when we just want to run a notebook and don't necessarily need to store the model weights.

@AjayP13
Collaborator

AjayP13 commented Mar 22, 2024

If you train with LoRA, which we support across all of the trainers, it's pretty lightweight: you can have it save only a few MBs, which should help save disk space and keep runs from being too slow.
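
As a rough illustration of why a LoRA checkpoint can be that small, here is a back-of-the-envelope sketch. A rank-r LoRA adapter on a d_in x d_out weight matrix adds r * (d_in + d_out) parameters. The model shape below (hidden size, layer count, number of adapted matrices per layer, rank) is a hypothetical configuration chosen only for the arithmetic, not something taken from DataDreamer or from this thread.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in one LoRA adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


# Hypothetical configuration, for illustration only.
hidden_size = 4096        # width of each adapted square projection matrix
num_layers = 32           # number of transformer layers
adapted_per_layer = 2     # e.g. two attention projections per layer
rank = 8                  # LoRA rank
bytes_per_param = 2       # 16-bit weights (fp16 / bf16)

num_matrices = num_layers * adapted_per_layer
adapter_params = num_matrices * lora_param_count(hidden_size, hidden_size, rank)
full_params = num_matrices * hidden_size * hidden_size

adapter_mib = adapter_params * bytes_per_param / 2**20
full_mib = full_params * bytes_per_param / 2**20

print(f"LoRA adapter:  {adapter_params:,} params, about {adapter_mib:.1f} MiB")
print(f"Full matrices: {full_params:,} params, about {full_mib:.1f} MiB")
```

Under these assumptions the adapter comes to roughly 8 MiB, versus about 2 GiB for the full versions of the same matrices. Actual sizes depend on the model, the rank, and which modules are adapted.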

@AjayP13
Collaborator

AjayP13 commented Mar 25, 2024

Closing this for now, but let me know if you run into any trouble with this. One more thing worth noting: as a workaround, you can likely run DataDreamer training effectively in memory by using a path such as /dev/shm/my_output_folder as the output directory. On Linux, /dev/shm/ is a RAM-backed file system (a RAM disk), so data written there is held in memory rather than on disk.
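
A minimal sketch of the /dev/shm workaround, for anyone who wants to try it from a notebook. The helper name `make_ram_output_dir` is hypothetical and not part of DataDreamer. It only picks a scratch directory, preferring /dev/shm when that exists and is writable, and otherwise falls back to the normal temp directory. The commented-out usage assumes DataDreamer's session is opened as a context manager around an output folder path. Check the project docs before relying on that.

```python
import os
import shutil
import tempfile
from pathlib import Path


def make_ram_output_dir(prefix: str = "datadreamer_", shm_root: str = "/dev/shm") -> Path:
    """Create a scratch output directory, preferring a RAM-backed location.

    Hypothetical helper: falls back to the regular temp directory when
    shm_root is missing or not writable (e.g. on macOS or Windows).
    """
    use_shm = os.path.isdir(shm_root) and os.access(shm_root, os.W_OK)
    root = shm_root if use_shm else None
    return Path(tempfile.mkdtemp(prefix=prefix, dir=root))


output_dir = make_ram_output_dir()
print(f"Using output directory: {output_dir}")

# Assumed usage (not verified here): pass the folder to the DataDreamer session.
# from datadreamer import DataDreamer
# with DataDreamer(str(output_dir)):
#     ...  # set up data and run the trainer as usual

# Files under /dev/shm consume RAM until deleted, so clean up when finished:
# shutil.rmtree(output_dir)
```

Anything written there, including checkpoints and final weights, is lost on reboot and counts against available memory, so this suits quick experiments rather than runs you want to keep.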

@AjayP13 AjayP13 closed this as completed Mar 25, 2024
@AjayP13 AjayP13 added the question Further information is requested label Mar 27, 2024