Only because we can then assume there is a directory available to store checkpoints and the final model weights. The training code relies on a directory being available to keep things simple.
Do you have a use case for in-memory training? If you want to share, you can email me at ajayp@seas.upenn.edu and I can try to support it.
My intuition was that if we could let the trainer run in memory, it would help with fast experimentation, e.g. when we don't necessarily want to store the model weights and are just running a notebook.
If you train with LoRA, which we support across all of the trainers, it's pretty lightweight: it only saves a few MBs, so that might help save disk space and keep things from being too slow when running.
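For a sense of why LoRA keeps the on-disk footprint small, here is a minimal sketch using Hugging Face PEFT directly; the model name, hyperparameters, and output path are illustrative, and DataDreamer's trainers accept a similar PEFT LoRA configuration rather than this exact code.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Wrap a small base model with a LoRA adapter. Only the adapter weights are
# trained and saved, so the resulting artifact is a few MB instead of the
# full model. (Illustrative example; not DataDreamer's internal code.)
base_model = AutoModelForCausalLM.from_pretrained("gpt2")
lora_config = LoraConfig(
    r=8,                        # low-rank dimension of the adapter
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["c_attn"],  # GPT-2's fused attention projection
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()        # typically well under 1% of parameters
model.save_pretrained("./lora_adapter")   # saves only the adapter weights (a few MB)
```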
Closing this for now, but if you run into any trouble with this, let me know. Another thing worth noting: you can actually run DataDreamer training in memory with a small hack by using /dev/shm/my_output_folder as the output directory. /dev/shm/ is a file system that stores data in RAM (a RAM disk) rather than on disk.
See DataDreamer/src/trainers/trainer.py, line 96 (commit ad3dd9c).
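A minimal sketch of that workaround, assuming the standard `with DataDreamer(...)` session pattern and the example folder name from the comment above:

```python
from datadreamer import DataDreamer

# Point the session at a path under /dev/shm/, a RAM-backed tmpfs on Linux,
# so checkpoints and final weights are written to memory instead of disk.
# (Folder name is just an example; the data is lost on reboot.)
with DataDreamer("/dev/shm/my_output_folder"):
    ...  # define steps and trainers as usual
```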