Is there a specific reason not to let the trainer run in memory? #16

FraFabbri · 2024-03-22T18:17:49Z

Line 96 in ad3dd9c

if not DataDreamer.initialized() or DataDreamer.is_running_in_memory():

AjayP13 · 2024-03-22T18:47:24Z

Only because we can then assume there is a directory available to store checkpoints & store the final model weights. The training code relies on a directory being available to make things easier.

Do you have a use case for in-memory training? If you want to share, you can email me at ajayp@seas.upenn.edu and I can try to support it.

FraFabbri · 2024-03-22T18:52:12Z

Makes sense, thanks for the explanation :)

My intuition was that letting the trainer run in memory would help with fast experimentation, e.g. when we don't necessarily want to store the model weights and are just running a notebook.

AjayP13 · 2024-03-22T19:01:28Z

If you train with LoRA, which we support across all of the trainers, it's pretty lightweight: you can have it save only a few MBs, so that might help save disk space and keep it from being too slow when running.
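To make the "few MBs" figure concrete, here is a rough back-of-the-envelope sketch. It relies only on the standard LoRA parameter count (a rank-r adapter on a d_out x d_in weight adds r * (d_in + d_out) parameters). The model shape, rank, and choice of adapted matrices below are illustrative assumptions, not values taken from DataDreamer, and `lora_adapter_bytes` is a hypothetical helper name.

```python
def lora_adapter_bytes(d_in, d_out, rank, num_matrices, bytes_per_param=2):
    """Estimate the size of saved LoRA weights (hypothetical helper).

    Each adapted weight matrix gets two low-rank factors:
    A with shape (rank, d_in) and B with shape (d_out, rank).
    """
    params_per_matrix = rank * (d_in + d_out)
    return params_per_matrix * num_matrices * bytes_per_param


# Assumed shape: 32 layers, hidden size 4096, adapting 2 projection
# matrices per layer, rank 8, weights stored in 16-bit precision.
adapter_bytes = lora_adapter_bytes(
    d_in=4096, d_out=4096, rank=8, num_matrices=32 * 2
)
adapter_mib = adapter_bytes / (1024 ** 2)
print(f"{adapter_mib:.1f} MiB")  # prints "8.0 MiB"
```

Under these assumptions the adapter is about 8 MiB, which is tiny next to the full weights of a model that size. Intermediate checkpoints may store more than the adapter itself, so actual disk usage can be higher.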

AjayP13 · 2024-03-25T11:09:03Z

Closing this for now, but let me know if you run into any trouble with this. One more thing worth noting: as a workaround, you may be able to run DataDreamer training effectively in memory by using /dev/shm/my_output_folder as the output directory. On Linux, /dev/shm/ is a file system that stores data in RAM (a RAM disk) rather than on disk.
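A minimal sketch of that workaround, using only the Python standard library: it creates a scratch folder under /dev/shm when that path exists and is writable, and falls back to the regular temp directory otherwise (for example on macOS or Windows, where /dev/shm is not available). The helper name `make_ram_backed_dir` and the `datadreamer_` prefix are hypothetical, not part of DataDreamer.

```python
import os
import shutil
import tempfile
from pathlib import Path


def make_ram_backed_dir(prefix="datadreamer_", shm_root="/dev/shm"):
    """Create a scratch directory in RAM if possible (hypothetical helper).

    Falls back to the default temp directory, which is usually on disk,
    when shm_root is missing or not writable.
    """
    root = Path(shm_root)
    if root.is_dir() and os.access(root, os.W_OK):
        return Path(tempfile.mkdtemp(prefix=prefix, dir=root))
    return Path(tempfile.mkdtemp(prefix=prefix))


output_dir = make_ram_backed_dir()
print(f"Use this as the output directory: {output_dir}")

# ... run training with output_dir as the output folder ...

# Anything under /dev/shm counts against RAM and is lost on reboot,
# so copy out whatever you want to keep, then remove the folder.
shutil.rmtree(output_dir, ignore_errors=True)
```

Checkpoints written this way consume system memory, so this only suits runs whose saved weights fit comfortably in RAM.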

AjayP13 closed this as completed Mar 25, 2024

AjayP13 added the "question" (Further information is requested) label Mar 27, 2024
