Replies: 5 comments 8 replies
-
The submitit plugin runs locally vs non-locally based on the config setting for the
I'd suggest printing the config from each of your experiment runs to confirm that
If the above commands show that

    # src/train.py
    ...
    @hydra.main(...)
    def app(cfg):
        # add lines printing the config for diagnostic purposes:
        print("\n JOB CONFIG:")
        print(OmegaConf.to_yaml(cfg))  # print the job config
        from hydra.core.hydra_config import HydraConfig
        print("\n HYDRA CONFIG:")
        print(OmegaConf.to_yaml(HydraConfig.get()))  # print the Hydra config
        ...

Looking at the output, can you confirm that
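If the full YAML dump is too noisy, a narrower check may help. The sketch below is only an illustration: `describe_launcher` is a hypothetical helper, and it assumes the Hydra config has been converted to a plain dict (for example with `OmegaConf.to_container(HydraConfig.get())` inside the task function) and that it carries the `runtime.choices` and `launcher._target_` keys. The example values are made up.

```python
from typing import Any, Mapping


def describe_launcher(hydra_cfg: Mapping[str, Any]) -> str:
    """Summarize which launcher a Hydra config selects.

    Assumption: `hydra_cfg` is a plain dict, e.g. obtained via
    OmegaConf.to_container(HydraConfig.get()) inside the task function.
    Missing keys are reported as "<unknown>" instead of raising.
    """
    choices = hydra_cfg.get("runtime", {}).get("choices", {})
    choice = choices.get("hydra/launcher", "<unknown>")
    target = hydra_cfg.get("launcher", {}).get("_target_", "<unknown>")
    return f"launcher choice={choice}, target={target}"


# Hypothetical values, for illustration only.
example = {
    "runtime": {"choices": {"hydra/launcher": "submitit_slurm"}},
    "launcher": {
        "_target_": "hydra_plugins.hydra_submitit_launcher.submitit_launcher.SlurmLauncher"
    },
}
print(describe_launcher(example))
```

Printing this one line from every job in the sweep makes it easy to compare the single-experiment run against the multi-experiment run.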
-
Here's a small example that reproduces the issue I have. Should we keep the discussion here, or should I open an issue? From the configs, it looks to me like the problem is more on the submitit side, since the configs are correctly overridden.
-
@Jasha10 Sorry for spamming, but do you think this should be submitted as an issue? I currently don't have a solution, and it doesn't look like a trivial problem.
-
Hi @Aceticia, in your example you have
-
@Aceticia I cannot reproduce your issue with your example. I installed your example and ran it on our Slurm cluster, and I could see that the jobs were run on the cluster. Am I missing something here?
If I cat the Slurm log
-
I'm working on a project that starts from this repo. I wrote some custom experiments that use submitit, and it correctly submits the right jobs if I run commands with only one experiment, like

    python src/train.py -m experiment=rnn seed=0,1,2

However, if I run two or more experiments, for example

    python src/train.py -m experiment=rnn,temporal_attn seed=0,1,2

submitit stops working and the jobs run locally. Is this likely a feature or a bug (in my config or in the plugin)?

Some clues to the problem: the experiments themselves use the
    # @package _global_
header. Could this be what makes it behave differently?

I currently don't have a minimal example to reproduce this, but please let me know if you need anything to solve this. Thank you.
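For reference, here is a rough sketch of how I understand the second command to expand into jobs: one job per combination of the comma-separated values. This is not Hydra's code. `expand_sweep` is a hypothetical helper that only handles plain `key=a,b,c` overrides and ignores the rest of Hydra's override grammar, and the job order is an assumption.

```python
from itertools import product
from typing import List


def expand_sweep(overrides: List[str]) -> List[List[str]]:
    """Expand comma-separated sweep overrides into one override list per job.

    Simplified illustration: each override is split on the first "=" and
    its value on ",", then the Cartesian product of all values is taken.
    """
    keys: List[str] = []
    value_lists: List[List[str]] = []
    for item in overrides:
        key, _, values = item.partition("=")
        keys.append(key)
        value_lists.append(values.split(","))
    return [
        [f"{key}={value}" for key, value in zip(keys, combo)]
        for combo in product(*value_lists)
    ]


jobs = expand_sweep(["experiment=rnn,temporal_attn", "seed=0,1,2"])
for job in jobs:
    print(" ".join(job))
```

Under this reading, the two-experiment command should produce six jobs (two experiments times three seeds), each of which I would expect to go through the same launcher.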