Base classes for accelerator refactoring #5715
Conversation
Co-authored with @awaelchli
Hello @justusschock! Thanks for updating this PR.
Comment last updated at 2021-01-30 19:15:52 UTC
looking forward to it
Looks good!
```python
with self.precision_plugin.val_step_context():
    with self.training_type_plugin.val_step_context():
        return self.lightning_module.validation_step(*args)
```
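The nesting above can be sketched end to end with toy stand-ins for the two plugins. Everything here is illustrative (the class bodies, the `calls` list, and the `"val_out"` return value are not Lightning's actual implementation); the point is only the order in which the two `val_step_context()` context managers enter and exit around the validation step.

```python
from contextlib import contextmanager

calls = []  # records enter/exit order for illustration

class PrecisionPlugin:  # hypothetical stand-in
    @contextmanager
    def val_step_context(self):
        calls.append("precision:enter")
        yield
        calls.append("precision:exit")

class TrainingTypePlugin:  # hypothetical stand-in
    @contextmanager
    def val_step_context(self):
        calls.append("ttp:enter")
        yield
        calls.append("ttp:exit")

class Accelerator:
    def __init__(self):
        self.precision_plugin = PrecisionPlugin()
        self.training_type_plugin = TrainingTypePlugin()

    def validation_step(self, *args):
        # mirrors the snippet above: precision context wraps the
        # training-type context, which wraps the actual step
        with self.precision_plugin.val_step_context():
            with self.training_type_plugin.val_step_context():
                return "val_out"  # stands in for lightning_module.validation_step(*args)

out = Accelerator().validation_step()
```

Note that the inner (training-type) context exits before the outer (precision) one, which is why collapsing the precision logic into the training-type plugin, as suggested below, would not change observable ordering for the step itself.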
This type of logic makes me think the precision plugin should live in the training type plugin, and that the base training type plugin should manage the precision logic
Let's merge it as is and then change it later on :)
agreed with @SeanNaren - that cleans up the interface for the accelerator then. conceptually the accelerator simply manages the training plugin with additional lifecycle hooks, and the training plugin is responsible for precision/whatever else(?)
Yep, I also agree with @SeanNaren. There are plans to support that on top of the existing PoC, since this will probably be easier to change then than now.
Force-pushed from e940e5e to a341186: replacing usage of all non-existent Plugins and annotating with
for all these, use only one of:
- `@abstractmethod`: cannot be instantiated till all these methods are implemented
- `raise NotImplementedError`: can be instantiated, but will crash when touching one of these
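The difference between the two idioms can be shown with a minimal sketch (toy classes, not Lightning's; `setup` is just an example method name):

```python
from abc import ABC, abstractmethod

class WithAbstract(ABC):
    @abstractmethod
    def setup(self):
        ...

class WithRaise:
    def setup(self):
        raise NotImplementedError

# @abstractmethod: instantiation itself fails until setup is overridden
try:
    WithAbstract()
    abstract_instantiated = True
except TypeError:
    abstract_instantiated = False

# raise NotImplementedError: instantiation succeeds; only the call crashes
obj = WithRaise()
try:
    obj.setup()
    call_succeeded = True
except NotImplementedError:
    call_succeeded = False
```

Combining both on the same method is redundant: once the method is abstract, the `raise` body is unreachable through normal use because the class cannot be instantiated in the first place.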
pytorch_lightning/plugins/training_type/training_type_plugin.py
Codecov Report

```
@@            Coverage Diff             @@
##    release/1.2-dev    #5715    +/-   ##
================================================
- Coverage          89%      88%      -1%
================================================
  Files             168      169      +1
  Lines           12423    12673    +250
================================================
+ Hits            11102    11180     +78
- Misses           1321     1493    +172
```
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
@justusschock @awaelchli @SeanNaren what's the expected control flow across trainer/training loop/accelerator/training type plugin? i'm getting mixed up in trying to figure out what will call what when
```python
import torch


class Plugin(object):
```
are all of these needed for all plugins? are there any steps here that we could/should move to the training type plugin?
@ananthsub good comments, let's address them in follow-up PRs :] #5718
```python
class TrainingTypePlugin(Plugin, ABC):
    """A Plugin to change the behaviour of the training, validation and test-loop."""
```
@justusschock - another question: currently these training loop functions are set on the accelerator backend in fit(). Should these be represented in the interface here or on the accelerator?
What do you think about the training type plugin accepting these as constructor arguments, assuming we can define some base interfaces for the train/evaluation/test loops?
I don't think this is needed anymore, since we pass the Trainer to start_training etc., where we can directly call trainer.train.
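The inversion of control described here can be sketched as follows. All names (`run_train`, the one-argument `start_training`) are illustrative stand-ins, not the exact Lightning API: the plugin decides *how* training starts (e.g. spawning DDP processes), then calls back into the trainer for *what* runs.

```python
class Trainer:  # hypothetical stand-in for pytorch_lightning.Trainer
    def __init__(self):
        self.calls = []

    def run_train(self):
        # the actual training loop would live here
        self.calls.append("run_train")
        return "fit finished"

class TrainingTypePlugin:  # hypothetical stand-in
    def start_training(self, trainer):
        # the plugin controls process setup, then delegates the loop
        # back to the trainer it was handed
        return trainer.run_train()

trainer = Trainer()
result = TrainingTypePlugin().start_training(trainer)
```

Because the Trainer is passed in at call time rather than at construction time, the plugin needs no constructor arguments for the loops, which is the point being made above.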
What does this PR do?
Adds the base classes from #5616
Before submitting
PR review
Anyone in the community is free to review the PR once the tests have passed.
Before you start reviewing, make sure you have read the Review guidelines. In short, see the following bullet list:
Did you have fun?
Make sure you had fun coding 🙃
Co-authored with @awaelchli