-
Notifications
You must be signed in to change notification settings - Fork 70
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
adding parameters for device context and patching of Scikit-Learn #23
Conversation
Example of the config file:
|
@@ -153,6 +153,12 @@ def parse_args(parser, size=None, loop_types=(), | |||
help='Seed to pass as random_state') | |||
parser.add_argument('--dataset-name', type=str, default=None, | |||
help='Dataset name') | |||
parser.add_argument('--device', type=str, default='None', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this params should be enabled in higher level scripts that will execute them - but i don't see this in make?
Another question - are we going to pass single value for bench.py or this should be tuple so bench.py will iterate via them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this moment I used only runner.py script. Possible that additional changes should be done in make or somewhere else. Thanks.
Devices are provided as a list in config "device": ["None", "host", "cpu", "gpu"],
. See example above. Benchmarks are executing with all devices in the list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't these changes break the launch of traditional DAAL? Can you check?
|
||
if args.patch_sklearn is not None and args.patch_sklearn != 'None': | ||
from daal4py.sklearn import patch_sklearn, unpatch_sklearn | ||
if args.patch_sklearn == "True": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are "True", "False" string, not boolean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One more option is "None" - to leave patching state "as is". I think it can be useful for back compatibility. Possible we should change it to boolean.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What will be broken in this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not common to have to write --patch-sklearn True
. It might be better to instead have two possible options: --patch-sklearn
setting patch_sklearn
to True
, --no-patch-sklearn
setting patch_sklearn
to False
, and then None
by default. This can be done by adding two arguments with the same dest='patch_sklearn'
, but with different action
s store_true
and store_false
, and the same default None
.
You may also want to put both of these arguments into a mutually exclusive group.
For the --device
option, I would keep it the way you have it right now, but just let the default be None
(not the string), and not allow specifying None (unless you call it something like auto
). --device None
is a bit confusing, but reading --device cpu
or --device auto
makes much more sense
If possible, it would be nice to not have the exact same construction of argument parsing in two places for both the device and for the patching. What if you added the same tri-state patch argument as a kwarg to patch_sklearn
, and the device as a kwarg to run_with_context
?
Why doesn't the config add to this repository? |
@PetrovKP At this moment we have only one example in the repository. I suppose to create a directory with various configs. Please comment what do you think about it. |
We wanted to store all the configs here. This is correct and convenient. I don’t know why we still haven’t moved the configs ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally looks good. See comment about command-line options
In the future, it might be worth considering turning this project into a more standard Python module structure, with a sklearn_bench
module containing sklearn
, daal4py
, etc. as submodules. Currently, bench.py
is duplicated for each implementation of benchmarks, and it could just be moved to the sklearn_bench
module.
|
||
if args.patch_sklearn is not None and args.patch_sklearn != 'None': | ||
from daal4py.sklearn import patch_sklearn, unpatch_sklearn | ||
if args.patch_sklearn == "True": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not common to have to write --patch-sklearn True
. It might be better to instead have two possible options: --patch-sklearn
setting patch_sklearn
to True
, --no-patch-sklearn
setting patch_sklearn
to False
, and then None
by default. This can be done by adding two arguments with the same dest='patch_sklearn'
, but with different action
s store_true
and store_false
, and the same default None
.
You may also want to put both of these arguments into a mutually exclusive group.
For the --device
option, I would keep it the way you have it right now, but just let the default be None
(not the string), and not allow specifying None (unless you call it something like auto
). --device None
is a bit confusing, but reading --device cpu
or --device auto
makes much more sense
If possible, it would be nice to not have the exact same construction of argument parsing in two places for both the device and for the patching. What if you added the same tri-state patch argument as a kwarg to patch_sklearn
, and the device as a kwarg to run_with_context
?
Implemented in #133 |
This is initial suggestion in purpose to start the discussion.
Parameters for device context and Scikit-Learn patching are added.
Some benchmarks are changed in purpose to use new parameters.