Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SDK - Lightweight - Added support for complex default values #1696

Conversation

Ark-kun
Copy link
Contributor

@Ark-kun Ark-kun commented Jul 29, 2019

Allows creating components from python function with complex default values and calling them preserving those defaults.

Problem: We want to enable users to create Pipelines components from python functions. Some functions have complex default values. Pipelines SDK cannot currently serialize and pass such arguments (and for some kinds of default values it does not even seem to be possible). The only way to make it possible for the function to receive those original complex default values is to make it possible to skip those arguments when instantiating the component (as opposed to passing None or '').

This PR makes the following changes:

  • When creating a component form a python function, the inputs that have defaults will be optional (they will be marked as optional in ComponentSpec; the command-line parsing code will consider them not required; the ComponentSpec.implementation.arguments will be generated accordingly so that the command-line parameters are omitted when not passed.)
  • When generating task factory function, the parameters corresponding to optional inputs will be marked in a special way so that it's possible to detect when the arguments was skipped during the call. Skipped arguments are not passed to the component (which has the effect that the component will use its original default values).

Example of a function with complex default values:

def assert_values_are_default(
    a, b,
    singleton_param=None,
    function_param=ascii,
    dict_param={'b': [2, 3, 4]},
    func_call_param='_'.join(['a', 'b', 'c']),
) -> int:
    assert singleton_param is None
    assert function_param is ascii
    assert dict_param == {'b': [2, 3, 4]}
    assert func_call_param == '_'.join(['a', 'b', 'c'])

Fixes #1485


This change is Reviewable

@gaoning777
Copy link
Contributor

Would you mind also giving an example of the component yaml for the function above?

@Ark-kun
Copy link
Contributor Author

Ark-kun commented Aug 2, 2019

Would you mind also giving an example of the component yaml for the function above?

Here it is. As you see, the inputs are marked as optional: true (as you once suggested; I thought a lot and changed my mind). They also have stringified default values (I think these are useful for the users to see), but they're not getting passed unless the user overrides them.

name: Assert values are default

inputs:
- name: a
- name: b
- name: singleton_param
  optional: true
- name: function_param
  optional: true
  default: <built-in function ascii>
- name: dict_param  
  optional: true
  default: '{''b'': [2, 3, 4]}'
- name: func_call_param
  optional: true
  default: a_b_c

outputs:
- name: Output
  type: int

implementation:
  container:
    args:
    - --a
    - inputValue: a
    - --b
    - inputValue: b
    - if:
        cond:
          isPresent: singleton_param
        then:
        - --singleton-param
        - inputValue: singleton_param
    - if:
        cond:
          isPresent: function_param
        then:
        - --function-param
        - inputValue: function_param
    - if:
        cond:
          isPresent: dict_param
        then:
        - --dict-param
        - inputValue: dict_param
    - if:
        cond:
          isPresent: func_call_param
        then:
        - --func-call-param
        - inputValue: func_call_param
    - '----output-paths'
    - outputPath: Output
    command:
    - python3
    - -u
    - -c
    - |
      def assert_values_are_default(
          a, b,
          singleton_param=None,
          function_param=ascii,
          dict_param={'b': [2, 3, 4]},
          func_call_param='_'.join(['a', 'b', 'c']),
      ) -> int:
          assert singleton_param is None
          assert function_param is ascii
          assert dict_param == {'b': [2, 3, 4]}
          assert func_call_param == '_'.join(['a', 'b', 'c'])
          return 1

      import argparse
      _missing_arg = object()
      _parser = argparse.ArgumentParser(prog='Assert values are default', description='')
      _parser.add_argument("--a", dest="a", type=str, required=True, default=_missing_arg)
      _parser.add_argument("--b", dest="b", type=str, required=True, default=_missing_arg)
      _parser.add_argument("--singleton-param", dest="singleton_param", type=str, required=False, default=_missing_arg)
      _parser.add_argument("--function-param", dest="function_param", type=str, required=False, default=_missing_arg)
      _parser.add_argument("--dict-param", dest="dict_param", type=str, required=False, default=_missing_arg)
      _parser.add_argument("--func-call-param", dest="func_call_param", type=str, required=False, default=_missing_arg)
      _parser.add_argument("----output-paths", dest="_output_paths", type=str, nargs=1)
      _parsed_args = {k: v for k, v in vars(_parser.parse_args()).items() if v is not _missing_arg}
      _output_files = _parsed_args.pop("_output_paths")

      _outputs = assert_values_are_default(**_parsed_args)

      if not hasattr(_outputs, '__getitem__') or isinstance(_outputs, str):
          _outputs = [_outputs]

      from pathlib import Path
      for idx, filename in enumerate(_output_files):
          _output_path = Path(filename)
          _output_path.parent.mkdir(parents=True, exist_ok=True)
          _output_path.write_text(str(_outputs[idx]))
    image: tensorflow/tensorflow:1.11.0-py3

@Ark-kun
Copy link
Contributor Author

Ark-kun commented Aug 5, 2019

/cc @gaoning777 Can you please take a look?

1 similar comment
@Ark-kun
Copy link
Contributor Author

Ark-kun commented Aug 9, 2019

/cc @gaoning777 Can you please take a look?

@gaoning777
Copy link
Contributor

/lgtm

@Ark-kun
Copy link
Contributor Author

Ark-kun commented Aug 11, 2019

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Ark-kun

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@Ark-kun
Copy link
Contributor Author

Ark-kun commented Aug 12, 2019

/retest

@k8s-ci-robot k8s-ci-robot merged commit 7917ea4 into kubeflow:master Aug 12, 2019
magdalenakuhn17 pushed a commit to magdalenakuhn17/pipelines that referenced this pull request Oct 22, 2023
…eflow#1696)

Similar to the safeguard that exists in pytorchserver.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Parameters with default value None are handled incorrectly by python components
3 participants