-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
mount_pvc failing when pipeline includes ResourceOps #3906
Comments
Hi @jackwhelpton, The documentation seems wrong there. Since there is commonality between the ops, it's the op transformer's job to check before applying something exclusive to a specific kind of ops. Could you write a few-lines-wrapper around def mount_pvc_transformer(pvc_name, mount_path):
def transformer(op):
if isinstance(op, dsl.ContainerOp):
mount_pvc(pvc_name=pvc_name, volume_mount_path=mount_path)
return transformer Also, instead of op.add_pvolumes({mount_path: dsl.PipelineVolume(pvc=pvc_name)}) |
Ah, that's useful, thanks! I was going to offer a PR to push that check into Oddly enough I had been using
What is the recommended approach here, then? It seems like best practice is still fairly diffused through docs, GitHub issues etc. |
Yes, that opinion has been commented here and there in some issues.. I don't think such thing as an unsupported feature on master branch of an OSS exists. Especially if we're talking for a method such as Of course, Google team has the same goal, it's quite clear and there's no doubt about that. I mean look at KFP 😄 |
There's one other subtlety about using def mount(vop):
# mount_pvc wrapper ensures transformer only executes for ContainerOps.
# This is required until https://github.com/kubeflow/pipelines/pull/3912 is merged.
def transformer(op):
if isinstance(op, dsl.ContainerOp):
onprem.mount_pvc(
pvc_name=vop.volume.persistent_volume_claim.claim_name,
volume_mount_path=MOUNT_PATH,
)
op.after(vop)
return transformer which seems clunky (again, I've read that explicit dependencies between ops are a code smell). Is there a cleaner way to do this? |
Using Something like this should work: def mount(vop, mount_path):
def transformer(op):
if isinstance(op, dsl.ContainerOp):
op.add_pvolumes({mount_path: vop.volume})
return transformer Can you verify? |
That's looking much better (I'll update this comment once it actually completes, which takes around 30m, but it didn't get this far before). This seems to be a good argument for using We're having a lot of discussions internally about how to make kubeflow usable for "pure" data scientists: I'm coming from much more of a traditional engineering background, but things that are straightforward for me are often not so for our data science team (and vice versa, obviously). I'm interested to read anything Google have written on how they address this internally... |
Firstly, it wasn't me who opened the PR. Credits should go to @eterna2 😄
The usage of
|
All looks good using At some point I want to migrate what is effectively a hand-wired pipeline to kubeflow itself, but as an intermediate step I'm changing that YAML on-the-fly so it mounts a PVC that the pipeline sets up in advance. Actually, even if I were to migrate to more of a pure Kubeflow solution, I think I'd still be looking at a volume mount to fan-in results from a Given the dependency problem, it sounds like |
Yep, sounds correct! |
kubeflow#3912) * Fix kubeflow#3906 - check that ops to be transformed is a containerOp * Update docstring for add_op_transformer to clarify that not only containerOp will be transformed.
kubeflow#3912) * Fix kubeflow#3906 - check that ops to be transformed is a containerOp * Update docstring for add_op_transformer to clarify that not only containerOp will be transformed.
What steps did you take:
Created a persistent volume claim using a
VolumeOp
:Attempted to mount the PVC to every operation in a pipeline using:
What happened:
During compilation, the
add_op_transformer
attempts to invoke theadd_volume_mount
method on theVolumeOp
, producing an error:What did you expect to happen:
The documentation states that transfomers take and return a
ContainerOp
:https://kubeflow-pipelines.readthedocs.io/en/latest/source/kfp.dsl.html#kfp.dsl.PipelineConf.add_op_transformer
Given that, I would expect them to be bypassed for
VolumeOp
(or otherResourceOp
operations), and hence this compilation should succeed.Environment:
KFP version: Build commit: 9c16e12
KFP SDK version:
kfp 0.5.1
kfp-server-api 0.3.0
Anything else you would like to add:
[Miscellaneous information that will assist in solving the issue.]
/kind bug
The text was updated successfully, but these errors were encountered: