Web UI should support getting artifacts from local path #1497

Closed
jinchihe opened this issue Jun 12, 2019 · 28 comments

Comments

@jinchihe (Member)

We support mounting a PV/PVC in pipelines for on-premise clusters. In this case the artifacts are stored in the PVC mount path, so we should support getting artifacts from a local path, such as /mnt/a07b8215-8c1a-11e9-a2ff-525400ed33aa/tfx-taxi-cab-classification-pipeline-example-q7rtq-2449399348/data/roc.csv. Thanks.

@cvenets commented Jun 21, 2019

This is something we need as well, to be able to visualize the results correctly via the KFP UI on MiniKF.

Can we coordinate to include this in 0.6?

@jlewi (Contributor) commented Jul 9, 2019

@paveldournov

@jinchihe when you refer to the local path for the artifacts, what is the path local to? Is it the local file system of the machine running Minikube, the local file system of the container creating the artifact, or something else?

@jinchihe (Member, Author)

> @jinchihe when you refer to the local path for the artifacts, what is the path local to? Is it the local file system of the machine running Minikube, the local file system of the container creating the artifact, or something else?

The case is running on an on-prem cluster, and the artifacts are saved to the path where the PVC is mounted.

@IronPan (Member) commented Jul 11, 2019

> The case is running on an on-prem cluster, and the artifacts are saved to the path where the PVC is mounted.

How will the UI get the data? Will the UI pod mount the same PV to retrieve and render it? How does this work if the UI needs to render data for different pipelines that use different PVs?

@IronPan (Member) commented Jul 11, 2019

Also, @mameshini has some extensions that better abstract artifact storage and work both on-prem and in the cloud. We are looking into porting them over as the default artifact storage in KF Pipelines. Please see this thread for details:
#596

@jlewi (Contributor) commented Jul 14, 2019

@cvenets based on @IronPan's comments, is this blocking for 0.6? Should we downgrade it from P0? IIUC, figuring out how to support storing artifacts on a PV/PVC is non-trivial. If artifacts are stored on a PV/PVC, the UI won't be able to access them without mounting the PV/PVC.

@cvenets commented Jul 15, 2019

@jlewi @IronPan indeed this is not trivial, and there are two distinct and orthogonal problems mentioned in this thread:

  1. Being able to have UIs that can access data in PVCs (both the Kubeflow UI and the KFP UI)
  2. Being able to have PVCs backed by object stores (#596: S3 errors in Pipeline examples for reading training data and artifact storage)

[1] needs to be tackled in a generic way that works for different use cases, one of which is the Artifacts tab of the KFP UI. We have discussed this internally and I think we can contribute a generic mechanism that all UIs will be able to use for v0.7. This is related to the Tensorboard issue as well:
kubeflow/kubeflow#3578

So, yes, let's not consider this blocking for v0.6, because it needs significant work. We will aim to solve it universally for v0.7.

For [2], I went through #596. I don't think it is related to this issue; it's more of an infra problem of how one chooses to implement PVCs at the K8s level. If one has PVCs backed by Goofys, which is in turn backed by an object store, then Pipelines and every other component will work transparently. @IronPan can you comment on why this may be different from any other PVC provider?

@jlewi (Contributor) commented Jul 16, 2019

Thanks @cvenets. Downgrading to P1 and moving to 0.7.0.

@jlewi (Contributor) commented Aug 27, 2019

Anyone planning on tackling this in Q3?

@jackhawa commented Oct 6, 2019

Hello, is there any update (or ETA) on this issue? And is there a workaround to see local artifacts produced by a step?

@Ark-kun (Contributor) commented Oct 16, 2019

We could support this consistently.
Design:

  1. The cluster admin specifies that they want to use a particular Kubernetes volume for all data storage and passing (replacing Minio or other S3 storage).
  2. The storage volume is also mounted to the Frontend pod (see the sketch after this list).
  3. The backend rewrites pipelines so that artifacts are stored in the volume and only the volume paths are passed between steps. The change should be invisible to user code.
  4. The frontend can then access all data (including data from unfinished steps) via the volume paths.
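To make step 2 concrete, here is a minimal sketch (not from this thread) of what mounting a shared artifact volume into the frontend could look like. The PVC name `pipeline-artifacts` and the mount path are assumptions for illustration; only the `ml-pipeline-ui` deployment name comes from the standard KFP install.

```yaml
# Hypothetical fragment of the ml-pipeline-ui Deployment (kubeflow namespace).
# Assumes a ReadWriteMany PVC named "pipeline-artifacts" already exists and is
# the same volume that pipeline steps write their artifacts to.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ml-pipeline-ui
  namespace: kubeflow
spec:
  template:
    spec:
      containers:
        - name: ml-pipeline-ui
          volumeMounts:
            - name: artifact-volume
              mountPath: /mnt/artifacts   # the frontend would resolve local paths under here
      volumes:
        - name: artifact-volume
          persistentVolumeClaim:
            claimName: pipeline-artifacts # assumed PVC shared with pipeline steps
```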

@mameshini
Yes, it would address the need: a single Kubernetes volume can be mounted to the Frontend pod and used for all data storage and passing. TensorBoard data can also be saved on that volume. Currently only GCS buckets can be used to visualize TensorBoard data, which is a serious limitation for other cloud providers, on-prem, and local deployments.

@eterna2 (Contributor) commented Oct 24, 2019

> The storage volume is also mounted to the Frontend pod.

Question: can multiple pods mount the same PV? PVs on AWS-backed k8s use EBS, and I don't think the same EBS volume can be mounted on multiple pods.

Does that mean we need to use a custom volume backed by NFS or some storage service?

@eterna2 (Contributor) commented Oct 24, 2019

Just a side note: I got the TensorBoard viewer to work with S3 by exposing an env var in the frontend that sets a path to a custom pod spec, which I configure by mounting a ConfigMap.

#1906

@tanguycdls (Contributor)

@eterna2 How can we use your PR #1906 to run Tensorboard on S3? I opened an issue a while ago here: kubeflow/kubeflow#3773. Thanks!

@eterna2 (Contributor) commented Oct 25, 2019

TensorBoard supports S3 under the hood. You will need to either pass in the AWS credentials via env variables or set the pod annotations with the appropriate IAM roles (if your cluster is running kube2iam or equivalent) for the TensorBoard pod to access your S3 bucket.

My PR exposes an env var VIEWER_TENSORBOARD_POD_TEMPLATE_SPEC_PATH in the Pipeline UI to load a custom podTemplateSpec instead of the default GCP spec.

The podTemplateSpec is used by the viewer controller to create the TensorBoard viewer pod.

You can create a ConfigMap containing a JSON podTemplateSpec for the TensorBoard viewer with the AWS credentials or pod annotations, then mount the ConfigMap at the path referenced by the env var (a sketch follows below).

The schema for podTemplateSpec can be found in the k8s API reference; you can ignore the image and args fields, as these are injected by the viewer controller.

https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.11/#podtemplatespec-v1-core
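For illustration, a minimal sketch of how this could be wired together. The ConfigMap name, mount path, and credential values are placeholders I've assumed; only the env var name VIEWER_TENSORBOARD_POD_TEMPLATE_SPEC_PATH comes from the PR, and the JSON shape follows the podTemplateSpec schema linked above.

```yaml
# Hypothetical ConfigMap holding a podTemplateSpec for the TensorBoard viewer.
# All names and credential values below are placeholders.
apiVersion: v1
kind: ConfigMap
metadata:
  name: tensorboard-viewer-template
  namespace: kubeflow
data:
  viewer-pod-template.json: |
    {
      "spec": {
        "containers": [
          {
            "env": [
              {"name": "AWS_ACCESS_KEY_ID", "value": "<access-key>"},
              {"name": "AWS_SECRET_ACCESS_KEY", "value": "<secret-key>"},
              {"name": "AWS_REGION", "value": "us-west-2"}
            ]
          }
        ]
      }
    }
---
# Fragment of the ml-pipeline-ui Deployment: mount the ConfigMap and point
# the env var at the mounted file.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ml-pipeline-ui
  namespace: kubeflow
spec:
  template:
    spec:
      containers:
        - name: ml-pipeline-ui
          env:
            - name: VIEWER_TENSORBOARD_POD_TEMPLATE_SPEC_PATH
              value: /etc/config/viewer-pod-template.json
          volumeMounts:
            - name: viewer-template
              mountPath: /etc/config
      volumes:
        - name: viewer-template
          configMap:
            name: tensorboard-viewer-template
```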

See here for the env vars to configure the AWS credentials:

https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-envvars.html

More on kube2iam here:
https://github.com/jtblin/kube2iam

Note also that changes to the spec are not retroactive. You will need to kill existing viewers and reload them to see the pods with the updated podTemplateSpec.

@gabrielvcbessa commented Oct 26, 2019

@eterna2 I am also using pod annotations to authenticate with AWS. I tried to patch the ml-pipeline-ui deployment to include the following patch:

```yaml
spec:
  template:
    metadata:
      annotations:
        iam.amazonaws.com/role: my-role
```

But I am still getting `S3Error: Access Denied` when trying to visualize any artifact generated by any given step. I double-checked that the pod has the annotation, and it also has the following log:

`Getting storage artifact at: s3: bucket/prefix`

So the annotated pod is the one making the request. Shouldn't this work?

@eterna2 (Contributor) commented Oct 26, 2019

This is for TensorBoard, not the Pipeline UI.

minio-js does not support IAM roles, so you need to wait for my PR #2081 to be merged before pod annotations will work for the Pipeline UI.

Meanwhile, you can use a Minio gateway to proxy to S3.

@eterna2 (Contributor) commented Oct 26, 2019

https://github.com/minio/minio/blob/master/docs/gateway/s3.md

A Minio gateway is set up very similarly to the KF Minio server. You just need to add the pod annotations and change the args (see the sketch below).
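As a rough illustration: the names, namespace, and role annotation below are assumptions, the gateway arguments follow the Minio docs linked above, and the exact credential plumbing depends on your setup.

```yaml
# Hypothetical Minio S3-gateway Deployment. kube2iam resolves the pod
# annotation to an IAM role for backend access to S3; MINIO_ACCESS_KEY /
# MINIO_SECRET_KEY are the credentials clients (e.g. the Pipeline UI)
# use to talk to the gateway itself.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: minio-s3-gateway
  namespace: kubeflow
spec:
  replicas: 1
  selector:
    matchLabels:
      app: minio-s3-gateway
  template:
    metadata:
      labels:
        app: minio-s3-gateway
      annotations:
        iam.amazonaws.com/role: my-role      # assumed kube2iam role
    spec:
      containers:
        - name: minio
          image: minio/minio
          args: ["gateway", "s3"]            # proxy to S3 instead of serving local disks
          env:
            - name: MINIO_ACCESS_KEY
              value: <gateway-access-key>    # placeholder
            - name: MINIO_SECRET_KEY
              value: <gateway-secret-key>    # placeholder
          ports:
            - containerPort: 9000
```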

@descampsk

> My PR exposes an env var VIEWER_TENSORBOARD_POD_TEMPLATE_SPEC_PATH in the Pipeline UI to load a custom podTemplateSpec instead of the default GCP spec.

Thanks!

It works pretty well, but the TensorBoard viewer pods live forever...

Do you think it's possible to add a spec that lets them live for only some minutes? Or do we have to build a cron job to automatically delete these pods every hour?
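Not an answer from this thread, but one possible stopgap along the cron-job line is deleting the Viewer custom resources (which own the TensorBoard viewer pods) on a schedule. A sketch, assuming a `viewer-cleaner` ServiceAccount with list/delete permissions on `viewers.kubeflow.org` already exists:

```yaml
# Hypothetical hourly cleanup job; deletes all TensorBoard Viewer CRs in
# the kubeflow namespace. ServiceAccount and RBAC are assumed to exist.
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: viewer-cleaner
  namespace: kubeflow
spec:
  schedule: "0 * * * *"                  # once an hour
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: viewer-cleaner
          restartPolicy: OnFailure
          containers:
            - name: cleaner
              image: bitnami/kubectl     # any image with kubectl works
              command:
                - kubectl
                - delete
                - viewers.kubeflow.org
                - --all
                - -n
                - kubeflow
```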

@Bobgy added the needs investigation, status/triaged, and help wanted labels on Jan 22, 2020
@Bobgy (Contributor) commented Jan 22, 2020

I'm not very familiar with this long thread; is there anything left that still needs a solution?

@mameshini
Yes, it's still unsolved for AWS and on-prem, as far as I know. Presenting TensorBoard data to end users is a high-value user story. TensorBoard data has to be presented not only after a pipeline step finishes, but also during the step's execution, so that a data scientist can monitor the model training progress.

Let's assume there is a single PVC mounted to all pipeline steps, as well as to the Pipeline UI. Many Kubeflow users are already mounting data to pods using tools like goofys. The Pipeline UI has to access TensorBoard data via a local path, without assuming that TensorBoard data is always in a GCP storage bucket. Alternatively, if the Pipeline UI can get artifact data from Minio/S3 buckets, that would be fine too. I will be able to get back to testing this in a week.

@stale bot commented Jun 24, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale bot added the lifecycle/stale label on Jun 24, 2020
@stale bot commented Jul 1, 2020

This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.

@Ark-kun (Contributor) commented Aug 26, 2020

> Alternatively, if the Pipeline UI can get artifact data from Minio/S3 buckets, that would be fine too.

I think this is what usually happens.

@Ark-kun reopened this on Aug 26, 2020
@stale bot removed the lifecycle/stale label on Aug 26, 2020
@Bobgy (Contributor) commented Aug 27, 2020

The requested feature is already supported; see https://github.com/kubeflow/pipelines/blob/master/docs/config/volume-support.md.

@stale bot commented Nov 26, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale bot added the lifecycle/stale label on Nov 26, 2020
@Bobgy closed this as completed on Nov 27, 2020