
UPSTREAM: <carry>: add last_run_creation #52

Merged: 1 commit merged into opendatahub-io:master on Jun 11, 2024

Conversation

@HumairAK commented Jun 7, 2024

Description of your changes:

Resolves: https://issues.redhat.com/browse/RHOAIENG-7692

Testing instructions:

instruction.txt

Prepare environment:

  1. First deploy a regular DSPA without the changes.
  2. Create experiments expA and expB.
  3. Create run1A and run1B in expA and expB respectively.
  4. Get access to the underlying MariaDB instance; you will need to be able to run SELECT queries on the run and experiments tables.
    Go to the MariaDB pod and open a terminal:
mysql -u root 
MariaDB [(none)]> use mlpipeline
Database changed
MariaDB [mlpipeline]> select * from experiments;

Confirm that you do not see a column named LastRunCreatedAtInSec.

Note: you can also use CloudBeaver for this.
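If you prefer an explicit check over eyeballing the full table output, you can also query for the column directly in the same MariaDB session (a minimal sketch; an empty result set means the column does not exist yet):

MariaDB [mlpipeline]> SHOW COLUMNS FROM experiments LIKE 'LastRunCreatedAtInSec';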

Now deploy the changes:

Update the current DSPA with the updated images; you should only need to update the API server image.
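For example, assuming dspa resolves as a short name for the DataSciencePipelinesApplication resource on your cluster and using a placeholder DSPA name, a merge patch along these lines swaps only the API server image (a sketch; adjust the name and namespace to your environment):

oc -n ${YOUR_DSPA_NS} patch dspa <your-dspa-name> --type=merge \
  -p '{"spec": {"apiServer": {"image": "quay.io/opendatahub/ds-pipelines-api-server:pr-52"}}}'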

Test Runs:

  1. Create runs named run2A, run3A, and run4A in the expA experiment; make sure the runs are not cached.
    You can use this sample pipeline: here

  2. Now, using either Postman or curl, list the experiments and sort by last_run_created_at:

curl --location --request GET "${API_SERVER_ROUTE}/apis/v2beta1/experiments?namespace=${YOUR_DSPA_NS}&sort_by=last_run_created_at%20desc" \
--header "Authorization: Bearer ${YOUR_OC_AUTH_TOKEN}"

Set API_SERVER_ROUTE, YOUR_OC_AUTH_TOKEN, and YOUR_DSPA_NS accordingly.

You should see expA and expB listed in the JSON response, sorted by last_run_created_at. Confirm that the last_run_created_at field is present in every experiment listed.

Note that we did not create a new run in expB after deploying, so we expect expB's last_run_created_at to be the Unix epoch: "last_run_created_at": "1970-01-01T00:00:00Z". Confirm this is the case; it verifies the update scenario for existing experiments. Because we are sorting by desc, we expect expB to show up last. Now use asc instead of desc and confirm the order is reversed. You can repeat this test with more than two experiments.
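For reference, the response body should look roughly like the following (an illustrative sketch only; IDs are elided, the non-epoch timestamp is made up, and other fields are omitted, so values will differ in your environment):

{
  "experiments": [
    {
      "experiment_id": "...",
      "display_name": "expA",
      "created_at": "...",
      "last_run_created_at": "2024-06-07T18:30:00Z"
    },
    {
      "experiment_id": "...",
      "display_name": "expB",
      "created_at": "...",
      "last_run_created_at": "1970-01-01T00:00:00Z"
    }
  ]
}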

We also need to test the behavior with scheduled runs, because these are persisted by the Persistence Agent and therefore work differently.

Test Jobs:

  1. Create a recurring run run2B under expB. Set the cron schedule to every 1 minute.
  2. Let 3 minutes pass, or wait until 3 runs are created, then disable the recurring run run2B via the UI; otherwise you'll get spammed with runs.
  3. Use curl or Postman again (the command is the same as before) and sort by last_run_created_at. Confirm the expected behavior.
  4. expB should now show first when sorting by desc, and its last_run_created_at should match the latest run's creation time (check this on the runs page, or curl the runs API as sketched below).
  5. Confirm that asc sorting behaves as expected.
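To cross-check the run creation time mentioned in step 4, a runs-API query along these lines should work (a sketch against the v2beta1 runs endpoint; EXP_B_ID is a placeholder for expB's experiment ID, and the exact filter parameters may vary by build):

curl --location --request GET "${API_SERVER_ROUTE}/apis/v2beta1/runs?namespace=${YOUR_DSPA_NS}&experiment_id=${EXP_B_ID}&sort_by=created_at%20desc" \
--header "Authorization: Bearer ${YOUR_OC_AUTH_TOKEN}"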

Create a new experiment

  1. Create a new experiment exp3 after you have deployed the new changes.
  2. Sort again; exp3 should show up last when sorting by desc because it has no runs. Confirm that it has the epoch timestamp.
  3. Create a run in exp3, then curl the API again sorting by last_run_created_at, and confirm the expected behavior; sort in both orders.

Verify DB data

Now go back to MariaDB as described earlier, display the experiments table, and confirm that LastRunCreatedAtInSec is populated as expected; this column maps to last_run_created_at.
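For example, in the same MariaDB session as before (a sketch; column names other than LastRunCreatedAtInSec may differ slightly by version):

MariaDB [mlpipeline]> SELECT Name, LastRunCreatedAtInSec FROM experiments;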

For converting epoch timestamps to relative time, you can use this tool.
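If you would rather convert on the command line, GNU date can do it as well (a sketch; on macOS use date -u -r <seconds> instead):

date -u -d @0
# prints Thu Jan  1 00:00:00 UTC 1970, i.e. the value expected for experiments with no runs created after the upgrade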

@openshift-ci bot requested review from DharmitD and gmfrasca on June 7, 2024 18:11
@dsp-developers

Commit Checker results:

**NOTE**: These are the results of the commit checker scans. 
If these are not commits from upstream kfp, then please ensure
you adhere to the commit checker formatting
commitchecker version unknown
Validating 0 commits between f3fbc2cfba32098ae1902ec8afa8d9661a2b30b0...733f60bc3ac55cf16eed32d6be1c14cd45419758

@dsp-developers

Commit Checker results:

**NOTE**: These are the results of the commit checker scans. 
If these are not commits from upstream kfp, then please ensure
you adhere to the commit checker formatting
commitchecker version unknown
Validating 0 commits between f3fbc2cfba32098ae1902ec8afa8d9661a2b30b0...5d187fd70d1c4a9dce311c5ae64a32ca793c1170

@dsp-developers

A set of new images have been built to help with testing out this PR:
API Server: quay.io/opendatahub/ds-pipelines-api-server:pr-52
DSP DRIVER: quay.io/opendatahub/ds-pipelines-driver:pr-52
DSP LAUNCHER: quay.io/opendatahub/ds-pipelines-launcher:pr-52
Persistence Agent: quay.io/opendatahub/ds-pipelines-persistenceagent:pr-52
Scheduled Workflow Manager: quay.io/opendatahub/ds-pipelines-scheduledworkflow:pr-52
MLMD Server: quay.io/opendatahub/mlmd-grpc-server:latest
MLMD Envoy Proxy: registry.redhat.io/openshift-service-mesh/proxyv2-rhel8:2.3.9-2
UI: quay.io/opendatahub/ds-pipelines-frontend:pr-52

@dsp-developers

An OCP cluster where you are logged in as cluster admin is required.

The Data Science Pipelines team recommends testing this using the Data Science Pipelines Operator. Check here for more information on using the DSPO.

To use and deploy a DSP stack with these images (assuming the DSPO is deployed), first save the following YAML to a file named dspa.pr-52.yaml:

apiVersion: datasciencepipelinesapplications.opendatahub.io/v1alpha1
kind: DataSciencePipelinesApplication
metadata:
  name: pr-52
spec:
  dspVersion: v2
  apiServer:
    image: "quay.io/opendatahub/ds-pipelines-api-server:pr-52"
    argoDriverImage: "quay.io/opendatahub/ds-pipelines-driver:pr-52"
    argoLauncherImage: "quay.io/opendatahub/ds-pipelines-launcher:pr-52"
  persistenceAgent:
    image: "quay.io/opendatahub/ds-pipelines-persistenceagent:pr-52"
  scheduledWorkflow:
    image: "quay.io/opendatahub/ds-pipelines-scheduledworkflow:pr-52"
  mlmd:  
    deploy: true  # Optional component
    grpc:
      image: "quay.io/opendatahub/mlmd-grpc-server:latest"
    envoy:
      image: "registry.redhat.io/openshift-service-mesh/proxyv2-rhel8:2.3.9-2"
  mlpipelineUI:
    deploy: true  # Optional component 
    image: "quay.io/opendatahub/ds-pipelines-frontend:pr-52"
  objectStorage:
    minio:
      deploy: true
      image: 'quay.io/opendatahub/minio:RELEASE.2019-08-14T20-37-41Z-license-compliance'

Then run the following:

cd $(mktemp -d)
git clone git@github.com:opendatahub-io/data-science-pipelines.git
cd data-science-pipelines/
git fetch origin pull/52/head
git checkout -b pullrequest 733f60bc3ac55cf16eed32d6be1c14cd45419758
oc apply -f dspa.pr-52.yaml
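Once applied, you can optionally confirm that the stack comes up before starting the tests (a sketch; resource and pod names assume the DSPA above, named pr-52, was created in your current project):

oc get datasciencepipelinesapplication pr-52
oc get pods | grep pr-52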

More instructions here on how to deploy and test a Data Science Pipelines Application.

@dsp-developers

Change to PR detected. A new PR build was completed.
A set of new images have been built to help with testing out this PR:
API Server: quay.io/opendatahub/ds-pipelines-api-server:pr-52
DSP DRIVER: quay.io/opendatahub/ds-pipelines-driver:pr-52
DSP LAUNCHER: quay.io/opendatahub/ds-pipelines-launcher:pr-52
Persistence Agent: quay.io/opendatahub/ds-pipelines-persistenceagent:pr-52
Scheduled Workflow Manager: quay.io/opendatahub/ds-pipelines-scheduledworkflow:pr-52
MLMD Server: quay.io/opendatahub/mlmd-grpc-server:latest
MLMD Envoy Proxy: registry.redhat.io/openshift-service-mesh/proxyv2-rhel8:2.3.9-2
UI: quay.io/opendatahub/ds-pipelines-frontend:pr-52

@dsp-developers

Commit Checker results:

**NOTE**: These are the results of the commit checker scans. 
If these are not commits from upstream kfp, then please ensure
you adhere to the commit checker formatting
commitchecker version unknown
Validating 0 commits between f3fbc2cfba32098ae1902ec8afa8d9661a2b30b0...8dbded7855b46a121287f8811e2d6042a369ef88

@dsp-developers

Change to PR detected. A new PR build was completed.
A set of new images have been built to help with testing out this PR:
API Server: quay.io/opendatahub/ds-pipelines-api-server:pr-52
DSP DRIVER: quay.io/opendatahub/ds-pipelines-driver:pr-52
DSP LAUNCHER: quay.io/opendatahub/ds-pipelines-launcher:pr-52
Persistence Agent: quay.io/opendatahub/ds-pipelines-persistenceagent:pr-52
Scheduled Workflow Manager: quay.io/opendatahub/ds-pipelines-scheduledworkflow:pr-52
MLMD Server: quay.io/opendatahub/mlmd-grpc-server:latest
MLMD Envoy Proxy: registry.redhat.io/openshift-service-mesh/proxyv2-rhel8:2.3.9-2
UI: quay.io/opendatahub/ds-pipelines-frontend:pr-52

@HumairAK requested review from rimolive and removed the request for DharmitD on June 10, 2024 13:42
Signed-off-by: Humair Khan <HumairAK@users.noreply.github.com>
@dsp-developers

Commit Checker results:

**NOTE**: These are the results of the commit checker scans. 
If these are not commits from upstream kfp, then please ensure
you adhere to the commit checker formatting
commitchecker version unknown
Validating 0 commits between f3fbc2cfba32098ae1902ec8afa8d9661a2b30b0...bf77909897738751fc58b8b5b1dd928bf264bf49

@dsp-developers

Change to PR detected. A new PR build was completed.
A set of new images have been built to help with testing out this PR:
API Server: quay.io/opendatahub/ds-pipelines-api-server:pr-52
DSP DRIVER: quay.io/opendatahub/ds-pipelines-driver:pr-52
DSP LAUNCHER: quay.io/opendatahub/ds-pipelines-launcher:pr-52
Persistence Agent: quay.io/opendatahub/ds-pipelines-persistenceagent:pr-52
Scheduled Workflow Manager: quay.io/opendatahub/ds-pipelines-scheduledworkflow:pr-52
MLMD Server: quay.io/opendatahub/mlmd-grpc-server:latest
MLMD Envoy Proxy: registry.redhat.io/openshift-service-mesh/proxyv2-rhel8:2.3.9-2
UI: quay.io/opendatahub/ds-pipelines-frontend:pr-52

@rimolive

/lgtm

Tested all instructions and it worked fine.

@rimolive

/hold

When testing with jobs, it did not work.

@gregsheremeta left a comment

language nitpicks
lgtm

Comment on lines +554 to +559
// Upon run creation, update owning experiment
err = r.experimentStore.UpdateLastRun(newRun)
if err != nil {
return nil, util.Wrap(err, fmt.Sprintf("Failed to update last_run_created_at in experiment %s for run %s", newRun.ExperimentId, newRun.UUID))
}


(language nitpicks, feel free to ignore both)

UpdateLastRun makes me think we're doing something to the run. We're actually just setting a timestamp on the experiment, so I think I would have named it SetLastRunTimestamp or similar.

I'd also say that maybe that log message knows a little too much about what's happening behind closed doors -- maybe leave the column name out ("failed to set last run timestamp on experiment")

@HumairAK (Author)

Agree on both, will fix in a follow-up PR. Thanks @gregsheremeta

@HumairAK (Author)

/unhold

Regarding "When testing with jobs, it did not work": this was an issue with @rimolive's environment.

@HumairAK
Copy link
Author

From @rimolive:

ok, I tested with jobs and it looks good.


openshift-ci bot commented Jun 11, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: gmfrasca

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-bot bot merged commit 2aacfe2 into opendatahub-io:master on Jun 11, 2024
3 checks passed
@dsp-developers

Commit Checker results:

**NOTE**: These are the results of the commit checker scans. 
If these are not commits from upstream kfp, then please ensure
you adhere to the commit checker formatting
commitchecker version unknown
Validating 0 commits between 2aacfe2781c01939e087e188cdb04a0e117bfb2c...bf77909897738751fc58b8b5b1dd928bf264bf49
