[UI] inputs/outputs tab in KFP semantics #5670

Bobgy · 2021-05-18T14:11:11Z

The KFP Inputs/Outputs tab in the run details page is currently tightly coupled to Argo.

For v2 compatible pipelines, we can use information from MLMD to render the Inputs/Outputs tab in KFP semantics.

Bobgy · 2021-05-18T14:27:06Z

/assign @zijianjoy

zijianjoy · 2021-06-06T00:12:33Z

Current INPUT/OUTPUT tab

Renders Input Parameters, Input Artifacts, Output Parameters, and Output Artifacts.

They are read from the Argo Workflow object.

Use MLMD

Based on an execution, we can look up its list of events to identify the artifacts that are inputs or outputs of that execution. Detailed info is in

pipelines/third_party/ml-metadata/ml_metadata/proto/metadata_store.proto

Lines 94 to 161 in d9c0196

    
    // An event represents a relationship between an artifact and an execution.
    // There are different kinds of events, relating to both input and output, as
    // well as how they are used by the mlmd powered system.
    // For example, the DECLARED_INPUT and DECLARED_OUTPUT events are part of the
    // signature of an execution. For example, consider:
    //
    //   my_result = my_execution({"data":[3,7],"schema":8})
    //
    // Where 3, 7, and 8 are artifact_ids, Assuming execution_id of my_execution is
    // 12 and artifact_id of my_result is 15, the events are:
    //   {
    //       artifact_id:3,
    //       execution_id: 12,
    //       type:DECLARED_INPUT,
    //       path:{step:[{"key":"data"},{"index":0}]}
    //   }
    //   {
    //       artifact_id:7,
    //       execution_id: 12,
    //       type:DECLARED_INPUT,
    //       path:{step:[{"key":"data"},{"index":1}]}
    //   }
    //   {
    //       artifact_id:8,
    //       execution_id: 12,
    //       type:DECLARED_INPUT,
    //       path:{step:[{"key":"schema"}]}
    //   }
    //   {
    //       artifact_id:15,
    //       execution_id: 12,
    //       type:DECLARED_OUTPUT,
    //       path:{step:[{"key":"my_result"}]}
    //   }
    // Other event types include INPUT/OUTPUT and INTERNAL_INPUT/_OUTPUT.
    // * The INPUT/OUTPUT is an event that actually reads/writes an artifact by an
    //   execution. The input/output artifacts may not declared in the signature,
    //   For example, the trainer may output multiple caches of the parameters
    //   (as an OUTPUT), then finally write the SavedModel as a DECLARED_OUTPUT.
    // * The INTERNAL_INPUT/_OUTPUT are event types which are only meaningful to
    //   an orchestration system to keep track of the details for later debugging.
    //   For example, a fork happened conditioning on an artifact, then an execution
    //   is triggered, such fork implementating may need to log the read and write
    //   of artifacts and may not be worth displaying to the users.
    //
    // For instance, in the above example,
    //
    //   my_result = my_execution({"data":[3,7],"schema":8})
    //
    // there is another execution (id: 15), which represents a `garbage_collection`
    // step in an orchestration system
    //
    //   gc_result = garbage_collection(my_result)
    //
    // that cleans `my_result` if needed. The details should be invisible to the
    // end users and lineage tracking. The orchestrator can emit following events:
    //
    //   {
    //       artifact_id: 15,
    //       execution_id: 15,
    //       type:INTERNAL_INPUT,
    //   }
    //   {
    //       artifact_id:16,  // New artifact containing the GC job result.
    //       execution_id: 15,
    //       type:INTERNAL_OUTPUT,
    //       path:{step:[{"key":"gc_result"}]}
    //   }

.
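To make the event model above concrete, here is a minimal TypeScript sketch of how events could be bucketed by type for one execution, and how an event path could be turned into a display name. The `MlmdEvent` and `PathStep` shapes and both function names are hypothetical simplifications of the proto message, not the generated MLMD client types used by the frontend.

```typescript
// Hypothetical, simplified shapes modeled on the Event message quoted above.
// They are NOT the generated MLMD client types.
type EventType =
  | 'DECLARED_INPUT'
  | 'DECLARED_OUTPUT'
  | 'INPUT'
  | 'OUTPUT'
  | 'INTERNAL_INPUT'
  | 'INTERNAL_OUTPUT';

type PathStep = { key: string } | { index: number };

interface MlmdEvent {
  artifactId: number;
  executionId: number;
  type: EventType;
  path?: PathStep[];
}

type ArtifactIdsByType = Partial<Record<EventType, number[]>>;

// Collect the artifact IDs linked to one execution, bucketed by event type.
function groupArtifactIdsByType(
  events: MlmdEvent[],
  executionId: number,
): ArtifactIdsByType {
  const grouped: ArtifactIdsByType = {};
  for (const event of events) {
    if (event.executionId !== executionId) continue;
    const ids = grouped[event.type] || [];
    ids.push(event.artifactId);
    grouped[event.type] = ids;
  }
  return grouped;
}

// Render a path such as [{key: 'data'}, {index: 0}] as "data[0]".
// Keys after the first one are joined with a dot.
function pathToName(path: PathStep[] = []): string {
  let name = '';
  for (const step of path) {
    if ('key' in step) {
      name += name === '' ? step.key : `.${step.key}`;
    } else {
      name += `[${step.index}]`;
    }
  }
  return name;
}
```

With the events from the proto comment, grouping for execution 12 would put artifacts 3, 7 and 8 under `DECLARED_INPUT` and artifact 15 under `DECLARED_OUTPUT`, while the events of the garbage collection execution (id 15) are left out.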

zijianjoy · 2021-06-06T05:51:36Z

Questions

What is the relationship of Input/Output tab vs the ML metadata tab in

pipelines/frontend/src/pages/ExecutionDetails.tsx

Lines 126 to 145 in d9c0196

    
    <SectionIO
      title={'Declared Inputs'}
      artifactIds={this.state.events[Event.Type.DECLARED_INPUT]}
      artifactTypeMap={this.state.artifactTypeMap}
    />
    <SectionIO
      title={'Inputs'}
      artifactIds={this.state.events[Event.Type.INPUT]}
      artifactTypeMap={this.state.artifactTypeMap}
    />
    <SectionIO
      title={'Declared Outputs'}
      artifactIds={this.state.events[Event.Type.DECLARED_OUTPUT]}
      artifactTypeMap={this.state.artifactTypeMap}
    />
    <SectionIO
      title={'Outputs'}
      artifactIds={this.state.events[Event.Type.OUTPUT]}
      artifactTypeMap={this.state.artifactTypeMap}
    />

? How should we merge them into one? Possible solution: move the INPUT/OUTPUT/DECLARED_INPUT/DECLARED_OUTPUT sections to the Input/Output tab, and show only Properties/Custom Properties in the ML Metadata tab.

How do we differentiate Parameter and Artifact from MLMD?
What is the relationship between DECLARED_INPUT and INPUT? How to show them in static pipeline mode?

Bobgy · 2021-06-06T10:11:28Z

These questions are right to the point! Let me try to explain some context. I don't have a clear answer to some of them, so you'll need to do some designing.

Questions

What is the relationship of Input/Output tab vs the ML metadata tab in

pipelines/frontend/src/pages/ExecutionDetails.tsx

Lines 126 to 145 in d9c0196

    <SectionIO
      title={'Declared Inputs'}
      artifactIds={this.state.events[Event.Type.DECLARED_INPUT]}
      artifactTypeMap={this.state.artifactTypeMap}
    />
    <SectionIO
      title={'Inputs'}
      artifactIds={this.state.events[Event.Type.INPUT]}
      artifactTypeMap={this.state.artifactTypeMap}
    />
    <SectionIO
      title={'Declared Outputs'}
      artifactIds={this.state.events[Event.Type.DECLARED_OUTPUT]}
      artifactTypeMap={this.state.artifactTypeMap}
    />
    <SectionIO
      title={'Outputs'}
      artifactIds={this.state.events[Event.Type.OUTPUT]}
      artifactTypeMap={this.state.artifactTypeMap}
    />

? How should we merge them into one? Possible solution: move the INPUT/OUTPUT/DECLARED_INPUT/DECLARED_OUTPUT sections to the Input/Output tab, and show only Properties/Custom Properties in the ML Metadata tab.

In KFP v1, the Input/Output tab shows info parsed from Argo workflows, while the ML Metadata tab shows info from MLMD. In v2 and v2 compatible mode, both will come from MLMD (so they duplicate each other), and some merging or rearrangement of information is necessary, as you thought.

Here's my gut-feeling arrangement (mostly similar to your proposal); feel free to discuss:

Input/output tab

shows info from MLMD
in addition to showing preview + download link, we can add a link to MLMD artifact details page

ML Metadata tab

I suggest removing the tab altogether, because in KFP v2 compatible mode we do not allow users to customize execution metadata/custom properties, so there's not much left to show.

A link to the execution details page should probably be shown all the time (e.g. as the side content title, see below).

How do we differentiate Parameter and Artifact from MLMD?

In the KFP MLMD data model, input parameters are logged as input:<parameter-name> custom properties of the execution.
Output parameters are logged as output:<parameter-name> custom properties.

Similar to PR #5793, we will soon standardize on moving parameters into fields of a custom property named metadata. metadata is of Struct type, so it can include key-value pairs like input:<param> and output:<param>, as mentioned above.
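As a rough illustration of the prefix convention described above, the sketch below separates parameters from other custom properties by key prefix. It assumes the custom properties have already been flattened into a plain key/value object; `CustomProperties`, `ParameterSets`, and `splitParameters` are hypothetical names, not part of the KFP frontend or the MLMD client.

```typescript
// Hypothetical flattened view of an execution's custom properties.
type PropertyValue = string | number | boolean;
type CustomProperties = { [name: string]: PropertyValue };

interface ParameterSets {
  inputs: CustomProperties;
  outputs: CustomProperties;
}

// Key prefixes taken from the convention described in this thread.
const INPUT_PREFIX = 'input:';
const OUTPUT_PREFIX = 'output:';

function hasPrefix(key: string, prefix: string): boolean {
  return key.slice(0, prefix.length) === prefix;
}

// Split custom properties into input and output parameters,
// stripping the prefix so only the parameter name remains.
function splitParameters(props: CustomProperties): ParameterSets {
  const result: ParameterSets = { inputs: {}, outputs: {} };
  for (const key of Object.keys(props)) {
    if (hasPrefix(key, INPUT_PREFIX)) {
      result.inputs[key.slice(INPUT_PREFIX.length)] = props[key];
    } else if (hasPrefix(key, OUTPUT_PREFIX)) {
      result.outputs[key.slice(OUTPUT_PREFIX.length)] = props[key];
    }
    // Any other custom property is not a parameter and is skipped.
  }
  return result;
}
```

If parameters later move into a Struct-typed metadata property, the same split could be applied to the fields of that Struct instead of the top-level custom properties.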

Artifacts are what you already observed in the ML Metadata tab; they are connected to executions by events.

What is the relationship between DECLARED_INPUT and INPUT? How to show them in static pipeline mode?

Answer (ref from proto file):

For example, the DECLARED_INPUT and DECLARED_OUTPUT events are part of the signature of an execution

https://github.com/google/ml-metadata/blob/47150524ee5ceee9766a034c4fbe5427440dd79e/ml_metadata/proto/metadata_store.proto#L100-L138

Thanks for the question. After re-reading the documentation, I realized I had a wrong understanding of DECLARED_INPUT. All inputs/outputs of KFP tasks should be declared inputs/outputs, because they are part of the KFP component signature. Non-declared inputs/outputs are a concept only TFX uses.

However, because KFP does not have the non-declared inputs/outputs concept, until now we have been logging all inputs and outputs as plain inputs and outputs (INPUT/OUTPUT events). We need to confirm whether this is something we need to change.

Ref: MLMD Terminology section of KFP v2 design
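One way the UI could stay agnostic to whether KFP logs INPUT or DECLARED_INPUT is to map both onto the same section, and to hide the INTERNAL_* types that the proto comment describes as not worth displaying to users. This is only a sketch of that idea; `IoSection` and `sectionFor` are hypothetical names and this is not necessarily how the frontend handles it.

```typescript
// Event type names as listed in the MLMD proto comment.
type EventType =
  | 'DECLARED_INPUT'
  | 'DECLARED_OUTPUT'
  | 'INPUT'
  | 'OUTPUT'
  | 'INTERNAL_INPUT'
  | 'INTERNAL_OUTPUT';

// Hypothetical UI buckets for the Input/Output tab.
type IoSection = 'input' | 'output' | 'hidden';

// Treat declared and non-declared events alike; hide orchestrator internals.
function sectionFor(type: EventType): IoSection {
  switch (type) {
    case 'DECLARED_INPUT':
    case 'INPUT':
      return 'input';
    case 'DECLARED_OUTPUT':
    case 'OUTPUT':
      return 'output';
    default:
      return 'hidden';
  }
}
```

With this mapping, a later switch from INPUT to DECLARED_INPUT on the logging side would not change what the tab renders.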

…5670 (#5859) * feat(frontend) Support Input/Output from MLMD for V2-compatible * fix test * address nit comments * Artifact Preview component, use events to get artifact name. * comment and UX rework * downloadable link

google-oss-robot assigned zijianjoy May 18, 2021

Bobgy mentioned this issue May 18, 2021

KFP UI with KFP semantics #5675

Closed

1 task

Bobgy added the size/M label May 24, 2021

zijianjoy mentioned this issue Jun 15, 2021

feat(frontend) Support Input/Output from MLMD for V2-compatible. Fix #5670 #5859

Merged

1 task

google-oss-robot closed this as completed in #5859 Jun 25, 2021
