Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(backend): sort by run metrics - step 3. Part of #3591 #4251

Merged
merged 38 commits into from
Jul 31, 2020
Merged
Show file tree
Hide file tree
Changes from 28 commits
Commits
Show all changes
38 commits
Select commit Hold shift + click to select a range
5b67273
enable pagination when expanding experiment in both the home page and…
jingzhang36 Jun 17, 2020
d3f8d1d
Revert "enable pagination when expanding experiment in both the home …
jingzhang36 Jun 17, 2020
601c257
Merge remote-tracking branch 'upstream/master'
jingzhang36 Jun 18, 2020
a95ce51
Merge branch 'master' of github.com:jingzhang36/pipelines
jingzhang36 Jun 22, 2020
949e99b
Merge remote-tracking branch 'upstream/master'
jingzhang36 Jun 23, 2020
2ad60e7
Merge branch 'master' of github.com:jingzhang36/pipelines
jingzhang36 Jun 23, 2020
61128e3
Merge remote-tracking branch 'upstream/master'
jingzhang36 Jul 2, 2020
0f3003b
Merge remote-tracking branch 'upstream/master'
jingzhang36 Jul 6, 2020
88f9df9
Merge remote-tracking branch 'upstream/master'
jingzhang36 Jul 7, 2020
9d65153
Merge remote-tracking branch 'upstream/master' into master
jingzhang36 Jul 8, 2020
bf9d42c
Merge branch 'master' of github.com:jingzhang36/pipelines into master
jingzhang36 Jul 8, 2020
5e49d3f
Merge remote-tracking branch 'upstream/master' into master
jingzhang36 Jul 9, 2020
cbcd57f
Merge branch 'master' of github.com:jingzhang36/pipelines into master
jingzhang36 Jul 9, 2020
983c0e1
Merge remote-tracking branch 'upstream/master' into master
jingzhang36 Jul 13, 2020
fbf9d40
Merge remote-tracking branch 'upstream/master' into master
jingzhang36 Jul 14, 2020
2ec3b1c
Merge remote-tracking branch 'upstream/master' into master
jingzhang36 Jul 17, 2020
0dbe684
sorting by run metrics is different from sorting by name, uuid, creat…
jingzhang36 Jul 17, 2020
432e9e5
Merge remote-tracking branch 'upstream/master' into master
jingzhang36 Jul 20, 2020
eca6ce6
Merge remote-tracking branch 'origin/master' into sort_metrics_2
jingzhang36 Jul 20, 2020
90b0857
unit test: add sorting on metrics with both asc and desc order
jingzhang36 Jul 20, 2020
76048b7
GetFieldValue in all models
jingzhang36 Jul 20, 2020
f5859ae
fix unit test
jingzhang36 Jul 20, 2020
dd401bf
whether to test in list_test. It's hacky when check mode == 'run'
jingzhang36 Jul 20, 2020
a75254d
Merge remote-tracking branch 'upstream/master' into master
jingzhang36 Jul 21, 2020
8b5e1c8
Merge remote-tracking branch 'origin/master' into tmp4
jingzhang36 Jul 21, 2020
1b3dcc4
move model specific code to model; prevent model package depends on l…
jingzhang36 Jul 23, 2020
80f514e
some assumption on token's Model field
jingzhang36 Jul 27, 2020
98353f9
fix the regular field checking logic
jingzhang36 Jul 27, 2020
ee22b69
Merge remote-tracking branch 'upstream/master' into master
jingzhang36 Jul 30, 2020
998ccc7
add comment to help devs to use the new field
jingzhang36 Jul 30, 2020
afc06b9
add a validation check
jingzhang36 Jul 30, 2020
be4be61
Listable object can be too large to be in token. So replace it with only
jingzhang36 Jul 30, 2020
5457634
Merge remote-tracking branch 'upstream/master' into master
jingzhang36 Jul 31, 2020
576ee2f
Merge branch 'master' of github.com:jingzhang36/pipelines into master
jingzhang36 Jul 31, 2020
69be419
Merge remote-tracking branch 'origin/master' into tmp4
jingzhang36 Jul 31, 2020
3b697ba
Merge branch 'tmp4' of github.com:jingzhang36/pipelines into tmp4
jingzhang36 Jul 31, 2020
6c71128
Merge remote-tracking branch 'upstream/master' into tmp4
jingzhang36 Jul 31, 2020
6124c82
matches func update
jingzhang36 Jul 31, 2020
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
152 changes: 88 additions & 64 deletions backend/src/apiserver/list/list.go
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,7 @@ import (
api "github.com/kubeflow/pipelines/backend/api/go_client"
"github.com/kubeflow/pipelines/backend/src/apiserver/common"
"github.com/kubeflow/pipelines/backend/src/apiserver/filter"
"github.com/kubeflow/pipelines/backend/src/apiserver/model"
"github.com/kubeflow/pipelines/backend/src/common/util"
)

Expand All @@ -44,9 +45,6 @@ type token struct {
// SortByFieldValue is the value of the sorted field of the next row to be
// returned.
SortByFieldValue interface{}
// SortByFieldIsRunMetric indicates whether the SortByFieldName field is
// a run metric field or not.
SortByFieldIsRunMetric bool

// KeyFieldName is the name of the primary key for the model being queried.
KeyFieldName string
Expand All @@ -58,10 +56,21 @@ type token struct {
IsDesc bool

// ModelName is the table where ***FieldName belongs to.
// TODO(jingzhang36): we probably can deprecate this since we now have
// Model field.
ModelName string

// Filter represents the filtering that should be applied in the query.
Filter *filter.Filter

// The listable model this token is applied to. Not used in json marshal/unmarshal.
Model Listable `json:"-"`
// ModelType and the ModelMessage are helper fields to unmarshal data correctly to
// the underlying listable model, and this underlying listable model will be stored
// in the above Model field. Those two fields are only used in token's marshal and
// unmarshal methods.
ModelType string
ModelMessage json.RawMessage
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Can you comment example values of these fields?
That helps understanding these fields.

My initial confusion when reading this: Is it an ML model or a data model?

Copy link
Contributor Author

@jingzhang36 jingzhang36 Jul 30, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually this is not about the ML terminology, but about a terminology used in backend implementation . We actually can get clues from the existing backend implementation. If we take a look at our code, there is a folder called backend/src/apiserver/model. In that folder, we have structs job, run, experiment, etc. In other words, the run, job, experiment, etc. are considered a type of "model". (I am not sure about the history here though). Moreover, all those models (aka, job, run, experiment, etc.) implement the "listable" interface. In other words, the set of jobs, (or runs, experiments) is listable.

So back to this struct token. Token is applied to a listable (aka a set of runs, jobs, experiments etc.). So, in the token, we can add this model field to indicate which model (aka, run, job, experiment) this token applies to.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I add some explanation to this field, pointing to the folder. Hope that would help.

}

func (t *token) unmarshal(pageToken string) error {
Expand All @@ -77,10 +86,63 @@ func (t *token) unmarshal(pageToken string) error {
return errorF(err)
}

if t.ModelMessage != nil {
switch t.ModelType {
case "Run":
model := &model.Run{}
err = json.Unmarshal(t.ModelMessage, model)
if err != nil {
return errorF(err)
}
t.Model = model
break
case "Job":
model := &model.Job{}
err = json.Unmarshal(t.ModelMessage, model)
if err != nil {
return errorF(err)
}
t.Model = model
break
case "Experiment":
model := &model.Experiment{}
err = json.Unmarshal(t.ModelMessage, model)
if err != nil {
return errorF(err)
}
t.Model = model
break
case "Pipeline":
model := &model.Pipeline{}
err = json.Unmarshal(t.ModelMessage, model)
if err != nil {
return errorF(err)
}
t.Model = model
break
case "PipelineVersion":
model := &model.PipelineVersion{}
err = json.Unmarshal(t.ModelMessage, model)
if err != nil {
return errorF(err)
}
t.Model = model
break
}
}

return nil
}

func (t *token) marshal() (string, error) {
// Model in a token should not be nil, because this token is created when listing a model (i.e., run, job, experiment, pipeline and pipeline version).
t.ModelType = reflect.ValueOf(t.Model).Elem().Type().Name()
modelMessage, err := json.Marshal(t.Model)
if err != nil {
return "", util.NewInternalServerError(err, "Failed to serialize the listable object in page token.")
}
t.ModelMessage = modelMessage

b, err := json.Marshal(t)
if err != nil {
return "", util.NewInternalServerError(err, "Failed to serialize page token.")
Expand All @@ -101,7 +163,7 @@ type Options struct {
// Matches returns trues if the sorting and filtering criteria in o matches that
// of the one supplied in opts.
func (o *Options) Matches(opts *Options) bool {
return o.SortByFieldName == opts.SortByFieldName && o.SortByFieldIsRunMetric == opts.SortByFieldIsRunMetric &&
return o.SortByFieldName == opts.SortByFieldName &&
o.IsDesc == opts.IsDesc &&
reflect.DeepEqual(o.Filter, opts.Filter)
}
Expand Down Expand Up @@ -136,7 +198,8 @@ func NewOptions(listable Listable, pageSize int, sortBy string, filterProto *api

token := &token{
KeyFieldName: listable.PrimaryKeyColumnName(),
ModelName: listable.GetModelName()}
ModelName: listable.GetModelName(),
Model: listable}

// Ignore the case of the letter. Split query string by space.
queryList := strings.Fields(strings.ToLower(sortBy))
Expand All @@ -147,22 +210,12 @@ func NewOptions(listable Listable, pageSize int, sortBy string, filterProto *api
}

token.SortByFieldName = listable.DefaultSortField()
token.SortByFieldIsRunMetric = false
if len(queryList) > 0 {
var err error
n, ok := listable.APIToModelFieldMap()[queryList[0]]
n, ok := listable.GetField(queryList[0])
if ok {
token.SortByFieldName = n
} else if strings.HasPrefix(queryList[0], "metric:") {
// Sorting on metrics is only available on certain runs.
model := reflect.ValueOf(listable).Elem().Type().Name()
if model != "Run" {
return nil, util.NewInvalidInputError("Invalid sorting field: %q on %q : %s", queryList[0], model, err)
}
token.SortByFieldName = queryList[0][7:]
token.SortByFieldIsRunMetric = true
} else {
return nil, util.NewInvalidInputError("Invalid sorting field: %q: %s", queryList[0], err)
return nil, util.NewInvalidInputError("Invalid sorting field: %q on listable type %s", queryList[0], reflect.ValueOf(listable).Elem().Type().Name())
}
}

Expand Down Expand Up @@ -196,18 +249,8 @@ func (o *Options) AddPaginationToSelect(sqlBuilder sq.SelectBuilder) sq.SelectBu
// AddSortingToSelect adds Order By clause.
func (o *Options) AddSortingToSelect(sqlBuilder sq.SelectBuilder) sq.SelectBuilder {
// When sorting by a direct field in the listable model (i.e., name in Run or uuid in Pipeline), a sortByFieldPrefix can be specified; when sorting by a field in an array-typed dictionary (i.e., a run metric inside the metrics in Run), a sortByFieldPrefix is not needed.
var keyFieldPrefix string
var sortByFieldPrefix string
if len(o.ModelName) == 0 {
keyFieldPrefix = ""
sortByFieldPrefix = ""
} else if o.SortByFieldIsRunMetric {
keyFieldPrefix = o.ModelName + "."
sortByFieldPrefix = ""
} else {
keyFieldPrefix = o.ModelName + "."
sortByFieldPrefix = o.ModelName + "."
}
keyFieldPrefix := o.Model.GetKeyFieldPrefix()
sortByFieldPrefix := o.Model.GetSortByFieldPrefix(o.SortByFieldName)

// If next row's value is specified, set those values in the clause.
if o.SortByFieldValue != nil && o.KeyFieldValue != nil {
Expand Down Expand Up @@ -235,19 +278,6 @@ func (o *Options) AddSortingToSelect(sqlBuilder sq.SelectBuilder) sq.SelectBuild
return sqlBuilder
}

// Add a metric as a new field to the select clause by join the passed-in SQL query with run_metrics table.
// With the metric as a field in the select clause enable sorting on this metric afterwards.
func (o *Options) AddSortByRunMetricToSelect(sqlBuilder sq.SelectBuilder) sq.SelectBuilder {
if !o.SortByFieldIsRunMetric {
return sqlBuilder
}
// TODO(jingzhang36): address the case where runs doesn't have the specified metric.
return sq.
Select("selected_runs.*, run_metrics.numbervalue as "+o.SortByFieldName).
FromSelect(sqlBuilder, "selected_runs").
LeftJoin("run_metrics ON selected_runs.uuid=run_metrics.runuuid AND run_metrics.name='" + o.SortByFieldName + "'")
}

// AddFilterToSelect adds WHERE clauses with the filtering criteria in the
// Options o to the supplied SelectBuilder, and returns the new SelectBuilder
// containing these.
Expand Down Expand Up @@ -345,6 +375,12 @@ type Listable interface {
APIToModelFieldMap() map[string]string
// GetModelName returns table name used as sort field prefix.
GetModelName() string
// Get the prefix of sorting field.
GetSortByFieldPrefix(string) string
// Get the prefix of key field.
GetKeyFieldPrefix() string
// Get a valid field for sorting/filtering in a listable model from the given string.
GetField(name string) (string, bool)
// Find the value of a given field in a listable object.
GetFieldValue(name string) interface{}
}
Expand All @@ -365,20 +401,8 @@ func (o *Options) nextPageToken(listable Listable) (*token, error) {
elemName := elem.Type().Name()

var sortByField interface{}
// TODO(jingzhang36): this if-else block can be simplified to one call to
// GetFieldValue after all the models (run, job, experiment, etc.) implement
// GetFieldValue method in listable interface.
if !o.SortByFieldIsRunMetric {
if value := elem.FieldByName(o.SortByFieldName); value.IsValid() {
sortByField = value.Interface()
} else {
return nil, util.NewInvalidInputError("cannot sort by field %q on type %q", o.SortByFieldName, elemName)
}
} else {
sortByField = listable.GetFieldValue(o.SortByFieldName)
if sortByField == nil {
return nil, util.NewInvalidInputError("Unable to find run metric %s", o.SortByFieldName)
}
if sortByField = listable.GetFieldValue(o.SortByFieldName); sortByField == nil {
return nil, util.NewInvalidInputError("cannot sort by field %q on type %q", o.SortByFieldName, elemName)
}

keyField := elem.FieldByName(listable.PrimaryKeyColumnName())
Expand All @@ -387,14 +411,14 @@ func (o *Options) nextPageToken(listable Listable) (*token, error) {
}

return &token{
SortByFieldName: o.SortByFieldName,
SortByFieldValue: sortByField,
SortByFieldIsRunMetric: o.SortByFieldIsRunMetric,
KeyFieldName: listable.PrimaryKeyColumnName(),
KeyFieldValue: keyField.Interface(),
IsDesc: o.IsDesc,
Filter: o.Filter,
ModelName: o.ModelName,
SortByFieldName: o.SortByFieldName,
SortByFieldValue: sortByField,
KeyFieldName: listable.PrimaryKeyColumnName(),
KeyFieldValue: keyField.Interface(),
IsDesc: o.IsDesc,
Filter: o.Filter,
ModelName: o.ModelName,
Model: listable,
}, nil
}

Expand Down
Loading