Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove Application from helm chart #181

Merged
merged 1 commit into from
Jul 6, 2018
Merged

Remove Application from helm chart #181

merged 1 commit into from
Jul 6, 2018

Conversation

ukclivecox
Copy link
Contributor

No description provided.

@ukclivecox ukclivecox merged commit 57b87f3 into SeldonIO:master Jul 6, 2018
agrski added a commit that referenced this pull request Dec 2, 2022
* Initialise Go module

* Add WIP README for Hodometer (usage metrics)

* Add metrics to table in README

* Define structure for Go module

* Fix format descriptions in metrics table in README

* Rename metrics for concision in table in README

* Add metrics definitions as Go structs

* Rename metrics type for clarity

The name 'metrics' is too generic and precludes other types of metrics we might want to collect.

* Add erroneously missed package statement to Go file

* Add skeletons for collector interface & Seldon Core impl

* Rename dir for consistency with Go package name

* Update go.mod

* Remove comments from Markdown table in README

* Add constructor for Seldon Core collector

* Add method stubs for Seldon Core collector

* Add method impl for experiment metrics collection

* Use experiment metrics in overall metrics

* Add arg-parsing & validation logic in cmd package

* Rename type to avoid clashes with likely var names

* Update go.mod with new deps

* Rename main package to meet convention

* Add basic impl of main method in cmd package

This does not yet do anything with the collected metrics,
like sending them to a long-term store.

* Add method impl for pipeline metrics collection

* Update signatures of Seldon Core collector metrics methods

* Add fields to collector logging

* Add source field to collector logging

* Add method impls for server & model metrics collection

* Shorten collector logs as fields already provide context

* Add resource metrics to returned struct

* Collect resource metrics concurrently

* Reorder schedler RPC calls into logical groups

* Add RPC for scheduler status in scheduler API

* Add snapshot parameters to gRPC subscription requests in Protobuf contract

* Regenerate from Protobuf contracts

* Use snapshot mode for resource subscriptions

* Rename parameter for consistency & clarity

* Refactor metrics level to pkg from cmd

* Remove unnecessary function

This was superceded by parsing a metrics level from a string.

* Add metrics level param to collect() method

* Add condition on including resource metrics

* Use structs to collect multiple metrics per resource type

* Use uints for clear semantics in metrics declarations

* Remove unnecessary type casts

* Add feature-level metrics from server details

* Add feature-level metrics to overall results

* Allow pipeline status RPC to return multiple results

Rather than using a snapshot parameter to the pipeline subscription,
this changes the semantics of the status RPC.
On consideration, a snapshot in the context of a stream feels like a
request to bootstrap the complete state before _continuing_ the stream,
rather than a request for a limited stream.
On the other hand, the status RPC is already asking for a static view
of the system.

This is a test for just the pipeline status initially.

* Update generated files from Protos

* Formatting

* Apply single-pipeline logic to streaming status implementation

* Add unimpl'd scheduler RPC method from updated Protos

* Add very basic impl of scheduler status RPC method

* Remove unnecessary blank lines

* Add checks for non-nil metrics before using them

* Add subscriber name field to pipeline status request

Always including a subscriber name aids debug-ability, especially when there are many potential subscribers.

* Update pipeline collector to use new status request instead of subscription

* Make pipeline-counting logic more robust

This adds checks on the status of the pipeline as an aggregate of any versions it may have.
Thus, we only count pipelines which are deemed 'active'.
Necessarily, there is more error checking.

* Fix typo in pipeline activity logic

* Add collector method to fetcher scheduler details

* Add Makefile & linting rules for Hodometer

* Fix linting issue with error creation

* Add nolint directive for metrics level

The metrics level may be unused in code, but it clarifies usage for humans
and may be used in the (near) future.

* Use insecure gRPC connection to scheduler from hodometer collector

This is intended as a temporary measure; TLS is planned for all services.

* Fix scheduler server tests

Support optional strings.
Add stub pipeline status stream implementation.
Update pipeline status test to use new API.

* Add logging to metrics collection method in hodometer

* Add scheduler metrics into overall usage metrics

* Add log for requester name to pipeline status request handler

* Add methods to snapshot all pipelines in store for subscribers

* Add loggers to scheduler servers in tests

* Remove Vim swap file

* Update generated scheduler Proto file after merge

* Remove erroneously added generated Proto file

* Allow experiment status RPC to return multiple results

* Update experiment status handler to meet new API

This does not add support for having multiple pipelines, but rather changes the interface without changing behaviour.

* Fix typo in field name

* Add methods to snapshot all experiments in store for subscribers

* Update experiment collector to use new status request instead of subscription

* Allow server status RPC to return multiple results

* Remove unused Proto message definition - ServerReference

* Update generated Protobuf files

* Update server collector to use new status request instead of subscription

* Update server status handler to meet new API in scheduler

* Add methods to send snapshots of servers in store to status subscribers

* Replace call to gRPC API with store and conversion methods in server status subscription handler

It is strange to call a gRPC API from within not only the same service,
but also within the same package, wherein the relevant state and methods are available.
Secondly, the move to a response stream for server statuses would complicate the use of
gRPC, as we would need to provide an implementation of the stream interface.
Instead, we can simply use the store to retrieve the relevant data,
and convert the format of this using an accessible helper function.

* Update definition for when servers are active in collector

* Only count server resources when servers are active

* Fix overcommit counting bug by changing order of checks in collector

* Refactor server metrics logic to simplify testing

Separate protocol interactions and validation from business logic.

* Add test cases for server metrics

* Update go.mod + add go.sum

* Fix grammatical error in test case name

* Remove unnecessary details in hodometer test cases for simplicity

The ServerName field neither affects the tests nor adds insight into them,
thus is a prime candidate for removal to make the tests more legible.

* Add test cases for metrics level -> string method

* Add test cases for metrics level parsing

* Add missing type to metrics levels iota

* Change value for unrecognised metrics level when parsing

* Add case-sensitive test case for metrics level parsing

* Use case-insensitive comparison for metrics levels parsing

* Refactor pipeline metrics logic to simplify testing

* Remove blank line

* Add test cases for pipeline metrics

* Fix bug with total server memory allocation

Performing integer division by a (moderately) large divisor was truncating results to zero.
However, the intent was to have floating-point division for scaling.

* Allow model status RPC to return multiple results

* Update model collector to use new status request instead of subscription

* Update model status handler to meet new API in scheduler

* Remove unused context param to scheduler method

* Add methods to snapshot all models in store for status subscribers

* Update scheduler tests with right types & new methods for mocks

* Fix model status handler bug attempting to lock model too early

At the outset, we do not know whether a specific model has been requested or not.

* Add logs for subscriber names in status handlers

* Update model metrics collector for stylistic consistency

* Update seldon CLI for compatibility with new Protobuf contracts in scheduler

Add helper methods to avoid repetitive code.
Add subscriber name.
Add blank lines to separate logical sections.

* Add symlink to APIs for hodometer

* Add support for logging level to hodometer

* Add debug log for collected metrics

* Remove snapshot params to subscription requests as unnecessary

* Add skeleton for publishing metrics to MixPanel-compatible endpoint

* Update go.mod with MixPanel client lib dependency

* Add error return param to Publish method in interface

* Fix type error - provide event to MixPanel tracking call

* Add reflective methods to flatten usage metrics into MixPanel-compatible 'properties' map

* Add TODO on generic for upgrade to Go 1.18+

* Add test cases for struct-flattening (reflection-based) method

* Fix typo in comment

* Add usage metrics to MixPanel event as properties

* Add retry logic for publishing metrics

* Update go.mod with retry lib dependency

* Refactor retry config to package-level constants

* Add docstring for struct-flattening function to explain intended usage and limitations

* Add server replica count to metrics collected from scheduler

Also update tests with expected replica counts.

* Remove accidentally added Vim swap file

* Add env var handling for cluster ID

* Add cluster ID (as UUID) to collected metrics

* Update go.mod with UUID lib dependency

* Move field for k8s check to cluster metrics

* Remove is-k8s metrics field as redundant

Whether the cluster is on k8s or not can be inferred from whether or not the
k8s version is null or the empty string.
Removing the check field reduces code, reduces data transfer, and avoids logical
errors whereby it is inconsistent with the other k8s fields.

* Add logic for handling k8s metrics without actually collecting

* Add logging for publisher

* Allow nested pointers (to structs) in struct-flattening logic

* Add publisher into main func in hodometer

* Pass logger to retry-enabled HTTP client

* Add API key & URL for testing metrics publishing

* Add Make target for running hodometer tests

* Add Dockerfile for hodometer

* Add Make vars & targets to build Dockerfile

* Add impl of LeveledLogger for retry-enabled HTTP client

This has the benefit that logging levels are now respected,
reducing the verbosity and improving the cleanliness of logs.
The implementation is a thin shim to proxy calls to the underlying structured logger.

* Fix call to logger in publisher

The call was erroneously using the nested logger instance,
which ignored the extra fields provided by the publisher's context.

* Add punctuator for periodic actions

* Run metrics collection & publishing at set intraday intervals

* Reorder statements for legibility

* Specify Go test dirs explicitly

* Add log on successful metrics publishing

* Add build info package for hodometer

* Add logging for build info in main function

* Provide build info to build in Makefile

* Add version.txt for hodometer versioning

* Add log for next metrics run time

* Add hodometer details into published metrics

* Handle possible errors from sending on status streams

* Group status RPCs in scheduler Protobuf contract

* Use Make target in Docker build

This ensures consistency with local builds and avoids having to redefine
the same arguments.

* Fix incorrect binary path in Dockerfile

* Use same flags for Go builds & tests in Makefile
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant