Automate ensembler image building with Turing API (#170)
* Fix bug in Dockerfile and Makefile involving passing of env variables

* Remove dockerfile command to install ensembler dependencies in base image

* Add new config field to store references to saved pyfunc ensemblers

* Add ensembler type and config verification

* Add resource request and timeout limits to pyfunc ensembler option

* Add template for ensembler image building option

* Move getensemblerdirectory method from ensembling_job_service to ensembler_service

* Add new configs for pyfunc ensembler service building

* Add config field for ensembler service

* Add holders in appcontext to carry an ensembler service builder

* Make router deploy pyfunc ensembler as if it were a normal ensembler with docker configs

* Make deployment controller utilise pyfunc configs to generate docker image and configs

* Refactor tests with changes to ensembler config schema

* Add tests for imagebuilder for ensembler service

* Revert changes to get ensembler directory function

* Refactor appcontext test

* Create config class for resource request and timeout

* Fix migration script for new py_func_ref_config

* Fix lint comments

* Fix lint import formatting

* Rename pyfunc ensembler type and regen openapi

* Fix bug in ensembler Dockerfile

* Refactor Dockerfile and fix 8083 port for ensembler

* Fix port for ensembler in turing api

* Fix output message

* Update router deployment logs messages

* Increase specificity of error log in router deployment

* Refactor imagerequest and use a name that reflects each unique ensembler version

* Specify pyfunc ensembler images to have name based on their versions

* Add additional checks on ensemblers to api handlers

* Add mocks to existing tests after refactoring

* Add ensembler service configs to turing chart

* Add test for requests with ensemblers that cannot be found

* Update e2e test data

* Revert combination of resource request and timeout into a single object

* Regenerate openapi objects

* Revert changes to e2e test data

* Shift variables to uppercase for consistency

* Add ensembler service to example config

* Simplify yaml configs with anchors and aliases

* Rename py_func_ref_config to pyfunc_config

* Rename pyfunc_config to make it consistent with camelcase PyfuncConfig

* Refactor naming of pyfunc config in openapi

* Clean up openapi specs

* Remove redundant schema from openapi spec

* Remove redundant check on dockerconfig

* Remove unnecessary logical checks for router deployment

* Remove modifications left by IDE

* Remove go build tags

* Remove go build tag from integration_config

* Refactor image building step into DeployRouterVersion process

* Reformat code after lint tests

* Rework image/docker naming convention to include run ids

* Remove redundant dockerfile layer

* Clean up OpenAPI specs

* Add default values to docker_config when using pyfunc ensemblers

* Fix broken test
deadlycoconuts authored Mar 7, 2022
1 parent a3d788d commit 44d4782
Showing 47 changed files with 1,441 additions and 254 deletions.
137 changes: 111 additions & 26 deletions api/api/openapi.bundle.yaml
@@ -551,7 +551,7 @@ paths:
content:
application/json:
schema:
-$ref: '#/components/schemas/inline_response_200'
+$ref: '#/components/schemas/IdObject_1'
description: OK
"400":
description: Invalid project_id or router_id
@@ -651,7 +651,7 @@ paths:
content:
application/json:
schema:
-$ref: '#/components/schemas/inline_response_202'
+$ref: '#/components/schemas/RouterIdAndVersion'
description: Accepted
"400":
description: Invalid project_id, router_id or deploy request
@@ -682,7 +682,7 @@ paths:
content:
application/json:
schema:
-$ref: '#/components/schemas/inline_response_200'
+$ref: '#/components/schemas/IdObject_1'
description: OK
"400":
description: Invalid project_id or router_id
@@ -751,11 +751,11 @@ paths:
format: int32
type: integer
responses:
-"200":
+"202":
content:
application/json:
schema:
-$ref: '#/components/schemas/inline_response_202'
+$ref: '#/components/schemas/RouterIdAndVersion'
description: OK
"400":
description: Invalid project_id, router_id or version
@@ -832,7 +832,7 @@ paths:
content:
application/json:
schema:
-$ref: '#/components/schemas/inline_response_202'
+$ref: '#/components/schemas/RouterIdAndVersion'
description: Accepted
"400":
description: Invalid project_id, router_id, version_id or deploy request
@@ -1346,6 +1346,7 @@ components:
error: error
infra_config:
service_account_name: service_account_name
run_id: run_id
ensembler_name: ensembler_name
resources:
driver_cpu_request: driver_cpu_request
@@ -1400,6 +1401,7 @@ components:
EnsemblerInfraConfig:
example:
service_account_name: service_account_name
run_id: run_id
ensembler_name: ensembler_name
resources:
driver_cpu_request: driver_cpu_request
@@ -1423,6 +1425,9 @@ components:
x-go-custom-tag: validate:"required"
resources:
$ref: '#/components/schemas/EnsemblingResources'
run_id:
readOnly: true
type: string
env:
items:
$ref: '#/components/schemas/EnvVar'
@@ -1900,6 +1905,15 @@ components:
cpu_request: cpu_request
id: 0
ensembler:
pyfunc_config:
project_id: 7
ensembler_id: 9
resource_request:
min_replica: 0
max_replica: 6
memory_request: memory_request
cpu_request: cpu_request
timeout: timeout
updated_at: 2000-01-23T04:56:07.000+00:00
standard_config:
experiment_mappings:
@@ -1961,6 +1975,7 @@ components:
resource_request:
$ref: '#/components/schemas/ResourceRequest'
timeout:
pattern: ^[0-9]+(ms|s|m|h)$
type: string
log_config:
$ref: '#/components/schemas/RouterVersion_log_config'
@@ -2001,6 +2016,7 @@ components:
endpoint:
type: string
timeout:
pattern: ^[0-9]+(ms|s|m|h)$
type: string
annotations:
nullable: true
@@ -2023,8 +2039,10 @@ components:
max_replica:
type: integer
cpu_request:
pattern: ^(\d{1,3}(\.\d{1,3})?)$|^(\d{2,5}m)$
type: string
memory_request:
pattern: ^\d+(Ei?|Pi?|Ti?|Gi?|Mi?|Ki?)?$
type: string
type: object
LogLevel:
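The `pattern` constraints added to `timeout`, `cpu_request` and `memory_request` in the hunks above can be tried out outside the API. The following is a minimal sketch in Python, assuming the patterns behave the same under Python's `re` module as under the validator generated from the OpenAPI spec; the helper names are hypothetical and are not part of Turing.

```python
import re

# Patterns copied verbatim from the OpenAPI spec in this commit.
TIMEOUT_PATTERN = re.compile(r"^[0-9]+(ms|s|m|h)$")
CPU_REQUEST_PATTERN = re.compile(r"^(\d{1,3}(\.\d{1,3})?)$|^(\d{2,5}m)$")
MEMORY_REQUEST_PATTERN = re.compile(r"^\d+(Ei?|Pi?|Ti?|Gi?|Mi?|Ki?)?$")


def is_valid_timeout(value: str) -> bool:
    """True for an integer followed by ms, s, m or h (hypothetical helper)."""
    return TIMEOUT_PATTERN.fullmatch(value) is not None


def is_valid_cpu_request(value: str) -> bool:
    """True for up to 3 digits with an optional fraction, or 2-5 digits plus 'm'."""
    return CPU_REQUEST_PATTERN.fullmatch(value) is not None


def is_valid_memory_request(value: str) -> bool:
    """True for an integer with an optional E/P/T/G/M/K or Ei/Pi/.../Ki suffix."""
    return MEMORY_REQUEST_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ("100ms", "5", "1.5s"):
        print(candidate, is_valid_timeout(candidate))
```

Note that the CPU pattern, read literally, rejects a single-digit millicore value such as `5m`, because the second alternative requires at least two digits.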
@@ -2047,8 +2065,10 @@ components:
batch_load: true
table: table
service_account_secret: service_account_secret
nullable: true
properties:
table:
pattern: ^[a-zA-Z][a-zA-Z0-9-]+\.\w+([_]?\w)+\.\w+([_]?\w)+$
type: string
service_account_secret:
type: string
@@ -2064,12 +2084,15 @@ components:
brokers: brokers
topic: topic
serialization_format: json
nullable: true
properties:
brokers:
description: Comma-separated list of host and port pairs that are the addresses
of the Kafka brokers.
pattern: ^([a-zA-Z]+:\/\/)?\[?([0-9a-zA-Z\-%._:]*)\]?:([0-9]+)(,([a-zA-Z]+:\/\/)?\[?([0-9a-zA-Z\-%._:]*)\]?:([0-9]+))*$
type: string
topic:
pattern: ^[A-Za-z0-9_.-]{1,249}$
type: string
serialization_format:
enum:
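The BigQuery `table`, Kafka `brokers` and Kafka `topic` patterns introduced above are easier to read against concrete inputs. This is a rough sketch in Python, again assuming the regular expressions carry over unchanged to Python's `re` module; the function names and sample values are illustrative only.

```python
import re

# Patterns copied verbatim from the OpenAPI spec in this commit.
BQ_TABLE_PATTERN = re.compile(
    r"^[a-zA-Z][a-zA-Z0-9-]+\.\w+([_]?\w)+\.\w+([_]?\w)+$"
)
KAFKA_BROKERS_PATTERN = re.compile(
    r"^([a-zA-Z]+:\/\/)?\[?([0-9a-zA-Z\-%._:]*)\]?:([0-9]+)"
    r"(,([a-zA-Z]+:\/\/)?\[?([0-9a-zA-Z\-%._:]*)\]?:([0-9]+))*$"
)
KAFKA_TOPIC_PATTERN = re.compile(r"^[A-Za-z0-9_.-]{1,249}$")


def is_valid_bq_table(value: str) -> bool:
    """True for a three-part name of the form project.dataset.table."""
    return BQ_TABLE_PATTERN.fullmatch(value) is not None


def is_valid_brokers(value: str) -> bool:
    """True for a comma-separated list of host:port pairs."""
    return KAFKA_BROKERS_PATTERN.fullmatch(value) is not None


def is_valid_topic(value: str) -> bool:
    """True for 1 to 249 characters drawn from letters, digits, '_', '.' and '-'."""
    return KAFKA_TOPIC_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    print(is_valid_bq_table("my-project.dataset.table"))
    print(is_valid_brokers("host1:9092,host2:9092"))
    print(is_valid_topic("my.topic-1"))
```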
@@ -2112,6 +2135,7 @@ components:
endpoint:
type: string
timeout:
pattern: ^[0-9]+(ms|s|m|h)$
type: string
port:
type: integer
@@ -2142,6 +2166,15 @@ components:
type: object
RouterEnsemblerConfig:
example:
pyfunc_config:
project_id: 7
ensembler_id: 9
resource_request:
min_replica: 0
max_replica: 6
memory_request: memory_request
cpu_request: cpu_request
timeout: timeout
updated_at: 2000-01-23T04:56:07.000+00:00
standard_config:
experiment_mappings:
@@ -2179,11 +2212,14 @@ components:
enum:
- standard
- docker
- pyfunc
type: string
standard_config:
$ref: '#/components/schemas/EnsemblerStandardConfig'
docker_config:
$ref: '#/components/schemas/EnsemblerDockerConfig'
pyfunc_config:
$ref: '#/components/schemas/EnsemblerPyfuncConfig'
created_at:
format: date-time
readOnly: true
@@ -2235,12 +2271,14 @@ components:
nullable: true
properties:
image:
pattern: ^([a-zA-Z0-9]+(?:[._-][a-zA-Z0-9]+)*(?::\d{2,5})?\/)?([a-zA-Z0-9]+(?:[._-][a-zA-Z0-9]+)*\/)*([a-zA-Z0-9]+(?:[._-][a-zA-Z0-9]+)*)(?::[a-zA-Z0-9]+(?:[._-][a-zA-Z0-9]+)*)?$
type: string
resource_request:
$ref: '#/components/schemas/ResourceRequest'
endpoint:
type: string
timeout:
pattern: ^[0-9]+(ms|s|m|h)$
type: string
port:
type: integer
@@ -2261,6 +2299,34 @@ components:
- resource_request
- timeout
type: object
EnsemblerPyfuncConfig:
description: ensembler config when ensembler type is pyfunc
example:
project_id: 7
ensembler_id: 9
resource_request:
min_replica: 0
max_replica: 6
memory_request: memory_request
cpu_request: cpu_request
timeout: timeout
nullable: true
properties:
project_id:
type: integer
ensembler_id:
type: integer
resource_request:
$ref: '#/components/schemas/ResourceRequest'
timeout:
pattern: ^[0-9]+(ms|s|m|h)$
type: string
required:
- ensembler_id
- project_id
- resource_request
- timeout
type: object
TrafficRule:
example:
routes:
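The new `EnsemblerPyfuncConfig` schema above lists four required fields. As a sketch of what a conforming `ensembler` object might look like, the Python below builds one and checks it against that list. The payload values are made up (the IDs reuse the spec's example values), and `missing_pyfunc_fields` is a hypothetical helper, not a Turing function.

```python
from typing import Any, Dict, List

# Required fields of EnsemblerPyfuncConfig, as listed in the spec.
REQUIRED_PYFUNC_FIELDS = ("ensembler_id", "project_id", "resource_request", "timeout")


def missing_pyfunc_fields(ensembler: Dict[str, Any]) -> List[str]:
    """Return the required pyfunc_config fields absent from an ensembler payload."""
    config = ensembler.get("pyfunc_config") or {}
    return [field for field in REQUIRED_PYFUNC_FIELDS if field not in config]


# Illustrative payload only: type "pyfunc" plus a pyfunc_config object.
ensembler = {
    "type": "pyfunc",
    "pyfunc_config": {
        "project_id": 7,
        "ensembler_id": 9,
        "resource_request": {
            "min_replica": 0,
            "max_replica": 6,
            "cpu_request": "500m",
            "memory_request": "512Mi",
        },
        "timeout": "100ms",
    },
}

if __name__ == "__main__":
    print(missing_pyfunc_fields(ensembler))
```

The check covers only field presence; the `pattern` constraints on `timeout` and on the `resource_request` fields would need separate validation.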
@@ -2390,6 +2456,15 @@ components:
memory_request: memory_request
cpu_request: cpu_request
ensembler:
pyfunc_config:
project_id: 7
ensembler_id: 9
resource_request:
min_replica: 0
max_replica: 6
memory_request: memory_request
cpu_request: cpu_request
timeout: timeout
updated_at: 2000-01-23T04:56:07.000+00:00
standard_config:
experiment_mappings:
@@ -2436,6 +2511,7 @@ components:
environment_name:
type: string
name:
pattern: ^[a-z0-9-]*$
type: string
config:
$ref: '#/components/schemas/RouterConfig_config'
@@ -2444,6 +2520,20 @@ components:
- environment_name
- name
type: object
IdObject_1:
$ref: '#/components/schemas/IdObject'
RouterIdAndVersion:
example:
router_id: 0
version: 6
properties:
router_id:
format: int32
type: integer
version:
format: int32
type: integer
type: object
Event:
example:
event_type: info
@@ -2767,13 +2857,18 @@ components:
value: value
properties:
name:
pattern: ^[a-zA-Z0-9_]*$
type: string
value:
type: string
required:
- name
type: object
ExperimentConfig:
additionalProperties:
nullable: true
readOnly: true
type: object
example:
type: nop
config: '{}'
@@ -2794,26 +2889,6 @@ components:
- header
- payload
type: string
-inline_response_200:
-example:
-router_id: 0
-properties:
-router_id:
-format: int32
-type: integer
-type: object
-inline_response_202:
-example:
-router_id: 0
-version: 6
-properties:
-router_id:
-format: int32
-type: integer
-version:
-format: int32
-type: integer
-type: object
EnsemblersPaginatedResults_allOf:
properties:
paging:
@@ -3023,6 +3098,15 @@ components:
memory_request: memory_request
cpu_request: cpu_request
ensembler:
pyfunc_config:
project_id: 7
ensembler_id: 9
resource_request:
min_replica: 0
max_replica: 6
memory_request: memory_request
cpu_request: cpu_request
timeout: timeout
updated_at: 2000-01-23T04:56:07.000+00:00
standard_config:
experiment_mappings:
@@ -3080,6 +3164,7 @@ components:
resource_request:
$ref: '#/components/schemas/ResourceRequest'
timeout:
pattern: ^[0-9]+(ms|s|m|h)$
type: string
log_config:
$ref: '#/components/schemas/RouterConfig_config_log_config'
3 changes: 3 additions & 0 deletions api/api/specs/jobs.yaml
@@ -418,6 +418,9 @@ components:
x-go-custom-tag: validate:"required"
resources:
$ref: "#/components/schemas/EnsemblingResources"
run_id:
type: string
readOnly: true
env:
type: array
items: