Autoscaler issue fixes; Episode 2 #7
base: main
@@ -554,7 +558,7 @@ def scale(self, replicas: int, metrics: dict) -> int:
 scale_out_interval=10,
 scale_in_interval=10,
 max_batch_size=8, # for auto batching
-timeout_batching=1, # for auto batching
+max_batch_delay=1, # for auto batching
batch_wait_interval would be slightly more explicit and consistent with the arguments above.
@@ -598,20 +601,22 @@ def __init__(
self._last_autoscale = time.time()
self.fake_trigger = 0

for _ in range(min_replicas):
For another PR: we need to clean this up with proper structures. This isn't beginner-friendly.
@@ -1,12 +1,12 @@
 # !pip install 'git+https://github.com/Lightning-AI/stablediffusion.git@lit'
-# !pip install 'git+https://github.com/Lightning-AI/DiffusionWithAutoscaler.git'
+# !pip install 'git+https://github.com/Lightning-AI/DiffusionWithAutoscaler.git@debugging2'
To be changed before merging.
@@ -506,7 +506,11 @@ class AutoScaler(LightningFlow):
 scale_in_interval: The number of seconds to wait before checking whether to decrease the number of servers.
 endpoint: Provide the REST API path.
 max_batch_size: (auto-batching) The number of requests to process at once.
-timeout_batching: (auto-batching) The number of seconds to wait before sending the requests to process.
+max_batch_delay: (auto-batching) The number of seconds to wait before sending the requests to workers.
+request_timeout: The number of seconds to wait before timing out a request. A request may timeout because of
Did you add this already? Should it depend on the number of elements in the batch?
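For reference while discussing these arguments, here is a minimal sketch of how `max_batch_size` and `max_batch_delay` could interact, assuming requests arrive on a standard `queue.Queue`. The helper name `collect_batch` is hypothetical and this is not the implementation in this PR; it only illustrates the "send when full, or when the delay expires" behavior the docstring describes.

```python
import queue
import time
from typing import Any, List


def collect_batch(requests: "queue.Queue[Any]", max_batch_size: int, max_batch_delay: float) -> List[Any]:
    """Gather up to max_batch_size requests, waiting at most max_batch_delay seconds.

    Hypothetical helper for illustration only; not part of the AutoScaler API.
    """
    batch: List[Any] = []
    # A single deadline for the whole batch, so the first request never waits
    # longer than max_batch_delay before being sent to a worker.
    deadline = time.monotonic() + max_batch_delay
    while len(batch) < max_batch_size:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # delay expired: send whatever has been collected
        try:
            batch.append(requests.get(timeout=remaining))
        except queue.Empty:
            break  # nothing else arrived before the deadline
    return batch
```

With ten queued requests, `max_batch_size=8`, and any positive delay, the first call returns eight requests immediately and the second returns the remaining two once the delay runs out. Under this scheme a request can spend up to `max_batch_delay` seconds queued before any work starts, which is one reason a per-request timeout might need to account for batching.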