When performing an awslogs get operation on a log group with many streams, even if I specify the full name of the log stream I want to search, performance is either very slow or, if the group has enough streams, I get a ThrottlingException error.
I've run into this when searching logs from AWS Batch. Batch uses the same log group for all jobs, /aws/batch/jobs, and puts the output from each job into its own stream. This means the /aws/batch/jobs log group ends up with a large number of streams if you use Batch a lot. However, this shouldn't be a problem if I know which log stream I want to search.
For example, if I ran
awslogs get -GS -s 1d /aws/batch/job my-job/default/309e41b6173e4bb98171fb3529a58092
where the last argument is the complete log stream name, I would have expected fast performance, since it doesn't need to search all log streams. However, what the code actually does is treat the stream name as a regex: it lists every log stream in the group and compares each name to the given pattern. This causes a ThrottlingException for me. In other instances, where the group doesn't have quite so many streams, the problem manifests as just very slow performance.
Here is the error I get when I run the above command:
Traceback (most recent call last):
  File "/Users/ajenkins/.local/bin/awslogs", line 8, in <module>
    sys.exit(main())
  File "/Users/ajenkins/.local/pipx/venvs/awslogs/lib/python3.6/site-packages/awslogs/bin.py", line 179, in main
    getattr(logs, options.func)()
  File "/Users/ajenkins/.local/pipx/venvs/awslogs/lib/python3.6/site-packages/awslogs/core.py", line 109, in list_logs
    streams = list(self._get_streams_from_pattern(self.log_group_name, self.log_stream_name))
  File "/Users/ajenkins/.local/pipx/venvs/awslogs/lib/python3.6/site-packages/awslogs/core.py", line 102, in _get_streams_from_pattern
    for stream in self.get_streams(group):
  File "/Users/ajenkins/.local/pipx/venvs/awslogs/lib/python3.6/site-packages/awslogs/core.py", line 261, in get_streams
    for page in paginator.paginate(**kwargs):
  File "/Users/ajenkins/.local/pipx/venvs/awslogs/lib/python3.6/site-packages/botocore/paginate.py", line 255, in __iter__
    response = self._make_request(current_kwargs)
  File "/Users/ajenkins/.local/pipx/venvs/awslogs/lib/python3.6/site-packages/botocore/paginate.py", line 332, in _make_request
    return self._method(**current_kwargs)
  File "/Users/ajenkins/.local/pipx/venvs/awslogs/lib/python3.6/site-packages/botocore/client.py", line 357, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/Users/ajenkins/.local/pipx/venvs/awslogs/lib/python3.6/site-packages/botocore/client.py", line 661, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (ThrottlingException) when calling the DescribeLogStreams operation (reached max retries: 4): Rate exceeded
I've just created a pull request to fix this. I added a --stream-prefix option to awslogs get, which tells it to treat the log stream argument as a string prefix instead of as a regex. Then it can pass it as the logStreamNamePrefix argument to describe_log_streams, which results in much faster performance, since the filtering is done in AWS. This completely fixes the problem for me.
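To illustrate the difference, here is a minimal sketch of the two lookup strategies. It is not the actual awslogs code or the code from the pull request; the function names streams_by_regex and streams_by_prefix are hypothetical, and client is assumed to be a boto3 CloudWatch Logs client (for example boto3.client("logs")).

```python
import re


def streams_by_regex(client, group, pattern):
    """Slow path: page through every stream in the group, filter locally.

    Each page is one DescribeLogStreams call, so a group with many
    streams means many calls and a higher chance of being throttled.
    """
    regex = re.compile(pattern)
    paginator = client.get_paginator("describe_log_streams")
    for page in paginator.paginate(logGroupName=group):
        for stream in page.get("logStreams", []):
            if regex.search(stream["logStreamName"]):
                yield stream["logStreamName"]


def streams_by_prefix(client, group, prefix):
    """Fast path: let the service filter by name prefix.

    Only streams whose names start with `prefix` are returned, so a
    full stream name typically needs a single small request.
    """
    paginator = client.get_paginator("describe_log_streams")
    for page in paginator.paginate(logGroupName=group, logStreamNamePrefix=prefix):
        for stream in page.get("logStreams", []):
            yield stream["logStreamName"]
```

With a real client, list(streams_by_prefix(client, "/aws/batch/job", "my-job/default/309e41b6173e4bb98171fb3529a58092")) would ask the service for just the matching streams, whereas the regex version has to walk the whole group first.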