Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: support metrics log stream #28

Merged
merged 3 commits into from
Feb 28, 2025
Merged

Conversation

airkei
Copy link
Contributor

@airkei airkei commented Jan 8, 2025

Why

https://tier4.atlassian.net/browse/RT4-13461
This PR support a new log group for metrics on CloudWatch({self.profile}-edge-otaclient-metrics).
The reason why we switch the log group for metrics is that AWS subscription filter apply its filter based on the target log group level. Thus, by separating as log group, cloudwatch can apply the filter efficiently.

What

Change the existing data format in the queue and add LogGroupType field.

  • before: Queue[tuple[ecu_id, LogMessage]]
  • after: Queue[tuple[LogGroupType(enum), ecu_id, LogMessage]]

Once logging thread receives the log group type from the queue, handle the target CloudWatch log group based on the incoming parameter. The new log group and stream name will be

  • log
    • group
      /aws/greengrass/edge/{self.region}/{self.account_id}/{self.profile}-edge-otaclient (same as the existing)
    • stream
      yyyy/MM/dd/fms-${stage}-edge-${vehicle_id}-Core}/${ecu_id} (same as the existing)
  • metrics
    • group
      /aws/greengrass/edge/{self.region}/{self.account_id}/{self.profile}-edge-otaclient-metrics
    • stream
      same as the existing log stream name

NOTE: Metrics is supported only in gRPC interface. In the existing HTTP interface, there is no way to use metrics.

Test

  • Verified that the modified pytests are passed.
  • Verified E2E tests(the log was output as expected from ECU to CloudWatch)

Comment on lines 240 to 247
@computed_field
@property
def aws_cloudwatch_metrics_log_group(self) -> str:
return (
f"/aws/greengrass/edge/{self.region}/"
f"{self.account_id}/{self.profile}-edge-otaclient-metrics"
)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new log group name for metrics

from queue import Queue
from typing import Literal, TypedDict

from typing_extensions import NotRequired, TypeAlias

LogsQueue: TypeAlias = "Queue[tuple[str, LogMessage]]"
LogsQueue: TypeAlias = "Queue[tuple[LogGroupType, str, LogMessage]]"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change the data format of the queue.

@@ -76,10 +76,13 @@ def generate_random_msgs(
) -> list[tuple[str, LogMessage]]:
_res: list[tuple[str, LogMessage]] = []
for _ in range(msg_num):
_ecu, *_ = random.sample(ecus_list, 1)
_ecu_id, *_ = random.sample(ecus_list, 1)
_log_group_type = random.choice(list(LogGroupType))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test both log group type.

@airkei airkei changed the title feat: support metrics log stream DNM: feat: support metrics log stream Jan 9, 2025
@airkei airkei force-pushed the feat/support_metrics_log_stream branch from cce23c4 to 33972d8 Compare January 16, 2025 02:36
@airkei airkei force-pushed the feat/support_metrics_log_stream branch from 33972d8 to 3e59bb5 Compare February 26, 2025 08:33
Copy link
Contributor

github-actions bot commented Feb 26, 2025

Coverage

Coverage Report
FileStmtsMissCoverMissing
src/otaclient_iot_logging_server
   __init__.py30100% 
   __main__.py19194%52
   _common.py190100% 
   _log_setting.py261061%63, 65–66, 68–69, 73–74, 77–78, 80
   _sd_notify.py33875%42, 52, 57–59, 65–67
   _utils.py54296%73, 137
   _version.py90100% 
   aws_iot_logger.py1055844%70–72, 74–75, 78, 81–82, 85, 91, 95–102, 105–106, 109, 112–117, 121–124, 128–130, 133–135, 138–143, 158, 164–166, 168–172, 176, 226, 233–236
   boto3_session.py35974%50, 58–59, 61, 76–77, 81, 83, 91
   config_file_monitor.py44686%64–66, 83–85
   configs.py46197%75
   ecu_info.py37197%75
   greengrass_config.py101595%155, 274–277
   log_proxy_server.py482939%46–47, 49–51, 54–56, 59–60, 64, 67, 69–70, 73, 76, 80–82, 84–85, 89–90, 97–98, 100–101, 106, 112
   servicer.py59591%59, 101–103, 119
src/otaclient_iot_logging_server/v1
   __init__.py10100% 
   _types.py470100% 
   api_stub.py140100% 
TOTAL70013580% 

Tests Skipped Failures Errors Time
53 0 💤 0 ❌ 0 🔥 16.909s ⏱️

@airkei airkei changed the title DNM: feat: support metrics log stream feat: support metrics log stream Feb 26, 2025
Comment on lines +90 to +110
for log_group_name in log_group_names:
try:
client.create_log_group(logGroupName=log_group_name)
logger.info(f"{log_group_name=} has been created")
except exc_types.ResourceAlreadyExistsException as e:
logger.debug(
f"{log_group_name=} already existed, skip creating: {e.response}"
)
raise e.__cause__ from None
logger.error(f"failed to create {log_group_name=}: {e!r}")
raise
except Exception as e:
logger.error(f"failed to create {log_group_name=}: {e!r}")
raise
except ValueError as e:
if e.__cause__ and isinstance(
e.__cause__, awscrt.exceptions.AwsCrtError
):
logger.error(
(f"failed to create mtls connection to remote: {e.__cause__}")
)
raise e.__cause__ from None
logger.error(f"failed to create {log_group_name=}: {e!r}")
raise
except Exception as e:
logger.error(f"failed to create {log_group_name=}: {e!r}")
raise
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Create both AWS CloudWatch log groups for LOG and METRICS.

@airkei airkei force-pushed the feat/support_metrics_log_stream branch from 3e59bb5 to 9b1a809 Compare February 26, 2025 10:12
@@ -86,7 +97,7 @@ async def _put_log(
)
# logger.debug(f"receive log from {ecu_id}: {_logging_msg}")
try:
self._queue.put_nowait((ecu_id, _logging_msg))
self._queue.put_nowait((_logging_group_type, ecu_id, _logging_msg))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

put group type to queue, then aws_iot_logger thread will get and handle it.

Comment on lines +201 to +203
_log_group_type, _ecu_id, _log_msg = self._queue.get_nowait()
# always log type is LOG in HTTP
assert _log_group_type == LogGroupType.LOG
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

verify log group for HTTP.

Comment on lines +275 to +277
assert _log_group_type == convert_from_log_type_to_log_group_type(
item.log_type
)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

verify log group for gRPC.

@airkei airkei self-assigned this Feb 26, 2025
@airkei airkei marked this pull request as ready for review February 26, 2025 10:16
@airkei airkei requested a review from a team as a code owner February 26, 2025 10:16
Copy link
Member

@Bodong-Yang Bodong-Yang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! Overall LGTM, only some minor comments related to variables' name of logging group.

Comment on lines 74 to 77
self._session_config = session_config
self._log_group_name = session_config.aws_cloudwatch_log_group
self._metrics_group_name = session_config.aws_cloudwatch_metrics_log_group
self._interval = interval
Copy link
Member

@Bodong-Yang Bodong-Yang Feb 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About the variable naming here, previously we only have one log_group, so log_group equals to otaclient_logs log_group.
But now we have two log_groups, it becomes kind of ambiguous, (otaclient_logs_)log_group and (otaclient_)metrics_group are both (aws_cloudwatch_)log_group.

I suggest that we prefix a otaclient_logs to the variable name related to (otaclient_logs_)log_group

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, agree.
Long formalized clear name should be better than short unclear name, fixed in the latest commit.

Comment on lines 240 to 247
@computed_field
@property
def aws_cloudwatch_metrics_log_group(self) -> str:
return (
f"/aws/greengrass/edge/{self.region}/"
f"{self.account_id}/{self.profile}-edge-otaclient-metrics"
)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggests that also change the previous aws_cloudwatch_log_group property name into aws_cloudwatch_otaclient_logs_log_group as we have two log groups now.

And to make naming schema matching, we can also rename aws_cloudwatch_metris_log_group into aws_cloudwatch_otaclient_metrics_log_group.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, fixed.

except Empty:
break

for log_stream_suffix, logs in message_dict.items():
for (log_group_type, log_stream_suffix), logs in message_dict.items():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice one, I just know that we can directly unpack tuple here!

@airkei airkei requested a review from Bodong-Yang February 27, 2025 10:31
Copy link
Member

@Bodong-Yang Bodong-Yang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!
By the way, is the AWS configured to have metrics_groups created or to allow otaclient creates the metrics_group?

UPDATED: seems the case is AWS is configured to allow greengrass web.auto agent create log_group with any names. Since iot-logger also uses greengrass cert, we can do log_group creation.

( Although I think it is not a good practice to do so, I think FMS should define log_groups with IaC in advance, instead of letting edge to do so 🤔

@airkei
Copy link
Contributor Author

airkei commented Feb 28, 2025

By the way, is the AWS configured to have metrics_groups created or to allow otaclient creates the metrics_group?
UPDATED: seems the case is AWS is configured to allow greengrass web.auto agent create log_group with any names.

Yes, correct. In the test, verified the target log group is created from iot-logger as expected.
image

( Although I think it is not a good practice to do so, I think FMS should define log_groups with IaC in advance, instead of letting edge to do so 🤔

Yes, correct. Ideally, CloudWatch group should be created from template first, then iot-logger just puts the log to it. Otherwise, resource(log group) will be created outside of the control of IaC like this case.

@Bodong-Yang
Copy link
Member

Yes, correct. Ideally, CloudWatch group should be created from template first, then iot-logger just puts the log to it. Otherwise, resource(log group) will be created outside of the control of IaC like this case.

Let's consult with FMS team later about this topic 👍, seems like the current settings in IaC have been there for quite a long time, time to update it.

@airkei airkei merged commit bfee052 into main Feb 28, 2025
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants