feat: support gRPC logging endpoint #27

airkei · 2025-01-08T08:44:55Z

Why

https://tier4.atlassian.net/browse/RT4-13461
This PR adds a gRPC interface alongside the existing HTTP interface.
It is not a breaking change. Metrics are not supported yet (planned for the next PR).

What

A gRPC server is created in addition to the existing HTTP server. The two run in parallel on different ports.

LISTEN_PORT for HTTP
LISTEN_PORT_GRPC for gRPC

Once the gRPC server receives a request, it puts data on the queue in the same format as the existing HTTP path; a thread then takes the data off the queue and uploads the logs to CloudWatch. This PR doesn't change the data format in the queue.
Following ota-client, the server converts the request parameters to instance variables and then handles them in the servicer.
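
As a rough illustration of the flow described above, here is a minimal sketch of a shared "put log" core that both the HTTP and gRPC front ends could call. It is not the code in this PR: the function name put_log, the numeric values of ErrorCode, and the TypedDict shape of LogMessage are assumptions made for the example. Only the error-code names, the (ecu_id, message) queue entry, and the millisecond timestamp come from the diff excerpts.

```python
import time
from enum import Enum
from queue import Full, Queue
from typing import Optional, Set, Tuple, TypedDict


class LogMessage(TypedDict):
    """Queue payload; the dict shape is an assumption for this sketch."""

    timestamp: int  # milliseconds
    message: str


class ErrorCode(Enum):
    # Names follow the PR; the numeric values are placeholders.
    NO_FAILURE = 0
    SERVER_QUEUE_FULL = 1
    NOT_ALLOWED_ECU_ID = 2
    NO_MESSAGE = 3


def put_log(
    queue: "Queue[Tuple[str, LogMessage]]",
    allowed_ecus: Optional[Set[str]],
    ecu_id: str,
    message: str,
) -> ErrorCode:
    """Validate one log line and enqueue it as (ecu_id, LogMessage)."""
    if not message:
        return ErrorCode.NO_MESSAGE
    # An empty or missing allow-list means filtering is disabled.
    if allowed_ecus and ecu_id not in allowed_ecus:
        return ErrorCode.NOT_ALLOWED_ECU_ID
    entry = LogMessage(timestamp=int(time.time()) * 1000, message=message)
    try:
        queue.put_nowait((ecu_id, entry))
    except Full:
        return ErrorCode.SERVER_QUEUE_FULL
    return ErrorCode.NO_FAILURE
```

Each front end would then translate the returned ErrorCode into its own response type (an HTTP status or a gRPC PutLogResponse), so the consumer thread sees the same queue entries whichever interface received the log.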

Test

Verified that the existing pytest tests and the new tests/test_log_proxy_server.py tests pass.
Verified that logs were output to CloudWatch as expected from an ECU (VM).

airkei · 2025-01-09T00:24:23Z

src/otaclient_iot_logging_server/log_proxy_server.py

-class LoggingPostHandler:
-    """A simple aiohttp server handler that receives logs from otaclient."""
-
-    def __init__(self, queue: LogsQueue) -> None:
-        self._queue = queue
-        self._allowed_ecus = None
-
-        if ecu_info:
-            self._allowed_ecus = ecu_info.ecu_id_set
-            logger.info(
-                f"setup allowed_ecu_id from ecu_info.yaml: {ecu_info.ecu_id_set}"
-            )
-        else:
-            logger.warning(
-                "no ecu_info.yaml presented, logging upload filtering is DISABLED"
-            )
-
-    # route: POST /{ecu_id}
-    async def logging_post_handler(self, request: Request):
-        """
-        NOTE: use <ecu_id> as log_stream_suffix, each ECU has its own
-              logging stream for uploading.
-        """
-        _ecu_id = request.match_info["ecu_id"]
-        _raw_logging = await request.text()
-        _allowed_ecus = self._allowed_ecus
-        # don't allow empty request or unknowned ECUs
-        # if ECU id is unknown(not listed in ecu_info.yaml), drop this log.
-        if not _raw_logging or (_allowed_ecus and _ecu_id not in _allowed_ecus):
-            return web.Response(status=HTTPStatus.BAD_REQUEST)
-
-        _logging_msg = LogMessage(
-            timestamp=int(time.time()) * 1000,  # milliseconds
-            message=_raw_logging,
-        )
-        # logger.debug(f"receive log from {_ecu_id}: {_logging_msg}")
-        try:
-            self._queue.put_nowait((_ecu_id, _logging_msg))
-        except Full:
-            logger.debug(f"message dropped: {_logging_msg}")
-            return web.Response(status=HTTPStatus.SERVICE_UNAVAILABLE)
-
-        return web.Response(status=HTTPStatus.OK)
-
-


Moved to servicer.py.

airkei · 2025-01-31T07:26:28Z

tests/test_log_proxy_server.py

    @pytest.fixture(autouse=True)
    def prepare_test_data(self):
        self._msgs = generate_random_msgs(msg_num=self.TOTAL_MSG_NUM)

-    async def test_server(self, client_sesion: aiohttp.ClientSession):
+    async def test_http_server(


test for HTTP

airkei · 2025-01-31T07:27:12Z

src/otaclient_iot_logging_server/log_proxy_server.py

+    loop.create_task(_http_server_launcher())
+    loop.create_task(_grpc_server_launcher())
+    loop.run_forever()


Create both the HTTP and gRPC servers on different ports and run them in parallel.

README.md

airkei · 2025-01-31T07:31:15Z

src/otaclient_iot_logging_server/servicer.py

+    async def put_log_http(self, request: Request) -> PutLogResponse:
+        """
+        put log message from HTTP POST request.
+        """
+        _ecu_id = request.match_info["ecu_id"]
+        _message = await request.text()
+
+        _code = await self.put_log(_ecu_id, _message)
+
+        if _code == ErrorCode.NO_MESSAGE or _code == ErrorCode.NOT_ALLOWED_ECU_ID:
+            _status = HTTPStatus.BAD_REQUEST
+        elif _code == ErrorCode.SERVER_QUEUE_FULL:
+            _status = HTTPStatus.SERVICE_UNAVAILABLE
+        else:
+            _status = HTTPStatus.OK
+
+        return web.Response(status=_status)


HTTP handler

src/otaclient_iot_logging_server/servicer.py

airkei · 2025-01-31T07:39:08Z

src/otaclient_iot_logging_server/servicer.py

+        )
+        # logger.debug(f"receive log from {ecu_id}: {_logging_msg}")
+        try:
+            self._queue.put_nowait((ecu_id, _logging_msg))


This PR doesn't change the data format in the queue (ecu_id, message). That will change in the next PR.

github-actions · 2025-02-10T04:17:21Z

Coverage Report

File	Stmts	Miss	Cover	Missing
src/otaclient_iot_logging_server
__init__.py	3	0	100%
__main__.py	19	1	94%	52
_common.py	15	0	100%
_log_setting.py	27	10	62%	63, 65–66, 68–69, 73–74, 77–78, 80
_sd_notify.py	33	8	75%	42, 52, 57–59, 65–67
_utils.py	54	2	96%	73, 137
_version.py	8	0	100%
aws_iot_logger.py	101	55	45%	65–67, 69–72, 75, 81–87, 90–92, 95–100, 104–107, 111–113, 116–118, 121–126, 139, 145–147, 149–153, 157, 197, 204–207
boto3_session.py	35	9	74%	50, 58–59, 61, 76–77, 81, 83, 91
config_file_monitor.py	44	6	86%	64–66, 83–85
configs.py	46	1	97%	75
ecu_info.py	37	1	97%	75
greengrass_config.py	97	5	94%	155, 266–269
log_proxy_server.py	48	29	39%	46–47, 49–51, 54–56, 59–60, 64, 67, 69–70, 73, 76, 80–82, 84–85, 89–90, 97–98, 100–101, 106, 112
servicer.py	53	5	90%	58, 90–92, 108
src/otaclient_iot_logging_server/v1
__init__.py	1	0	100%
_types.py	47	0	100%
api_stub.py	14	0	100%
TOTAL	682	132	80%

Tests	Skipped	Failures	Errors	Time
53	0 💤	0 ❌	0 🔥	16.966s ⏱️

airkei · 2025-02-10T04:34:25Z

tests/test_log_proxy_server.py

+    @pytest.mark.asyncio
+    @pytest.fixture
+    async def launch_grpc_server(self, mocker: MockerFixture, mock_ecu_info):
+        mocker.patch(f"{MODULE}.server_cfg", _test_server_cfg)
+
+        queue: LogsQueue = Queue()
+        self._queue = queue
+
+        servicer = OTAClientIoTLoggingServerServicer(
+            ecu_info=self._ecu_info,
+            queue=queue,
+        )
+
+        server = grpc.aio.server()
+        v1_grpc.add_OTAClientIoTLoggingServiceServicer_to_server(
+            servicer=OTAClientIoTLoggingServiceV1(servicer), server=server
+        )
+        server.add_insecure_port(self.SERVER_URL_GRPC)
+        try:
+            await server.start()
+            yield
+        finally:
+            await server.stop(None)
+            await server.wait_for_termination()
+


launch grpc test server.

airkei · 2025-02-10T04:34:53Z

tests/test_log_proxy_server.py

+    async def test_grpc_server_check(self, _service: str, launch_grpc_server):
+        _req = pb2.HealthCheckRequest(service=_service)
+        async with grpc.aio.insecure_channel(self.SERVER_URL_GRPC) as channel:
+            stub = v1_grpc.OTAClientIoTLoggingServiceStub(channel)
+            _response = await stub.Check(_req)
+            assert _response.status == pb2.HealthCheckResponse.SERVING


Test for the Check function.

airkei · 2025-02-10T04:37:19Z

src/otaclient_iot_logging_server/servicer.py

+    async def http_put_log(self, request: Request) -> PutLogResponse:
+        """
+        put log message from HTTP POST request.
+        """
+        _ecu_id = request.match_info["ecu_id"]
+        _message = await request.text()
+
+        _code = await self._put_log(ecu_id=_ecu_id, message=_message)
+
+        if _code == ErrorCode.NO_MESSAGE or _code == ErrorCode.NOT_ALLOWED_ECU_ID:
+            _status = HTTPStatus.BAD_REQUEST
+        elif _code == ErrorCode.SERVER_QUEUE_FULL:
+            _status = HTTPStatus.SERVICE_UNAVAILABLE
+        else:
+            _status = HTTPStatus.OK
+
+        return web.Response(status=_status)


the existing HTTP handler.

airkei · 2025-02-10T04:38:16Z

src/proto_wrapper/proto_wrapper.py

Copied proto_wrapper and its test files from ota-client.

airkei · 2025-02-10T04:39:40Z

tests/test_log_proxy_server.py

+    async def test_grpc_server_put_log(self, launch_grpc_server):
+        # ------ execution ------ #
+        logger.info(f"sending {self.TOTAL_MSG_NUM} msgs to {self.SERVER_URL_GRPC}...")
+
+        async def send_put_log_msg(item):
+            _req = pb2.PutLogRequest(
+                ecu_id=item.ecu_id,
+                log_type=item.log_type,
+                timestamp=item.timestamp,
+                level=item.level,
+                message=item.message,
+            )
+            async with grpc.aio.insecure_channel(self.SERVER_URL_GRPC) as channel:
+                stub = v1_grpc.OTAClientIoTLoggingServiceStub(channel)
+                _response = await stub.PutLog(_req)
+                assert _response.code == pb2.ErrorCode.NO_FAILURE
+
+        for item in self._msgs:
+            await send_put_log_msg(item)
+
+        # ------ check result ------ #
+        # ensure the all msgs are sent in order to the queue by the server.
+        logger.info("checking all the received messages...")
+        for item in self._msgs:
+            _ecu_id, _log_msg = self._queue.get_nowait()
+            assert _ecu_id == item.ecu_id
+            assert _log_msg["message"] == item.message
+        assert self._queue.empty()


test for PutLog.

airkei · 2025-02-10T04:39:53Z

tests/test_log_proxy_server.py

+    async def test_gprc_reject_invalid_ecu_id(
+        self,
+        launch_grpc_server,
+    ):
+        _req = pb2.PutLogRequest(
+            ecu_id="bad_ecu_id",
+            message="valid_msg",
+        )
+        async with grpc.aio.insecure_channel(self.SERVER_URL_GRPC) as channel:
+            stub = v1_grpc.OTAClientIoTLoggingServiceStub(channel)
+            _response = await stub.PutLog(_req)
+            assert _response.code == pb2.ErrorCode.NOT_ALLOWED_ECU_ID
+
+    async def test_grpc_reject_invalid_message(self, launch_grpc_server):
+        _req = pb2.PutLogRequest(
+            ecu_id="main",
+            message="",
+        )
+        async with grpc.aio.insecure_channel(self.SERVER_URL_GRPC) as channel:
+            stub = v1_grpc.OTAClientIoTLoggingServiceStub(channel)
+            _response = await stub.PutLog(_req)
+            assert _response.code == pb2.ErrorCode.NO_MESSAGE


test for PutLog(error cases).

airkei · 2025-02-10T04:43:16Z

src/otaclient_iot_logging_server/log_proxy_server.py

+    async def _http_server_launcher():
+        handler = OTAClientIoTLoggingServerServicer(ecu_info=ecu_info, queue=queue)
+        app = web.Application()
+        app.add_routes([web.post(r"/{ecu_id}", handler.http_put_log)])


Set the HTTP handler; http_put_log is called when the HTTP server receives a request.

airkei · 2025-02-10T04:44:15Z

src/otaclient_iot_logging_server/log_proxy_server.py

+        servicer = OTAClientIoTLoggingServerServicer(
+            ecu_info=ecu_info,
+            queue=queue,
+        )
+        otaclient_iot_logging_service_v1 = OTAClientIoTLoggingServiceV1(servicer)


Set the gRPC handler; grpc_put_log will be called via the stub.

src/otaclient_iot_logging_server/v1/api_stub.py

airkei · 2025-02-11T23:31:55Z

There are many changes, but the most important changed files are the following:

log_proxy_server.py
servicer.py

Zhenfeng-Sun

Looks good to me

sonarqubecloud · 2025-02-19T01:13:19Z

Please retry analysis of this Pull-Request directly on SonarQube Cloud

Bodong-Yang

Thank you for the PR! 👍

Nice and clean architecture, LGTM!
I only have some minor comments below, please take a look.

src/otaclient_iot_logging_server/log_proxy_server.py

Bodong-Yang · 2025-02-20T03:21:51Z

@airkei Also, about how otaclient detects otaclient-iot-logging-server grpc support.

I know that the Check grpc API has been added, and otaclient can use this API to check whether the otaclient-iot-logging-server is up or not.
But I am not sure this API can be used to let otaclient detect whether logging-server supports grpc or not.

Consider that the logging-server on the main ECU might start up more slowly than otaclient on a sub ECU. How can otaclient distinguish between the logging-server not being up yet and the logging-server not having gRPC support?

sonarqubecloud · 2025-02-21T00:02:40Z

Quality Gate passed

Issues
3 New issues
0 Accepted issues

Measures
0 Security Hotspots
24.6% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

airkei · 2025-02-21T00:31:08Z

@airkei Also, about how otaclient detects otaclient-iot-logging-server grpc support.
I know that the Check grpc API has been added, and otaclient can use this API to check whether the otaclient-iot-logging-server is up or not.
But I am not sure this API can be used to let otaclient detect whether logging-server supports grpc or not.
Consider that the logging-server on the main ECU might start up more slowly than otaclient on a sub ECU. How can otaclient distinguish between the logging-server not being up yet and the logging-server not having gRPC support?

Good question. I think there are two options.

Option 1: add a new synchronization scheme from sub-otaclient to main-logger via REST
1. When sub-otaclient wakes up, it sends an empty message to main-logger (e.g. curl -X POST http://127.0.0.1:8083/autoware) until the request succeeds.
2. sub-otaclient waits until main-logger wakes up or a timeout expires.
3. Once sub-otaclient detects that main-logger is up, it determines whether gRPC is supported by sending Check.

PROS: we will not lose any messages from the sub ECU.
CONS:
- we still need to depend on HTTP for gRPC logging.
- sub-otaclient [critical application] needs to wait for main-logger [non-critical application] to wake up.
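
The three steps of Option 1 could be sketched on the otaclient side roughly as below. This is only an illustration under assumptions: probe_http and check_grpc are hypothetical callables supplied by the caller (for example, wrappers around an HTTP POST and the gRPC Check call), and the function names, return strings, and default interval and timeout are made up for the example. What counts as "success" for the HTTP probe is left to the caller; note that the HTTP handler in this PR answers an empty body with BAD_REQUEST.

```python
import time
from typing import Callable


def wait_until_up(
    probe_http: Callable[[], bool],
    interval_s: float = 1.0,
    timeout_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Steps 1 and 2: probe main-logger over HTTP until it answers or we time out."""
    deadline = clock() + timeout_s
    while True:
        if probe_http():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


def choose_transport(
    probe_http: Callable[[], bool],
    check_grpc: Callable[[], bool],
    **wait_kwargs,
) -> str:
    """Step 3: once main-logger is up, a single Check decides between gRPC and HTTP."""
    if not wait_until_up(probe_http, **wait_kwargs):
        return "unavailable"
    # check_grpc is expected to return False (not raise) when gRPC is unsupported.
    return "grpc" if check_grpc() else "http"
```

Because the logger is already known to be up when Check is sent, a failed Check can be read as "no gRPC support" rather than "not started yet", which is the distinction the question above asks about.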

Option 2: check gRPC support by polling
1. sub-otaclient sends a gRPC Check message at a constant interval (e.g. 1 min) until it succeeds or times out (10 min). Until Check succeeds, it uses the existing HTTP interface for logging.

PROS: doesn't rely on HTTP for gRPC communication; sub-otaclient doesn't wait for main-logger.
CONS: the sub ECU keeps sending Check until main-logger wakes up or the timeout expires.
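
For comparison, Option 2 might look roughly like the following on the otaclient side. Again this is a sketch under assumptions rather than a proposed implementation: GrpcSupportPoller and its transport() method are hypothetical names, check_grpc is a caller-supplied callable wrapping the gRPC Check call, and the 60 s interval and 600 s timeout simply mirror the figures mentioned above.

```python
import time
from typing import Callable


class GrpcSupportPoller:
    """Use HTTP until a periodic Check succeeds; stop polling after a timeout."""

    def __init__(
        self,
        check_grpc: Callable[[], bool],
        interval_s: float = 60.0,
        timeout_s: float = 600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._check_grpc = check_grpc
        self._interval_s = interval_s
        self._clock = clock
        now = clock()
        self._deadline = now + timeout_s
        self._next_poll = now  # first Check happens on the first call
        self._grpc_ok = False

    def transport(self) -> str:
        """Return the transport to use for the next log push."""
        now = self._clock()
        if not self._grpc_ok and self._next_poll <= now <= self._deadline:
            self._next_poll = now + self._interval_s
            # check_grpc is expected to return False (not raise) on failure.
            self._grpc_ok = self._check_grpc()
        return "grpc" if self._grpc_ok else "http"
```

Calling transport() before each log push keeps otaclient from blocking on the logger, at the cost noted above: logs sent before the first successful Check go over HTTP and may be lost if the logger is not up yet.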

I prefer Option 2, as it doesn't break the existing sequence (dependency) between otaclient and the logger.
But currently, the initial log from the sub ECU might be lost. If we consider that an issue, we can adopt Option 1.

Either way, neither option depends on the logger implementation (this PR).

Bodong-Yang · 2025-02-21T01:06:04Z

But currently, the initial log from the sub ECU might be lost. If we consider that an issue, we can adopt Option 1.

The initial log needs to be uploaded (the initial logs contain important information such as the otaclient version and bootloader controller startup logs), so I think Option 1 is preferred. 🙏

we still need to depend on HTTP for gRPC logging.

Yes... But I don't expect we can get rid of it in the near future, considering that even otaclient v3.5.1 has many deployments in production. It will be a long process 😢

(By the way, due to the mismatch in startup time between the main ECU's iot-logger and the sub ECU's otaclient, in most cases we currently lose the initial logs from the sub ECU 😢, as otaclient doesn't implement a retry mechanism for log pushing. Let's resolve this later.)

Bodong-Yang

LGTM, thank you!

airkei · 2025-02-21T01:10:46Z

The initial log needs to be uploaded (the initial logs contain important information such as the otaclient version and bootloader controller startup logs), so I think Option 1 is preferred. 🙏

OK, in that case, let's proceed based on Option 1 :)

airkei changed the title ~~feat!: migrate the interface from REST to gRPC~~ DNM: feat!: migrate the interface from REST to gRPC Jan 8, 2025

airkei mentioned this pull request Jan 9, 2025

feat: support metrics log stream #28

Merged

airkei force-pushed the feat!/migrate_from_rest_to_grpc branch from dffca6c to fe4edae Compare January 16, 2025 02:31

airkei changed the title ~~DNM: feat!: migrate the interface from REST to gRPC~~ DNM: feat: migrate the interface from REST to gRPC Jan 31, 2025

airkei changed the title ~~DNM: feat: migrate the interface from REST to gRPC~~ DNM: feat: support gRPC endpoint. Jan 31, 2025

airkei and others added 4 commits February 10, 2025 10:06

feat!: migrate the interface from REST to gRPC

9582770

remove level and timestamp for PutLog

394412e

[pre-commit.ci] auto fixes from pre-commit.com hooks

22f239e

for more information, see https://pre-commit.ci

add HTTP endpoint

518b04f

airkei force-pushed the feat!/migrate_from_rest_to_grpc branch from 5ab0e2c to 518b04f Compare February 10, 2025 01:06

support health check method

4d18a46

airkei changed the title ~~DNM: feat: support gRPC endpoint.~~ feat: support gRPC endpoint. Feb 10, 2025

airkei added 2 commits February 10, 2025 12:03

fix example environment

f4f47ff

wait for the server termination in test

870c7b4

airkei force-pushed the feat!/migrate_from_rest_to_grpc branch 3 times, most recently from 0a4cd5c to 22de49b Compare February 10, 2025 04:26

add workarround for Github Actions CI

adb11d1

airkei force-pushed the feat!/migrate_from_rest_to_grpc branch from 22de49b to adb11d1 Compare February 10, 2025 04:31

airkei changed the title ~~feat: support gRPC endpoint.~~ feat: support gRPC endpoint Feb 10, 2025

airkei changed the title ~~feat: support gRPC endpoint~~ feat: support gRPC logging endpoint Feb 10, 2025

airkei marked this pull request as ready for review February 10, 2025 04:45

airkei requested a review from a team as a code owner February 10, 2025 04:45

airkei added the feature label Feb 10, 2025

Zhenfeng-Sun previously approved these changes Feb 14, 2025

fix bugs

cdcca0c

airkei dismissed Zhenfeng-Sun’s stale review via cdcca0c February 19, 2025 01:11

[pre-commit.ci] auto fixes from pre-commit.com hooks

09203a9

for more information, see https://pre-commit.ci

airkei requested a review from Zhenfeng-Sun February 19, 2025 01:20

airkei self-assigned this Feb 19, 2025

Zhenfeng-Sun requested a review from Bodong-Yang February 19, 2025 05:08

Bodong-Yang reviewed Feb 20, 2025

fix: aggregate handler

08d8b9e

airkei requested a review from Bodong-Yang February 21, 2025 00:31

Bodong-Yang approved these changes Feb 21, 2025

airkei merged commit d0318df into main Feb 21, 2025
10 checks passed
