From 2f3356ba6200db3212257c1d48fbf1694333a134 Mon Sep 17 00:00:00 2001 From: Mengjin Yan Date: Wed, 12 Feb 2025 16:09:32 -0800 Subject: [PATCH] [Core] Enable JSON Structured Logging Format in Ray (#50471) ## Why are these changes needed? 1. Added the `JSON` logging format option in Ray 2. Update the API doc as well as the ray logging doc according to the change: - Rearranged the sections in existing ray logging document - Copied the content of the structured logging document from Anyscale document to ray logging document ## Related issue number ## Checks - [ ] I've signed off every commit(by using the -s flag, i.e., `git commit -s`) in this PR. - [ ] I've run `scripts/format.sh` to lint the changes in this PR. - [ ] I've included any doc changes needed for https://docs.ray.io/en/master/. - [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file. - [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/ - Testing Strategy - [ ] Unit tests - [ ] Release tests - [ ] This PR is not tested :( --------- Signed-off-by: Mengjin Yan Co-authored-by: Dhyey Shah Co-authored-by: Edward Oakes --- .../user-guides/configure-logging.md | 284 +++++++++++++----- .../_private/ray_logging/logging_config.py | 33 +- 2 files changed, 230 insertions(+), 87 deletions(-) diff --git a/doc/source/ray-observability/user-guides/configure-logging.md b/doc/source/ray-observability/user-guides/configure-logging.md index 358691ed5584..c9b0808c0636 100644 --- a/doc/source/ray-observability/user-guides/configure-logging.md +++ b/doc/source/ray-observability/user-guides/configure-logging.md @@ -221,7 +221,7 @@ All Ray loggers are automatically configured in ``ray._private.ray_logging``. To import logging logger = logging.getLogger("ray") -logger # Modify the Ray logging config +logger.setLevel(logging.WARNING) # Modify the Ray logging config ``` Similarly, to modify the logging configuration for Ray libraries, specify the appropriate logger name: @@ -242,12 +242,9 @@ ray_data_logger.setLevel(logging.WARNING) # Here's how to add an aditional file handler for Ray Tune: ray_tune_logger.addHandler(logging.FileHandler("extra_ray_tune_log.log")) ``` -(structured-logging)= -## Structured logging -Implement structured logging to enable downstream users and applications to consume the logs efficiently. -### Application logs -A Ray app includes both driver and worker processes. For Python apps, use Python loggers to format and structure your logs. +### Using Ray logger for application logs +A Ray app includes both driver and worker processes. For Python apps, use Python loggers to format your logs. As a result, you need to set up Python loggers for both driver and worker processes. ::::{tab-set} @@ -265,7 +262,7 @@ Set up the Python logger for driver and worker processes separately: ![Set up python loggers](../images/setup-logger-application.png) -If you want to control the logger for particular actors or tasks, view [customizing logger for individual worker process](#customizing-worker-process-loggers) +If you want to control the logger for particular actors or tasks, view the following [customizing logger for individual worker process](#customizing-worker-process-loggers). ::: @@ -275,38 +272,202 @@ If you are using any of the Ray libraries, follow the instructions provided in t :::: -### System logs -Most of Ray’s system or component logs are structured by default.
+### Customizing worker process loggers + +Ray executes Tasks and Actors remotely in Ray's worker processes. To provide your own logging configuration for the worker processes, customize the worker loggers with the instructions below: +::::{tab-set} + +:::{tab-item} Ray Core: individual worker process +Customize the logger configuration when you define the Tasks or Actors. +```python +import ray +import logging +# Initiate a driver. +ray.init() + +@ray.remote +class Actor: + def __init__(self): + # Basic config automatically configures logs to + # stream to stdout and stderr. + # Set the severity to INFO so that info logs are printed to stdout. + logging.basicConfig(level=logging.INFO) + + def log(self, msg): + logger = logging.getLogger(__name__) + logger.info(msg) + +actor = Actor.remote() +ray.get(actor.log.remote("A log message for an actor.")) + +@ray.remote +def f(msg): + logging.basicConfig(level=logging.INFO) + logger = logging.getLogger(__name__) + logger.info(msg) + +ray.get(f.remote("A log message for a task.")) +``` -Logging format for Python logs
```bash -%(asctime)s\t%(levelname)s %(filename)s:%(lineno)s -- %(message)s +(Actor pid=179641) INFO:__main__:A log message for an actor. +(f pid=177572) INFO:__main__:A log message for a task. ``` +::: -Example:
+:::{tab-item} Ray Core: all worker processes of a job + +```{admonition} Caution +:class: caution +This is an experimental feature. The semantic of the API is subject to change. +It doesn't support [Ray Client](ray-client-ref) yet. ``` -2023-06-01 09:15:34,601 INFO job_manager.py:408 -- Submitting job with RAY_ADDRESS = 10.0.24.73:6379 + +Use `worker_process_setup_hook` to apply the new logging configuration to all worker processes within a job. + +```python +# driver.py +def logging_setup_func(): + logger = logging.getLogger("ray") + logger.setLevel(logging.DEBUG) + warnings.simplefilter("always") + +ray.init(runtime_env={"worker_process_setup_hook": logging_setup_func}) + +logging_setup_func() ``` +::: +:::{tab-item} Ray libraries +If you use any of the Ray libraries, follow the instructions provided in the documentation for the library. +::: -Logging format for CPP logs
+:::: -```bash -[year-month-day, time, pid, thread_id] (component) [file]:[line] [message] +(structured-logging)= +## Structured logging +Implement structured logging to enable downstream users and applications to consume the logs efficiently. + +### Application logs + +Ray enables users to configure the Python logging library to output logs in a structured format. This setup standardizes log entries, making them easier to handle. + +#### Configure structured logging for Ray Core + +```{admonition} Ray libraries +If you are using any of the Ray libraries, follow the instructions provided in the documentation for the library. ``` -Example:
+The following methods are ways to configure Ray Core's structure logging format: -```bash -[2023-06-01 08:47:47,457 I 31009 225171] (gcs_server) gcs_node_manager.cc:42: Registering node info, node id = 8cc65840f0a332f4f2d59c9814416db9c36f04ac1a29ac816ad8ca1e, address = 127.0.0.1, node name = 127.0.0.1 +##### Method 1: Configure structured logging with `ray.init` +```python +ray.init( + log_to_driver=False, + logging_config=ray.LoggingConfig(encoding="JSON", log_level="INFO") +) ``` -:::{note} -Some system component logs are not structured as suggested above as of 2.5. The migration of system logs to structured logs is ongoing. -::: +You can configure the following parameters: + +* `encoding`: The encoding format for the logs. The default is `TEXT` for plain text logs. +The other option is `JSON` for structured logs. +In both `TEXT` and `JSON` encoding formats, the logs include Ray-specific fields such as `job_id`, `worker_id`, `node_id`, `actor_id`, and `task_id`, if available. + +* `log_level`: The log level for the driver process. The default is `INFO`. +Available log levels are defined in the [Python logging library](https://docs.python.org/3/library/logging.html#logging-levels). + +When you set up `logging_config` in `ray.init`, it configures the root loggers for the driver process, Ray actors, and Ray tasks. + +```{admonition} note +The `log_to_driver` parameter is set to `False` to disable logging to the driver +process as the redirected logs to the driver will include prefixes that made the logs +not JSON parsable. +``` + +##### Method 2: Configure structured logging with an environment variable + +You can set the `RAY_LOGGING_CONFIG_ENCODING` environment variable to `TEXT` or `JSON` to set the encoding format for the logs. +Note that the environment variable needs to be set before `import ray`. + +```python +import os +os.environ["RAY_LOGGING_CONFIG_ENCODING"] = "JSON" + +import ray +import logging -### Add metadata to structured logs -If you need additional metadata to make logs more structured, fetch the metadata of Jobs, Tasks or Actors with Ray’s {py:obj}`ray.runtime_context.get_runtime_context` API. +ray.init(log_to_driver=False) +# Use the root logger to print log messages. +``` +#### Example + +The following example configures the `LoggingConfig` to output logs in a structured JSON format and sets the log level to `INFO`. +It then logs messages with the root loggers in the driver process, Ray tasks, and Ray actors. +The logs include Ray-specific fields such as `job_id`, `worker_id`, `node_id`, `actor_id`, and `task_id`. + +```python +import ray +import logging + +ray.init( + log_to_driver=False, + logging_config=ray.LoggingConfig(encoding="JSON", log_level="INFO", additional_log_standard_attrs=['name']) +) + +def init_logger(): + """Get the root logger""" + return logging.getLogger() + +logger = logging.getLogger() +logger.info("Driver process") + +@ray.remote +def f(): + logger = init_logger() + logger.info("A Ray task") + +@ray.remote +class actor: + def print_message(self): + logger = init_logger() + logger.info("A Ray actor") + +task_obj_ref = f.remote() +ray.get(task_obj_ref) + +actor_instance = actor.remote() +ray.get(actor_instance.print_message.remote()) + +""" +{"asctime": "2024-07-15 19:06:06,469", "levelname": "INFO", "message": "Driver process", "filename": "test.py", "lineno": 12, "job_id": "03000000", "worker_id": "03000000ffffffffffffffffffffffffffffffffffffffffffffffff", "node_id": "824f9d7c6a82a0faf42b91f07b42667df0831034a713f04f28ba84b9"} +(f pid=4871) {"asctime": "2024-07-15 19:06:07,435", "levelname": "INFO", "message": "A Ray task", "filename": "test.py", "lineno": 17, "job_id": "03000000", "worker_id": "f8f84d811683e5d9e03744a4386b26a5cd6f6ca09fc5cdc8e1dbe5a3", "node_id": "824f9d7c6a82a0faf42b91f07b42667df0831034a713f04f28ba84b9", "task_id": "fa31b89f94899135ffffffffffffffffffffffff03000000"} +(actor pid=4939) {"asctime": "2024-07-15 19:06:08,700", "levelname": "INFO", "message": "A Ray actor", "filename": "test.py", "lineno": 23, "job_id": "03000000", "worker_id": "51d62f87e3867cdcad9aecd7b431068ea433b3104c8cc4ed1db6eef7", "node_id": "824f9d7c6a82a0faf42b91f07b42667df0831034a713f04f28ba84b9", "actor_id": "4a03b12afe5598a00eadcf9503000000", "task_id": "0ab01f2d6283d7194a03b12afe5598a00eadcf9503000000"} +""" +``` + +#### Add metadata to structured logs + +Add extra fields to the log entries by using the `extra` parameter in the `logger.info` method. + +```python +import ray +import logging + +ray.init( + log_to_driver=False, + logging_config=ray.LoggingConfig(encoding="JSON", log_level="INFO") +) + +logger = logging.getLogger() +logger.info("Driver process with extra fields", extra={"username": "anyscale"}) + +# The log entry includes the extra field "username" with the value "anyscale". + +# {"asctime": "2024-07-17 21:57:50,891", "levelname": "INFO", "message": "Driver process with extra fields", "filename": "test.py", "lineno": 9, "username": "anyscale", "job_id": "04000000", "worker_id": "04000000ffffffffffffffffffffffffffffffffffffffffffffffff", "node_id": "76cdbaa32b3938587dcfa278201b8cef2d20377c80ec2e92430737ae"} +``` + +If needed, you can fetch the metadata of Jobs, Tasks, or Actors with Ray’s {py:obj}`ray.runtime_context.get_runtime_context` API. ::::{tab-set} :::{tab-item} Ray Job @@ -382,78 +543,35 @@ If you need node IP, use {py:obj}`ray.nodes` API to fetch all nodes and map the :::: -## Customizing worker process loggers - -When using Ray, Tasks and Actors are executed remotely in Ray's worker processes. To provide your own logging configuration for the worker processes, customize the worker loggers with the instructions below: -::::{tab-set} - -:::{tab-item} Ray Core: individual worker process -Customize the logger configuration when you define the Tasks or Actors. -```python -import ray -import logging -# Initiate a driver. -ray.init() - -@ray.remote -class Actor: - def __init__(self): - # Basic config automatically configures logs to - # stream to stdout and stderr. - # Set the severity to INFO so that info logs are printed to stdout. - logging.basicConfig(level=logging.INFO) - - def log(self, msg): - logger = logging.getLogger(__name__) - logger.info(msg) - -actor = Actor.remote() -ray.get(actor.log.remote("A log message for an actor.")) - -@ray.remote -def f(msg): - logging.basicConfig(level=logging.INFO) - logger = logging.getLogger(__name__) - logger.info(msg) - -ray.get(f.remote("A log message for a task.")) -``` +### System logs +Ray structures most system or component logs by default.
+Logging format for Python logs
```bash -(Actor pid=179641) INFO:__main__:A log message for an actor. -(f pid=177572) INFO:__main__:A log message for a task. +%(asctime)s\t%(levelname)s %(filename)s:%(lineno)s -- %(message)s ``` -::: -:::{tab-item} Ray Core: all worker processes of a job - -```{admonition} Caution -:class: caution -This is an experimental feature. The semantic of the API is subject to change. -It doesn't support [Ray Client](ray-client-ref) yet. +Example:
+``` +2023-06-01 09:15:34,601 INFO job_manager.py:408 -- Submitting job with RAY_ADDRESS = 10.0.24.73:6379 ``` -Use `worker_process_setup_hook` to apply the new logging configuration to all worker processes within a job. +Logging format for c++ logs
-```python -# driver.py -def logging_setup_func(): - logger = logging.getLogger("ray") - logger.setLevel(logging.DEBUG) - warnings.simplefilter("always") +```bash +[year-month-day, time, pid, thread_id] (component) [file]:[line] [message] +``` -ray.init(runtime_env={"worker_process_setup_hook": logging_setup_func}) +Example:
-logging_setup_func() +```bash +[2023-06-01 08:47:47,457 I 31009 225171] (gcs_server) gcs_node_manager.cc:42: Registering node info, node id = 8cc65840f0a332f4f2d59c9814416db9c36f04ac1a29ac816ad8ca1e, address = 127.0.0.1, node name = 127.0.0.1 ``` -::: -:::{tab-item} Ray libraries -If you are using any of the Ray libraries, follow the instructions provided in the documentation for the library. +:::{note} +Some system component logs aren't structured as suggested preceding as of 2.5. The migration of system logs to structured logs is ongoing. ::: -:::: - (log-rotation)= ## Log rotation diff --git a/python/ray/_private/ray_logging/logging_config.py b/python/ray/_private/ray_logging/logging_config.py index babdc94e18ac..cf8865d7070d 100644 --- a/python/ray/_private/ray_logging/logging_config.py +++ b/python/ray/_private/ray_logging/logging_config.py @@ -3,7 +3,7 @@ from ray._private.ray_logging import default_impl from ray._private.ray_logging.constants import LOGRECORD_STANDARD_ATTRS -from ray._private.ray_logging.formatters import TextFormatter +from ray._private.ray_logging.formatters import TextFormatter, JSONFormatter from ray._private.ray_logging.filters import CoreContextFilter from ray.util.annotations import PublicAPI @@ -26,6 +26,7 @@ class DefaultLoggingConfigurator(LoggingConfigurator): def __init__(self): self._encoding_to_formatter = { "TEXT": TextFormatter(), + "JSON": JSONFormatter(), } def get_supported_encodings(self) -> Set[str]: @@ -98,12 +99,12 @@ def _apply(self): self._configure_logging() -LoggingConfig.__doc__ = f""" +LoggingConfig.__doc__ = """ Logging configuration for a Ray job. These configurations are used to set up the root logger of the driver process and all Ray tasks and actor processes that belong to the job. - Examples: + Examples: 1. Configure the logging to use TEXT encoding. .. testcode:: import ray @@ -119,11 +120,35 @@ def f(): logger.info("This is a Ray task") ray.get(f.remote()) + ray.shutdown() .. testoutput:: :options: +MOCK - 2024-06-03 07:53:50,815 INFO test.py:11 -- This is a Ray task name=__main__ job_id=01000000 worker_id=0dbbbd0f17d5343bbeee8228fa5ff675fe442445a1bc06ec899120a8 node_id=577706f1040ea8ebd76f7cf5a32338d79fe442e01455b9e7110cddfc task_id=c8ef45ccd0112571ffffffffffffffffffffffff01000000 + 2025-02-12 12:25:16,836 INFO test-log-config.py:11 -- This is a Ray task name=__main__ job_id=01000000 worker_id=51188d9448be4664bf2ea26ac410b67acaaa970c4f31c5ad3ae776a5 node_id=f683dfbffe2c69984859bc19c26b77eaf3866c458884c49d115fdcd4 task_id=c8ef45ccd0112571ffffffffffffffffffffffff01000000 task_name=f task_func_name=test-log-config.f timestamp_ns=1739391916836884000 + + 2. Configure the logging to use JSON encoding. + .. testcode:: + + import ray + import logging + + ray.init( + logging_config=ray.LoggingConfig(encoding="JSON", log_level="INFO", additional_log_standard_attrs=['name']) + ) + + @ray.remote + def f(): + logger = logging.getLogger(__name__) + logger.info("This is a Ray task") + + ray.get(f.remote()) + ray.shutdown() + + .. testoutput:: + :options: +MOCK + + {"asctime": "2025-02-12 12:25:48,766", "levelname": "INFO", "message": "This is a Ray task", "filename": "test-log-config.py", "lineno": 11, "name": "__main__", "job_id": "01000000", "worker_id": "6d307578014873fcdada0fa22ea6d49e0fb1f78960e69d61dfe41f5a", "node_id": "69e3a5e68bdc7eb8ac9abb3155326ee3cc9fc63ea1be04d11c0d93c7", "task_id": "c8ef45ccd0112571ffffffffffffffffffffffff01000000", "task_name": "f", "task_func_name": "test-log-config.f", "timestamp_ns": 1739391948766949000} Args: encoding: Encoding type for the logs. The valid values are