[RFC] Logging configuration and message formats #30005
Comments
This sgtm!
Also I like this idea. Maybe Ray can provide a public API to standardize some logging output in the long term. Btw, should we also enable |
💯 Agree, when a user can't configure logging it can be very frustrating. |
Actually, I had some additional research here.
After some research, this may not be the best idea. The reason is that the logging configuration would happen at import time if we add it to __init__.py, meaning users would lose control over logs from Ray (basically, all users would be forced to use the default configuration of the Ray loggers). It seems like this should happen after the users' root logging configuration or after they specify the handler. @peytondmurray do you know how other libraries configure their loggers? |
Also cc @scottsun94 |
"Do you know how other libraries configure logger?"
@rkooo567 This is a much tougher question than I originally thought, because at least in the libraries I took a look at there isn't much inspiration to be found. I looked at a handful of the major scientific Python libraries for insight, and here is what I found:
import functools
import logging

_log = logging.getLogger(__name__)


# The decorator ensures this always returns the same handler (and it is only
# attached once).
@functools.lru_cache()
def _ensure_handler():
    """
    The first time this function is called, attach a `StreamHandler` using the
    same format as `logging.basicConfig` to the Matplotlib root logger.

    Return this handler every time this function is called.
    """
    handler = logging.StreamHandler()
    handler.setFormatter(logging.Formatter(logging.BASIC_FORMAT))
    _log.addHandler(handler)
    return handler

Any of the other matplotlib modules then just do
import logging
_log = logging.getLogger(__name__)
when they need to log something. Since child loggers by default propagate messages up to the handlers associated with their ancestor loggers, matplotlib doesn't need to set configuration on every module that needs to log a message. This leaves the choice about how logs are handled entirely up to the user, which is in accordance with the python docs here.
Of all the libraries I looked at, I like
import pymc
import logging
logging.basicConfig()
# then do something that creates a log entry
In this case,
Moving forward
From the Logging HOWTO:
IMO there are really two different situations we need to think about when tackling the problem of logging in Ray:
Here's one idea:
import ray
import numpy as np
import ...
import logging
...
logging.basicConfig(format="a custom logging format: %(asctime)s: %(message)s")
...
ray.init()
Then the logging configuration the user sets would take precedence. On the other hand, if the user doesn't do any configuration, they still get well-formatted custom logs provided by Ray automatically.
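To make the "conditional configuration" idea concrete, here is a minimal sketch using only the standard library. It is not Ray's implementation: the function name `configure_library_logging`, the logger name `mylib` (standing in for `ray`), and `DEFAULT_FORMAT` are all made up for illustration. It assumes that "the user has configured logging" can be approximated by "the root logger has at least one handler".

```python
import logging

# Hypothetical default format; not the format Ray actually uses.
DEFAULT_FORMAT = "%(asctime)s %(levelname)s %(name)s: %(message)s"


def configure_library_logging(name: str = "mylib") -> bool:
    """Install a default handler on the library logger, unless the user has
    already configured the root logger.

    Returns True if the library default is in effect, False if the user's
    own configuration was detected and left untouched.
    """
    root = logging.getLogger()
    if root.handlers:
        # The user already set up logging (e.g. via logging.basicConfig()),
        # so records simply propagate to the user's handlers.
        return False

    logger = logging.getLogger(name)
    if not logger.handlers:
        # Attach the default handler only once, even on repeated calls.
        handler = logging.StreamHandler()
        handler.setFormatter(logging.Formatter(DEFAULT_FORMAT))
        logger.addHandler(handler)
    return True
```

In this sketch the check would run from an entry point such as an init function, after the user's own `logging.basicConfig(...)` call has had a chance to run, rather than at import time.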
I made a simple example project which uses this "conditional configuration" approach here; if |
Thanks @peytondmurray for a survey of various libraries. I like the proposal in the sense that it offers good default as well as configurability. @rkooo567 would you be ok if we move forward like this? |
I will take a look at it by today |
The proposal sounds pretty good to me! Also thanks for the research on other library logging :). Can you defer doing 2? I think we can configure the logger upon ray.init, but we don't need to check if the root logger is set up now, for 2 reasons (sorry I am going back and forth here lol).
|
what's the status for this btw? |
Hi, I'm a bot from the Ray team :) To help human contributors to focus on more relevant issues, I will automatically add the stale label to issues that have had no activity for more than 4 months. If there is no further activity in the 14 days, the issue will be closed!
You can always ask for help on our discussion forum or Ray's public slack channel. |
Hi again! The issue will be closed because there has been no more activity in the 14 days since the last message. Please feel free to reopen or open a new issue if you'd still like it to be addressed. Again, you can always ask for help on our discussion forum or Ray's public slack channel. Thanks again for opening the issue! |
Description
In the interest of improving both the user and development experiences with Ray, and after having a discussion with @xwjiang2010, I'm opening this issue to solicit community input about Ray's logging capabilities.
Background
Logging configuration
Logging in Ray is currently set up in multiple places. Some modules set custom logging configurations with calls to basicConfig:
This makes sense to do inside separate processes because each process has its own root logger, although there are instances where basicConfig is called inside the main process.
From the python docs, basicConfig does nothing if the root logger already has handlers configured, unless the keyword argument force is set to True.
In principle I think this means that only the first call to basicConfig modifies the root logger, and all other calls have no effect. Additionally, modifying the root logger may also not be intended because Ray's main logger is set up with the name ray here.
Currently the code to set up the main logger, ray, also appears in a number of different places in order to capture all the different entry points for ray execution, e.g. calling ray.init(), using the CLI to run ray start, etc:
Log output
Currently for a number of different Ray use cases, a lot of output can be generated, which sometimes makes it hard to identify important information.
In this example, there are logging statements from multiple Ray libraries but it's hard to tell at a glance which ones they are. In part this is due to the training results being interspersed throughout, but also because the logging formatter is different for the different Ray libraries, meaning that logging messages from different parts of the code may have different message formats. As part of this effort, I'm hoping we can improve the clarity of these logging statements so that they are more visible when they are shown to the user.
Log Rotation
Another commonly requested feature is to allow for log rotation, particularly with respect to progress statements; see #28268.
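For reference, size-based rotation is already available in the standard library through `logging.handlers.RotatingFileHandler`. The sketch below only shows what attaching such a handler to a single logger could look like. The helper `add_rotating_file_handler`, the logger name `rotation_demo`, and the tiny size limit are assumptions for the demo, not anything Ray does today.

```python
import logging
import logging.handlers
import os
import tempfile


def add_rotating_file_handler(logger, path, max_bytes=1_000_000, backup_count=5):
    """Attach a size-based rotating file handler to `logger` and return it."""
    handler = logging.handlers.RotatingFileHandler(
        path, maxBytes=max_bytes, backupCount=backup_count
    )
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    logger.addHandler(handler)
    return handler


# Demo: a deliberately tiny size limit so that rotation happens quickly.
with tempfile.TemporaryDirectory() as tmp:
    demo_logger = logging.getLogger("rotation_demo")  # hypothetical name
    demo_logger.setLevel(logging.INFO)
    demo_handler = add_rotating_file_handler(
        demo_logger, os.path.join(tmp, "progress.log"), max_bytes=200, backup_count=2
    )
    for i in range(50):
        demo_logger.info("progress update %d", i)
    # Release the file before the temporary directory is cleaned up.
    demo_logger.removeHandler(demo_handler)
    demo_handler.close()
    created = sorted(os.listdir(tmp))
```

With `backup_count=2`, only the current file and the two most recent backups are kept; older output is discarded.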
Proposal
One option is to create a logger for each Ray library (tune, train, air, etc), and convert existing logging calls to use the logger from the appropriate library. Logging configuration would be done in the __init__.py for each library, helping to make configuration more uniform and transparent across different modules, and making it easier to address the issue of log rotation.
To improve log message clarity, we can append the name of the Ray library to the logging statement to make it clear what part of the code the log message is being emitted from, and use similar message formats across multiple modules. For example:
The time, Ray library issuing the log statement, logging level, and location in the source where the message is emitted are clearly stated, and different Ray libraries have similar message formats.
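As a rough illustration of per-library loggers sharing one message format, the sketch below configures a single parent logger and lets child loggers propagate to it. The parent name `mylib` stands in for `ray`, and both `FORMAT` and `setup_parent_logger` are hypothetical. The format string is just one way to include the time, logger name, level, and source location, not the format proposed in this RFC.

```python
import io
import logging

# Hypothetical shared format: time, logger name, level, source location, message.
FORMAT = "%(asctime)s [%(name)s] %(levelname)s %(filename)s:%(lineno)d -- %(message)s"


def setup_parent_logger(name="mylib", stream=None):
    """Configure one handler on the parent logger; children propagate to it."""
    parent = logging.getLogger(name)
    handler = logging.StreamHandler(stream)  # stream=None means sys.stderr
    handler.setFormatter(logging.Formatter(FORMAT))
    parent.addHandler(handler)
    parent.setLevel(logging.INFO)
    return parent


# Capture the output in memory so the result can be inspected.
buffer = io.StringIO()
setup_parent_logger("mylib", buffer)

# Each "library" only asks for its own child logger; no per-module setup needed.
logging.getLogger("mylib.tune").info("trial started")
logging.getLogger("mylib.train").warning("checkpoint is stale")
```

Because the child loggers have no handlers or levels of their own, both records are formatted by the single handler on `mylib`, so the messages differ only in the logger name, level, and location fields.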
Note: This RFC does not include any changes to remote actor logs, only driver side logging structure. Remote actor logs are for the most part handled on the user side, and so are outside the scope of this proposal.
I'd also be interested in input from the Ray community, so leave a comment if you have any feedback about anything discussed here. @xwjiang2010 Let me know if I missed anything here!