[SEDONA-228] Standardize logging modules #743
Conversation
…ency so modules could start using it
@Kimahriman For the common module, I think we should just use the regular Java exception mechanism and let … Please correct me if I am wrong :-)
Nothing is added as a compile dependency; most of this is to standardize which framework gets used during the tests. For example, several of the test base classes have log4j 1.x code to reduce the logging level for various packages. I don't think these do anything for current Flink or Spark versions, because those both use log4j 2.x as the logging implementation, but they include the log4j-1.2-api compatibility layer I mentioned in the description.

The real thing that needs to be done is to migrate all logging to SLF4J: currently, if Spark or Flink dropped the log4j-1.2-api dependency, I believe Sedona would no longer work. This is kind of a step towards that, with two main parts:
Also part of this: currently the Flink module doesn't support any form of logging during the tests, because the Flink dependencies don't provide any of the logging modules the way Spark does for the dependencies that are currently included. With this going forward, we could easily enable logging in the common and flink modules by adding the slf4j-api as a provided dependency. That would let the code compile, and then, with the two test dependencies mentioned above, the common module tests could properly output logs; at runtime the slf4j-api would be provided by either Flink or Spark directly. This matters especially if/as more of the non-Spark-related core and sql module code gets ported to the common module.

Alternative options that change fewer things but still work toward that goal:
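The dependency arrangement being discussed could look roughly like this in a module's `pom.xml`. This is a sketch, not the PR's actual diff: the version properties (`${slf4j.version}`, `${log4j.version}`) are assumed placeholder names, while the group/artifact IDs are the standard SLF4J and log4j 2.x coordinates.

```xml
<!-- Sketch: slf4j-api as a provided dependency, so code compiles against it
     but the jar is supplied by Flink or Spark at runtime -->
<dependency>
  <groupId>org.slf4j</groupId>
  <artifactId>slf4j-api</artifactId>
  <version>${slf4j.version}</version>
  <scope>provided</scope>
</dependency>

<!-- Sketch: log4j 2.x only on the test classpath, so tests can actually
     emit and capture log output -->
<dependency>
  <groupId>org.apache.logging.log4j</groupId>
  <artifactId>log4j-core</artifactId>
  <version>${log4j.version}</version>
  <scope>test</scope>
</dependency>
<dependency>
  <groupId>org.apache.logging.log4j</groupId>
  <artifactId>log4j-slf4j-impl</artifactId>
  <version>${log4j.version}</version>
  <scope>test</scope>
</dependency>
```

With `provided` scope, nothing new leaks into the compile-scope dependency tree of downstream users, which matches the "nothing is added as a compile dependency" point above.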
@Kimahriman Thank you for your great explanation!
@Kimahriman One thing I just realized: we should probably just print logging to the console (stdout and stderr) directly. Otherwise users won't be able to see the logging unless they configure log4j explicitly. It would be better if they could see those WARN/ERROR messages directly. What do you think?
Do you mean for tests or for real use? For tests, the logs should all go to the test log file. For real use, the logs should keep working exactly as they do now, automatically using the logging frameworks provided by Flink and Spark. If someone were to try to use the common module directly without Flink or Spark, then they would need to provide their own logging config.
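A minimal `log4j2.properties` test resource along the lines described might look like the following. This is a sketch modeled on the Spark repo's test config mentioned later in the PR description; the appender name, layout pattern, and the choice of quieted package are assumptions.

```properties
# Sketch: route all test logging to target/unit-tests.log instead of the console
rootLogger.level = info
rootLogger.appenderRef.file.ref = File

appender.file.type = File
appender.file.name = File
appender.file.fileName = target/unit-tests.log
appender.file.layout.type = PatternLayout
appender.file.layout.pattern = %d{yy/MM/dd HH:mm:ss.SSS} %p %c{1}: %m%n

# Quiet a particularly chatty package during tests (illustrative choice)
logger.spark.name = org.apache.spark
logger.spark.level = warn
```

Because this file sits only under the test resources, it affects test runs but never ships with the published jars, so real use still defers entirely to Flink's or Spark's own logging configuration.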
Great explanation! Thank you!
Did you read the Contributor Guide?
Is this PR related to a JIRA ticket?
[SEDONA-XXX] my subject
What changes were proposed in this PR?
Standardize on using log4j 2.x for logging. Currently the logging modules are pulled in from Spark, using log4j 1.x or 2.x depending on the Spark version, and the common and flink modules have no logging available to them. Both Flink and Spark use log4j 2.x in their latest versions.
Main points:
- Add `log4j-1.2-api` as a provided dependency for all modules. Currently all the logging uses the log4j 1.x API, so this keeps that code compatible while still running log4j 2.x (which is what actually happens at runtime with new Flink and Spark). Long term, the logging should probably be switched to SLF4J so the underlying implementation doesn't matter as much (this is also what Flink and Spark actually use, I think).
- Add `log4j-core` and `log4j-slf4j-impl` as test dependencies so that log4j 2.x gets used for logging in all tests.
- Add a `log4j2.properties` file as a test resource to all projects so logs go to `target/unit-tests.log`, to reduce the verbosity of tests when running locally and to make it easier to find which tests fail in the CI logs. This was copied from how the Spark repo does it.

How was this patch tested?
Just logging changes, existing tests.
Did this PR include necessary documentation updates?