-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AG-1041] Adds timestamp logging for processing runs #71
Conversation
src/agoradatatools/logs.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I've been thinking about this more, and configuring logging
has always been a tiny bit confusing. BUT... I think initializing the logging can potentially be done in __init__.py
. For example:
In __init__.py
logging.basicConfig(
stream=sys.stdout, level=logging.INFO, format="INFO: %(message)s"
)
Basically, you may not need to initialize the logger everytime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🎉 LGTM! Nice addition of logging!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me, and including the processing times in addition to the timestamps is a nice bonus!
I am curious why we are logging the "Uploading file to synapse storage" block twice per dataset (once without, and then again with, the timestamp), but ultimately that's harmless...
@JessterB Yeah I was wondering about that myself, but that has to do with the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great!
This PR adds a new module to
agoradatatools
:logs.py
. The new module contains thelog_time
decorator function along with a couple of helper functions. log_time` works by decorating a function like so:It takes the name of the function (for configuring the logging output) and the
logger
instance for the module that it is being used in as arguments. At the moment,log_time
has configurations forprocess_dataset
andprocess_all_files
, so that we can time and timestamp the processing of each dataset and data processing as a whole.In practice, when you run
adt test_config.yaml
, you should see an output like this for each dataset:And like this for the whole process: