Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve ocrd_network logging #1111

Merged
merged 11 commits into from
Oct 13, 2023
Merged

Improve ocrd_network logging #1111

merged 11 commits into from
Oct 13, 2023

Conversation

MehmedGIT
Copy link
Contributor

@MehmedGIT MehmedGIT commented Oct 11, 2023

This PR improves the logging mechanism of the ocrd_network module.

Additions:

  • A separate logging dir tree structure for the modules (processing servers, processing workers, processor servers, mets servers, processing jobs). Configurable with an env variable.
  • Processing job-level logging - each job is logged into a separate file with format {job_id}.log
  • Processing job-level logging file paths are added to the Job models and preserved in the database.
  • The ocrd_network logging is based on the format provided in ocrd_utils
  • Support env variables for setting the root directory for logging/sockets
  • An endpoint for getting the log file of a processing job of a processor

Feel free to recommend suggestions.

@MehmedGIT MehmedGIT requested a review from joschrew October 11, 2023 14:52
Copy link
Member

@kba kba left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just a quick read-through, I'll contribute ideas once I have some more time for it.

Copy link
Contributor

@joschrew joschrew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have tested and it seems to work fine. I think it is a huge improvement, all structured now and much better to find the right logfile now.
What imho could be added is:

  • make logfile and and mets-server basedir configurable with env. For example when I test locally I use /data for the mets-server (this is better for my local setup with docker-compose) and when this was configurable it would be easier for me to set it up. And I could use /data in the docker-compose-setup as well and would not have to volume mount two directories (now I am mounting /data and /tmp).
  • add a logging endpoint for the processors: as now we have the log-file-path in the database it would be very easy to add and endpoint to the processing server to retrieve the log of a processor through http

@MehmedGIT
Copy link
Contributor Author

MehmedGIT commented Oct 12, 2023

make logfile and and mets-server basedir configurable with env. For example when I test locally I use /data for the mets-server (this is better for my local setup with docker-compose) and when this was configurable it would be easier for me to set it up. And I could use /data in the docker-compose-setup as well and would not have to volume mount two directories (now I am mounting /data and /tmp).

Resolved with 464f8ea

add a logging endpoint for the processors: as now we have the log-file-path in the database it would be very easy to add and endpoint to the processing server to retrieve the log of a processor through HTTP

Resolved with c81ffcc

@MehmedGIT MehmedGIT marked this pull request as ready for review October 12, 2023 12:21
@MehmedGIT MehmedGIT requested review from kba and joschrew October 12, 2023 12:21
Copy link
Member

@kba kba left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, much cleaner and being able to get the log via the API is excellent.

Ready for release IMHO. Can you summarize the changes for the release notes?

@MehmedGIT
Copy link
Contributor Author

Can you summarize the changes for the release notes?

It's on top. I have updated the initial post.

@MehmedGIT
Copy link
Contributor Author

@kba, I forgot to update the ocrd --help section - thanks for adding that.

Considering this, I was wondering why should we add the environment variables manually to the help section. I think it would be possible to iterate over all variables inside the OcrdEnvConfig and print their descriptions. Based on prefix values we could also decide what to print, e.g., OCRD_NETWORK prefix would print only those environment variables.

@kba
Copy link
Member

kba commented Oct 13, 2023

Considering this, I was wondering why should we add the environment variables manually to the help section. I think it would be possible to iterate over all variables inside the OcrdEnvConfig and print their descriptions. Based on prefix values we could also decide what to print, e.g., OCRD_NETWORK prefix would print only those environment variables.

The main reasons I did the --help section manually is order of the output and that I found it difficult to get the \b escapes right. It is a source of inconsistency though, I'll think about finding a better way.

@kba kba merged commit 1ef2cf3 into master Oct 13, 2023
@kba kba deleted the network-log-refactoring branch October 13, 2023 08:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants