Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Model summarize displayed twice before training starts #3458

Closed
ananthsub opened this issue Sep 11, 2020 · 18 comments
Closed

Model summarize displayed twice before training starts #3458

ananthsub opened this issue Sep 11, 2020 · 18 comments
Labels
bug Something isn't working good first issue Good for newcomers help wanted Open to be worked on
Milestone

Comments

@ananthsub
Copy link
Contributor

🐛 Bug

Model summarize is called in two spots, which results in duplication in the logs:

And I'm unsure how it's called again

119083522_2815247905426328_3914141584028152470_n

The first time is with the logger.info(...) since it's formatted. The second time it appears to be logged with print(...) since the table formatting doesn't handle newlines

To Reproduce

Run any training with weights_summary set on the trainer

Expected behavior

We only call summarize once

@ananthsub ananthsub added bug Something isn't working help wanted Open to be worked on labels Sep 11, 2020
@rohitgr7
Copy link
Contributor

Tried on master. Working fine. Is it happening with ddp backend only?

@ananthsub
Copy link
Contributor Author

Yes this happens with DDP backend

@awaelchli
Copy link
Contributor

awaelchli commented Sep 11, 2020

on my windows machine, single gpu, happens too. but the logs also appear twice, including all warnings.

TPU available: False, using: 0 TPU cores
TPU available: False, using: 0 TPU cores
CUDA_VISIBLE_DEVICES: [0]
CUDA_VISIBLE_DEVICES: [0]

¨
which means it is logging related, not model summary.
I will investigate

@Borda Borda added the good first issue Good for newcomers label Sep 15, 2020
@JVGD
Copy link

JVGD commented Sep 22, 2020

Just started migration of our codebase to Lightning and the first trainer I write have this issue... Could this be something related with duplicated handlers in logger? It might be that the logger.getLogger() which is a singleton is called twice, so it might be that 2 handlers are initialized, the root handler and another stream handler

This is my case:
image

@awaelchli
Copy link
Contributor

awaelchli commented Sep 22, 2020

yes, I think that's the case. does my PR not resolve the problem for you? which os are you on?

@JVGD
Copy link

JVGD commented Sep 22, 2020

I am using Ubuntu 18.04.

I tried you PR but unfortunately it does not solve the issue. Maybe I am doing something wrong?
image

I think this is what might be happening: https://jdhao.github.io/2020/06/20/python_duplicate_logging_messages/

@awaelchli
Copy link
Contributor

awaelchli commented Sep 22, 2020

@JVGD could you try again with my updated branch. sorry, I don't have much experience with this python logging, for me the duplicate messages are resolved locally. I need you as the tester 😄
I tried my windows and ubunut multi-gpu

@edenlightning edenlightning added this to the 0.9.x milestone Sep 22, 2020
@JVGD
Copy link

JVGD commented Sep 22, 2020

Unfortunately is still showing duplicated logs. I am going to do further testing, but as far as I know I think the pytroch_lightning logger adds an extra StreamHandler to the base root logger. Maybe that is why there are duplicates

@awaelchli
Copy link
Contributor

indeed, if we remove the streamhandler, this also solves the problem (for me). can you confirm?
https://github.com/PyTorchLightning/pytorch-lightning/pull/3514/files

@JVGD
Copy link

JVGD commented Sep 22, 2020

Amazing! it was just deleting that line and now it works like a charm. By the way, I did it in the pytorch-lightning==0.9.0 version, I was in my development environment. So I haven't been able to test it on your branch. But tomorrow I will!

Thank you for the support. Now it looks great:
image

I am not sure if that change alone suffices for avoiding duplicated logging in other platforms, but in my case it is indeed.

@awaelchli awaelchli changed the title Model summarize called twice before training starts Model summarize displayed twice before training starts Sep 22, 2020
@awaelchli
Copy link
Contributor

@JVGD this is all really strange. I checked again with a different machine (also ubuntu 18) and now It's not printing the model summary, while on my notebook it did.
I need to go through the python versions and see if there is a difference. ugh :(

@JVGD
Copy link

JVGD commented Sep 23, 2020

Mmm really strange @ashwinb, I think we almost got this bug hunted down.

I got an hypothesis: maybe you don't see any logs because the root logger does not have any handlers. This means, pytorch_lightning did not add any handlers (because we removed it), and app you are running does not add any other handlers.

I think it works in my case because my app already adds a StreamHandler at the beginning. So there is at least one StreamHandler.

If it helps, in my python project I check explicitly that there is only one StreamHandler in the root base logger. Maybe doing something like this could help. The idea is to add a StreamHandler only if there is none, but if it is at least one, then use that one (instead of adding another that will print duplicated logs).

        # Checking if there is already a stream
        # handler for avoinding log repetition
        console_hanlder_configured = any(
            [isinstance(handler, logging.StreamHandler)
             for handler in root_logger.handlers])

        # If console handler not configured, then add it
        if not console_hanlder_configured:
            # Handler for logging to stdout
            console_handler = logging.StreamHandler()
            console_handler.setFormatter(formatter)
            console_handler.setLevel(level)

            # Adding handler to logger
            root_logger.addHandler(console_handler)

@williamFalcon
Copy link
Contributor

are we good on this? i also made a bunch of stability fixes to ddp including duplicate logging and so on

@JVGD
Copy link

JVGD commented Oct 1, 2020

Last time I checked it wasn't but I am happy to check it out. The branch to test is still bugfix/duplicated-logging right?

@JVGD
Copy link

JVGD commented Oct 1, 2020

Tested, I am good on my side
image

@ianlini
Copy link

ianlini commented Jan 22, 2021

I still have the duplicate log issue like this:

GPU available: True, used: True
[2021-01-22 07:43:17,623][INFO] lightning: GPU available: True, used: True
TPU available: None, using: 0 TPU cores
[2021-01-22 07:43:17,623][INFO] lightning: TPU available: None, using: 0 TPU cores
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1,2,3]
[2021-01-22 07:43:17,623][INFO] lightning: LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1,2,3]

I'm using pytorch-lightning version 1.1.1, and I configured the root log using logging.dictConfig with "disable_existing_loggers": False. The configuring was done after importing pytorch-lightning. I think it's a common approach.

I think these lines in __init__.py should be removed

_logger.addHandler(python_logging.StreamHandler())	
_logger.setLevel(python_logging.INFO)

This is the only library having this issue in my hundreds of dependencies.
related issue: #1503

@awaelchli
Copy link
Contributor

@ianlini see #4621 and #5509 for progress on this issue

@fingoldo
Copy link

fingoldo commented May 6, 2024

Just observed this doubled logging in the recent version of Lightning. It happens in summarize method of the ModelSummary class.
Execution of this single line log.info("\n" + summary_table) produces doubled log entry. I have my custom logger configured. I suppose either my logger or the logger of Lightning are not configured correctly (but I only observe such doubling when I work with lightning and no other libs).
Doubling disappeared when I changed the order of my logger initialization to happen AFTER all lightning inits.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working good first issue Good for newcomers help wanted Open to be worked on
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants