Trained regression model does not produce validation or test results #794

Ce11an · 2022-01-09T13:06:17Z

🐛 Bug

Related to Bug #405

When a regression model has been trained, calling either trainer.validate() or trainer.test() yields an empty results dictionary. I posted a question about this in Slack last week and was told that the behaviour arises from using an outdated Lightning API: returning log and progress_bar keys is no longer how metrics are reported. Instead, they must be logged by calling self.log.

To Reproduce

Steps to reproduce the behaviour:

Code sample

# google colab GPU

import pytorch_lightning as pl
from pl_bolts.models.regression import LinearRegression
from pl_bolts.datamodules.sklearn_datamodule import SklearnDataModule
from sklearn.datasets import load_diabetes

# model
model = LinearRegression(input_dim=10, l1_strength=1, l2_strength=1)

# data
X, y = load_diabetes(return_X_y=True)
loaders = SklearnDataModule(X, y)

# train
trainer = pl.Trainer(gpus=-1, log_every_n_steps=20)
trainer.fit(model, train_dataloaders=loaders.train_dataloader(), val_dataloaders=loaders.val_dataloader())

# evaluate
trainer.validate(model=model, dataloaders=loaders.val_dataloader(), ckpt_path="best", verbose=True)

Output

Loaded model weights from checkpoint at /content/lightning_logs/version_8/checkpoints/epoch=7-step=159.ckpt
Validating: 0% 0/6 [00:00<?, ?it/s]
--------------------------------------------------------------------------------
DATALOADER:0 VALIDATE RESULTS
{}
--------------------------------------------------------------------------------
[{}]

Expected behaviour

For the mean val_loss to be returned.
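
As a rough sketch of what that would look like: trainer.validate() returns a list with one metrics dictionary per dataloader (the [{}] above is that list with nothing logged). The helper below, get_metric, is a hypothetical name used only to illustrate how the expected val_loss entry could be read, or its absence detected. The 0.5 is a placeholder, not a real result.

```python
from typing import Dict, List


def get_metric(results: List[Dict[str, float]], name: str, dataloader_idx: int = 0) -> float:
    """Read one logged metric from the list returned by trainer.validate() / trainer.test().

    Hypothetical helper: raises a descriptive error when the metric was never
    logged, which is the situation reported in this issue.
    """
    metrics = results[dataloader_idx]
    if name not in metrics:
        raise KeyError(f"{name!r} was not logged; available metrics: {sorted(metrics)}")
    return metrics[name]


# Shape currently returned (nothing logged) versus the shape expected after a fix.
current = [{}]
expected = [{"val_loss": 0.5}]  # placeholder value for illustration only

print(get_metric(expected, "val_loss"))
```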

I suspect the error arises from here: https://github.com/PyTorchLightning/lightning-bolts/blob/22e494e546e7fe322d6b4cf36258819c7ba58e02/pl_bolts/models/regression/linear_regression.py#L75-L96

I believe this should be the following code:

def validation_epoch_end(self, outputs: List[Dict[str, Tensor]]) -> None:
    # Average the per-batch losses and log the result; self.log returns None, so nothing is returned.
    val_loss = torch.stack([x["val_loss"] for x in outputs]).mean()
    self.log("val_loss", val_loss)

The same change would be needed in both regression models, including their test_epoch_end functions.

Environment

PyTorch Version (e.g., 1.0): 1.7.1
OS (e.g., Linux): Linux
How you installed PyTorch (conda, pip, source): pip
Build command you used (if compiling from source):
Python version: 3.7
CUDA/cuDNN version:
GPU models and configuration:
Any other relevant information: PyTorch Lightning 1.5.8; Lightning Bolts v0.5.0

Additional context

Used Google Colab to run the code.

Ce11an · 2022-01-23T14:25:49Z

Would it be okay if I make a PR for this issue? Thanks 😄

stale · 2022-04-16T01:52:50Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

Ce11an added the fix and help wanted labels Jan 9, 2022

loodvn mentioned this issue Jan 17, 2022

Logging and progress bar values not detaching #793

Closed

omaralvarez mentioned this issue Feb 4, 2022

Fix LinearRegression Bolt #802

Closed

8 tasks

stale bot added the won't fix label Apr 16, 2022

stale bot closed this as completed Apr 25, 2022

Ce11an mentioned this issue Nov 27, 2022

Reviewed LogisticRegression #950

Merged

11 tasks

Borda added the bug label and removed the fix label Jun 20, 2023
