Apex.amp.initialize: lr_scheduler.LambdaLR AttributeError: 'function' object has no attribute '__self__' #574

Zepyhrus · 2020-07-31T01:43:30Z

🐛 Bug

When training with mixed precision, scheduler initialization fails with AttributeError: 'function' object has no attribute '__self__'.

To Reproduce (REQUIRED)

Input:

import math
from apex import amp
from torch.optim import lr_scheduler

if mixed_precision:
    model, optimizer = amp.initialize(model, optimizer, opt_level='O1', verbosity=0)

lf = lambda x: (((1 + math.cos(x * math.pi / epochs)) / 2) ** 1.0) * 0.9 + 0.1  # cosine
scheduler = lr_scheduler.LambdaLR(optimizer, lr_lambda=lf)
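
For reference, the cosine lambda in the snippet can be evaluated on its own, without PyTorch or apex. LambdaLR multiplies each parameter group's initial learning rate by the value the lambda returns for the current epoch. This is only a sketch: the value of `epochs` is an assumption, since the issue does not state it.

```python
import math

epochs = 300  # assumed value for illustration; not given in the issue


def lf(x):
    # Same expression as the lambda in the reproduction snippet.
    return (((1 + math.cos(x * math.pi / epochs)) / 2) ** 1.0) * 0.9 + 0.1


# Multiplier applied to the initial learning rate at three points in training.
start = lf(0)            # approximately 1.0 (full initial learning rate)
middle = lf(epochs / 2)  # approximately 0.55
end = lf(epochs)         # approximately 0.1 (10% of the initial learning rate)

print(start, middle, end)
```

The multiplier therefore decays smoothly from about 1.0 to about 0.1 over training, so the lambda itself is unlikely to be related to the error.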

Output:

Optimizer groups: 62 .bias, 70 conv.weight, 59 other
Traceback (most recent call last):
  File "train.py", line 477, in <module>
    train(hyp, tb_writer, opt, device)
  File "train.py", line 167, in train
    scheduler = lr_scheduler.LambdaLR(optimizer, lr_lambda=lf)
  File "/home/ubuntu/anaconda3/envs/yolov5/lib/python3.7/site-packages/torch/optim/lr_scheduler.py", line 189, in __init__
    super(LambdaLR, self).__init__(optimizer, last_epoch)
  File "/home/ubuntu/anaconda3/envs/yolov5/lib/python3.7/site-packages/torch/optim/lr_scheduler.py", line 74, in __init__
    self.optimizer.step = with_counter(self.optimizer.step)
  File "/home/ubuntu/anaconda3/envs/yolov5/lib/python3.7/site-packages/torch/optim/lr_scheduler.py", line 56, in with_counter
    instance_ref = weakref.ref(method.__self__)
AttributeError: 'function' object has no attribute '__self__'
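
The last frame of the traceback can be reproduced in plain Python. The sketch below assumes that something replaced `optimizer.step` with a plain function before the scheduler was created (possibly the O1 patching in the installed apex build, which this thread does not confirm). `FakeOptimizer`, `ref_owner`, and `wrap_as_function` are hypothetical names used only for illustration; they are not PyTorch or apex APIs.

```python
import types
import weakref


class FakeOptimizer:
    """Hypothetical stand-in for an optimizer; not a real PyTorch class."""

    def step(self):
        return "stepped"


def ref_owner(method):
    # Mirrors the failing line in the traceback: it requires a bound method.
    return weakref.ref(method.__self__)


def wrap_as_function(old_step):
    # Returns a plain closure, which has no __self__ attribute.
    def new_step(*args, **kwargs):
        return old_step(*args, **kwargs)
    return new_step


opt = FakeOptimizer()
assert ref_owner(opt.step)() is opt  # an unpatched bound method works

# Assumed patching style: the instance attribute becomes a plain function.
opt.step = wrap_as_function(opt.step)
try:
    ref_owner(opt.step)
    error = None
except AttributeError as exc:
    error = str(exc)
print(error)

# Re-binding the wrapper to the instance restores __self__.
patched = opt.step
opt.step = types.MethodType(lambda self, *a, **k: patched(*a, **k), opt)
owner = ref_owner(opt.step)()
result = opt.step()
```

If this assumption holds, the error depends on how `step` was patched rather than on the scheduler or its lambda, which would be consistent with the problem disappearing when mixed precision is disabled.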

Expected behavior

Scheduler initialized successfully.

Environment

OS: Ubuntu 18.04
GPU: Tesla P2000
CUDA: 10.2
torch: 1.5.1 and 1.6.0 (both show the same problem)
apex: 0.1

Additional context

Following a report of exactly the same problem in yolov3, I reinstalled PyTorch and apex cleanly, but the problem persists.

The interesting thing is: when I change the opt_level option in model, optimizer = amp.initialize(model, optimizer, opt_level='O1', verbosity=0) to O2 or O0, the error goes away, but the training loss becomes infinite within several steps.

If I disable mixed_precision, training works without any problem. I hope this provides some clues for debugging.

github-actions · 2020-07-31T01:44:26Z

Hello @Zepyhrus, thank you for your interest in our work! Please visit our Custom Training Tutorial to get started, and see our Jupyter Notebook, Docker Image, and Google Cloud Quickstart Guide for example environments.

If this is a bug report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we can not help you.

If this is a custom model or data training question, please note Ultralytics does not provide free personal support. As a leader in vision ML and AI, we do offer professional consulting, from simple expert advice up to delivery of fully customized, end-to-end production solutions for our clients, such as:

Cloud-based AI systems operating on hundreds of HD video streams in realtime.
Edge AI integrated into custom iOS and Android apps for realtime 30 FPS video inference.
Custom data training, hyperparameter evolution, and model exportation to any destination.

For more information please visit https://www.ultralytics.com.

glenn-jocher · 2020-07-31T06:45:41Z

There is a PR open for PyTorch 1.6 native AMP. You can use this branch or simply wait a few days for it to get merged with origin/master. See #573

Zepyhrus · 2020-08-12T06:24:13Z

> There is a PR open for pytorch 1.6. native amp. You can use this branch or simply wait a few days for it to get merged with origin/master. See #573

Hi glenn, thanks for replying. The problem is that this issue persists when I switch back to PyTorch 1.5.1.

glenn-jocher · 2020-08-12T17:55:04Z

It appears you may have environment problems. Please ensure you meet all dependency requirements if you are attempting to run YOLOv5 locally. If in doubt, create a new virtual Python 3.8 environment, clone the latest repo (code changes daily), and pip install -r requirements.txt again. We also highly recommend using one of our verified environments below.

Requirements

Python 3.8 or later with all requirements.txt dependencies installed, including torch>=1.6. To install run:

$ pip install -r requirements.txt

Environments

YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Google Colab Notebook with free GPU:
Kaggle Notebook with free GPU: https://www.kaggle.com/models/ultralytics/yolov5
Google Cloud Deep Learning VM. See GCP Quickstart Guide
Docker Image https://hub.docker.com/r/ultralytics/yolov5. See Docker Quickstart Guide

Status

If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are passing. These tests evaluate proper operation of basic YOLOv5 functionality, including training (train.py), testing (test.py), inference (detect.py) and export (export.py) on MacOS, Windows, and Ubuntu.

github-actions · 2020-09-14T00:39:07Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

Zepyhrus added the bug label Jul 31, 2020

github-actions bot added the Stale label Sep 14, 2020

github-actions bot closed this as completed Sep 19, 2020
