Inference error #54

Open
davidvct opened this issue Apr 1, 2024 · 3 comments

Comments

@davidvct

davidvct commented Apr 1, 2024

  1. I ran the command below for inference but hit the following error. Does anyone know how to fix it?
  • command: python launch.py --n_GPUs 1 main.py --batch_size 8 --precision single
  • error:
    [W socket.cpp:401] [c10d] The server socket has failed to bind to [::]:8023 (errno: 98 - Address already in use).
    [W socket.cpp:401] [c10d] The server socket has failed to bind to workstation2:8023 (errno: 98 - Address already in use).
    [E socket.cpp:435] [c10d] The server socket has failed to listen on any local network address.
  2. Another question: how do I specify which model to use for inference?
@SeungjunNah
Owner

  1. Are you launching many jobs from a single machine? Use different master ports per job.
    https://github.com/SeungjunNah/DeepDeblur-PyTorch/blob/master/src/option.py#L33

  2. Please provide more details.
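
For the port conflict in item 1, a minimal sketch of how to check whether a port is taken and how to ask the OS for a free one. It uses only the Python standard library. The helper names `find_free_port` and `is_port_in_use` are hypothetical and not part of this repository, and how the chosen port is handed to the launcher is defined in option.py (linked above), not here.

```python
import socket


def find_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for a TCP port that is currently unused on `host`."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # Binding to port 0 lets the OS pick any free port.
        sock.bind((host, 0))
        return sock.getsockname()[1]


def is_port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if binding to (host, port) fails (errno 98 on Linux)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return True
        return False


if __name__ == "__main__":
    # 8023 is the port from the error message above.
    print("port 8023 in use:", is_port_in_use(8023))
    print("free port to try:", find_free_port())
```

Note that a port reported as free can still be claimed by another process before the job starts, so giving each job its own fixed, distinct port is the more predictable option when several jobs share a machine.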

@davidvct
Author

davidvct commented Apr 2, 2024

  1. Changed the port and it works. Thanks!

  2. I was asking how to run inference on a test dataset with a specific saved model. I have now managed to get prediction running with the command below:
    python launch.py --n_GPUs 1 main.py --save_dir 2024-04-01_14-13-08 --do_train False --do_validate False --start_epoch 270 --load_epoch 270

    But I ran into two issues:

    • Inference runs on only some of the images. I have 60 test images in the test folder, but only 6 were predicted.

    • After finishing prediction with model-270.pt, the script goes on to predict with model-280.pt. It is not a big issue, but something to consider for future improvement.

@SeungjunNah
Owner

  1. Please refer to Usage examples - Example commands:
    # save all of the evaluation results
    python main.py --n_GPUs 1 --batch_size 8 --dataset GOPRO_Large --save_results all

  2. Please refer to args.end_epoch and see how it is used in main.py.
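
On the model-280.pt point, one plausible reading (an assumption, not verified against main.py here) is that the script loops over epochs from the start epoch up to args.end_epoch and evaluates each checkpoint that falls on a fixed interval. The sketch below only illustrates that reading. The function `epochs_to_evaluate` is hypothetical, and the interval of 10 is inferred from the 270 to 280 step reported above.

```python
from typing import List


def epochs_to_evaluate(start_epoch: int, end_epoch: int, every: int) -> List[int]:
    """Epochs in [start_epoch, end_epoch] that fall on the evaluation interval."""
    return [e for e in range(start_epoch, end_epoch + 1) if e % every == 0]


# With an end epoch beyond the start epoch, later checkpoints are evaluated too.
print(epochs_to_evaluate(270, 300, 10))  # [270, 280, 290, 300]

# With the end epoch equal to the start epoch, only one checkpoint remains.
print(epochs_to_evaluate(270, 270, 10))  # [270]
```

If main.py does behave this way, setting the end epoch to the same value as the start epoch would stop the run after model-270.pt.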
