
explanation of the val.py #8968

Closed · YousefABD92 opened this issue Aug 15, 2022 · 8 comments
Labels: question (Further information is requested)

Comments

@YousefABD92
Question

Hi there!
Thanks a lot for your work.

I have a question, please: I could not understand how to use val.py.
In my case, I have two different datasets, and both have been trained with YOLOv5. Now, is it possible to use val.py to validate the model trained on each dataset against the other dataset? For example, I have dataset1 and want to test or validate its model on dataset2. Is that possible?
Could you please answer with an example of the commands?

Thanks a lot


@MartinPedersenpp

Assuming that your two datasets contain the same classes and that you have a --data dataset.yaml file for each dataset, my guess would be to use:

For dataset 1:
python path/to/val.py --weights dataset1bestorlast.pt --data dataset2.yaml --img sameasusedfortrain
For dataset 2:
python path/to/val.py --weights dataset2bestorlast.pt --data dataset1.yaml --img sameasusedfortrain

If you want to "test" it, you have to use detect.py or Torch Hub inference and loop through the images of the opposite dataset.
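If it helps, here is a minimal sketch of that Torch Hub loop (just an illustration, not the official detect.py; the weights file and image folder below are placeholders for your own paths):

```python
from pathlib import Path

import torch

# load the custom-trained YOLOv5 weights through Torch Hub
model = torch.hub.load('ultralytics/yolov5', 'custom', path='dataset1_best.pt')  # placeholder weights

# loop over the images of the *other* dataset and print the detections
for img_path in Path('path/to/dataset2/images').glob('*.jpg'):  # placeholder folder
    results = model(str(img_path))
    print(img_path.name)
    print(results.pandas().xyxy[0])  # detections as a pandas DataFrame
```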

I hope this helps

@YousefABD92
Author

YousefABD92 commented Aug 16, 2022

Thanks a lot.
If we assume that the two datasets have the same classes, what is going to be the difference between the two YAML files??

I have some other questions, please!
In val.py you have this:
task='val',  # train, val, test, speed or study
What does that mean? I mean, what are test, speed and study?!

The second one is: do I need to reset the parameters to match my custom-trained model?

I used a batch size of 16 and 4 workers, but in your val.py the defaults are 32 and 8 respectively:

    batch_size=32,  # batch size
    workers=8,  # max dataloader workers (per RANK in DDP mode)

The last question: can I use val.py with two different datasets which have a different number of classes?
Actually, I tried this and got the following error:
AssertionError: best.pt (6 classes) trained on different --data than what you passed (10 classes). Pass correct combination of --weights and --data that are trained together.

@MartinPedersenpp

> If we assume that the two datasets have the same classes, what is going to be the difference between the two YAML files??

You mention them as two datasets, which leads me to believe that they are stored in different locations, and that would be the main difference between dataset1.yaml and dataset2.yaml.
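For example, the two files could look like this hypothetical dataset1.yaml, with dataset2.yaml being identical apart from the paths (the names and paths here are placeholders):

```yaml
# hypothetical dataset1.yaml -- dataset2.yaml would only differ in the paths
path: ../datasets/dataset1   # dataset root (placeholder)
train: images/train          # train images relative to path
val: images/val              # val images relative to path

nc: 6                        # same class count in both files
names: ['class0', 'class1', 'class2', 'class3', 'class4', 'class5']  # same class names in both files
```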

> In val.py you have this:
> task='val',  # train, val, test, speed or study
> What does that mean? I mean, what are test, speed and study?!

These are flags for the smart inference mode; their behavior is described in lines 366 to 391 of val.py:

```python
if opt.task in ('train', 'val', 'test'):  # run normally
    if opt.conf_thres > 0.001:  # #1466
        LOGGER.info(f'WARNING: confidence threshold {opt.conf_thres} > 0.001 produces invalid results ⚠️')
    run(**vars(opt))

else:
    weights = opt.weights if isinstance(opt.weights, list) else [opt.weights]
    opt.half = True  # FP16 for fastest results
    if opt.task == 'speed':  # speed benchmarks
        # python val.py --task speed --data coco.yaml --batch 1 --weights yolov5n.pt yolov5s.pt...
        opt.conf_thres, opt.iou_thres, opt.save_json = 0.25, 0.45, False
        for opt.weights in weights:
            run(**vars(opt), plots=False)

    elif opt.task == 'study':  # speed vs mAP benchmarks
        # python val.py --task study --data coco.yaml --iou 0.7 --weights yolov5n.pt yolov5s.pt...
        for opt.weights in weights:
            f = f'study_{Path(opt.data).stem}_{Path(opt.weights).stem}.txt'  # filename to save to
            x, y = list(range(256, 1536 + 128, 128)), []  # x axis (image sizes), y axis
            for opt.imgsz in x:  # img-size
                LOGGER.info(f'\nRunning {f} --imgsz {opt.imgsz}...')
                r, _, t = run(**vars(opt), plots=False)
                y.append(r + t)  # results and times
            np.savetxt(f, y, fmt='%10.4g')  # save
        os.system('zip -r study.zip study_*.txt')
        plot_val_study(x=x)  # plot
```
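In practice those modes are invoked from the command line, for example (the speed and study commands are taken from the comments in the code above; the weights are just examples):

```bash
# normal validation on the val split
python val.py --data coco.yaml --weights yolov5s.pt --task val

# speed benchmarks
python val.py --task speed --data coco.yaml --batch 1 --weights yolov5n.pt yolov5s.pt

# speed vs mAP study
python val.py --task study --data coco.yaml --iou 0.7 --weights yolov5n.pt yolov5s.pt
```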

> Do I need to reset the parameters to match my custom-trained model?
> I used a batch size of 16 and 4 workers, but in your val.py the defaults are 32 and 8 respectively.

No, you can use any batch size and number of workers that fits your machine, to make validation as fast as possible.
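For example, instead of editing the defaults inside val.py you can (if I recall the flags correctly) override them on the command line; the weights and yaml names here are placeholders:

```bash
python path/to/val.py --weights best.pt --data dataset1.yaml --img 640 --batch-size 16 --workers 4
```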

> The last question: can I use val.py with two different datasets which have a different number of classes?
> Actually, I tried this and got the following error:
> AssertionError: best.pt (6 classes) trained on different --data than what you passed (10 classes). Pass correct combination of --weights and --data that are trained together.

Nice that you found the answer on your own. I think if you want to validate on a dataset that does not match your .pt file, you would have to rewrite val.py to ignore the classes that haven't been trained on.
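As a quick sanity check before rewriting anything, a small sketch like this (assuming your data yaml declares nc; the file names are placeholders) shows where the mismatch in that assertion comes from:

```python
import torch
import yaml

# number of classes baked into the checkpoint
model = torch.hub.load('ultralytics/yolov5', 'custom', path='best.pt')  # placeholder weights
print('checkpoint classes:', len(model.names))   # e.g. 6

# number of classes declared in the data yaml passed to val.py
with open('dataset2.yaml') as f:                  # placeholder yaml
    data = yaml.safe_load(f)
print('data yaml classes:', data['nc'])           # e.g. 10
```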

I hope this helps

@YousefABD92
Author

You are my hero!

Thanks a lot. Regarding this, please:

> Do I need to reset the parameters to match my custom-trained model?
> I used a batch size of 16 and 4 workers, but in your val.py the defaults are 32 and 8 respectively.

I am asking about the val.py file itself: do I need to reset them there or keep them as they are?

@MartinPedersenpp

MartinPedersenpp commented Aug 16, 2022

> Do I need to reset the parameters to match my custom-trained model?
> I used a batch size of 16 and 4 workers, but in your val.py the defaults are 32 and 8 respectively.
> I am asking about the val.py file itself: do I need to reset them there or keep them as they are?

val.py essentially uses the same dataloader as train.py, I believe, so you can use your values. But as far as I know val.py is not as demanding, so you might be able to perform validation faster by increasing the batch size and workers compared to your training. At some point you will reach your memory / CPU limit, which will probably cause a crash.

Please don't forget to mark the issue as solved, if your question has been answered 🙂

@YousefABD92
Author

> If we assume that the two datasets have the same classes, what is going to be the difference between the two YAML files??
>
> You mention them as two datasets, which leads me to believe that they are stored in different locations, and that would be the main difference between dataset1.yaml and dataset2.yaml.

But I still did not get what I should do here.
How do I evaluate two different trained models with the same yaml file using val.py?

What are the commands supposed to be?!
Please help me out.

@MartinPedersenpp

> But I still did not get what I should do here.
> How do I evaluate two different trained models with the same yaml file using val.py?
> What are the commands supposed to be?!

Isn't the code from my first answer enough?

If you want to, you can set the batch size and workers, but if it can run with the defaults, that will make it validate faster.

For dataset 1:
python path/to/val.py --weights dataset1bestorlast.pt --data dataset2.yaml --img sameasusedfortrain
For dataset 2:
python path/to/val.py --weights dataset2bestorlast.pt --data dataset1.yaml --img sameasusedfortrain

@glenn-jocher
Member

@MartinPedersenpp I noticed a misunderstanding in the previous response.

To evaluate two different trained models with the same YAML file using val.py, you can use the following commands:

For dataset 1:

python path/to/val.py --weights dataset1bestorlast.pt --data dataset1.yaml

For dataset 2:

python path/to/val.py --weights dataset2bestorlast.pt --data dataset2.yaml

You can keep the batch size and workers as they are in the val.py file, or adjust them according to your specific machine's capabilities. However, using default values should work efficiently for validation.

I hope this clears up any confusion. Let me know if you need further assistance.
