
On the accuracy issue of MonoDETR on the KITTI test set #61

Open
KotlinWang opened this issue Mar 12, 2024 · 17 comments

Comments

@KotlinWang

I only achieved 11.59% accuracy on the official KITTI benchmark using your public weights. May I ask what causes this? I want to replicate the 16.47% test-set accuracy reported in the paper.
image

@BeyondVE

Excuse me, how is your AP performance?

@Vipermdl

image
These are my replicated results; the problem is the same as yours. We also look forward to a reasonable explanation.

@BeyondVE

> These are my replicated results; the problem is the same as yours.

Excuse me, is this the model from the original paper, or are there some changes? I applied to upload a model on the official website but failed. May I ask how you applied?

@KotlinWang
Author

@Vipermdl @BeyondVE I used the pre-trained weights he published. As I understand it, the results on the KITTI test set were obtained by training the model on the whole trainval set.

@Vipermdl

> As I understand it, the results on the KITTI test set were obtained by training the model on the whole trainval set.

It seems to me that what the author released is a pre-trained weight trained only on the training set. The performance of the published weights on the validation set is consistent with our own reproduction. But the problem is that we cannot reproduce the test-set performance, even when we use train+val data and make multiple attempts.

@KotlinWang
Author

> It seems to me that what the author released is a pre-trained weight trained only on the training set.

I have the same problem. What I want to know is: when training on the full train+val set, how do you choose the checkpoint with the best results?

@BeyondVE

> We cannot reproduce the test-set performance, even when we use train+val data and make multiple attempts.

After training, how good are your validation-set results?

@BeyondVE

> When training on the full train+val set, how do you choose the checkpoint with the best results?

After training, how good are your validation-set results?

@Vipermdl

> After training, how good are your validation-set results?

If train+val is used for training, the val result cannot serve as a reference indicator, since val is then part of the training data. I used the last checkpoint for testing. Judging from the author's code, TTA may be used, but confirming that would require the author to open-source their complete code.
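Since the thread only speculates that TTA is involved and MonoDETR's full inference pipeline is not public, here is only a generic sketch of one common form of test-time augmentation: averaging scores over the original and horizontally flipped input. `flip_tta_scores` and `toy_model` are hypothetical stand-ins, not code from the repo.

```python
import numpy as np

def flip_tta_scores(model_fn, image):
    # Run the model on the original image and on its horizontal
    # mirror, then average the per-class scores.  For real 3D
    # detection the flipped boxes would also need their x-coordinates
    # mirrored back; scores alone are shown here for brevity.
    scores = model_fn(image)
    flipped = model_fn(image[:, ::-1])  # flip along the width axis
    return (scores + flipped) / 2.0

# Hypothetical stand-in detector: its scores depend only on mean
# intensity, so flipping should leave them unchanged.
def toy_model(img):
    return np.array([img.mean(), 1.0 - img.mean()])

img = np.arange(12, dtype=np.float64).reshape(3, 4) / 11.0
out = flip_tta_scores(toy_model, img)
```

Whether this matches what was actually submitted to the KITTI server is exactly the open question in this thread.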

@BeyondVE

> If train+val is used for training, the val result cannot serve as a reference indicator.

After training, what are your validation-set results? Can you reach the level of the paper? And why can't the val result serve as a reference indicator when training on train+val? Thank you very much.

@KotlinWang
Author

> Can you reach the level of the paper?

In my experience, MonoDETR training is very unstable and needs to be repeated many times to get good results.

@BeyondVE

> In my experience, MonoDETR training is very unstable and needs to be repeated many times to get good results.

My model's validation-set results are about two points lower than the paper's. May I ask what your best validation-set results are?

@Ivan-Tang-3D
Collaborator

  1. The training process is relatively unstable, so please use the same config and try several times. 2. Please refer to this link: Turn the model to original version (running in 3090) #51. The code was upgraded for better and more stable performance. You could follow the link, revise the code accordingly, and then try the original ckpt, or use an A100 or another GPU to train this version of the code.
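Given this instability, one practical way to "try several times" is to repeat the run under different seeds and keep the checkpoint with the highest validation AP. A minimal sketch, where `train_and_eval` is a hypothetical stand-in for launching one full training run with the repo's script:

```python
import random

def train_and_eval(seed):
    # Hypothetical stand-in for one full training run with the given
    # seed, returning the resulting validation AP.  In practice this
    # would launch the repo's training script with the same config.
    random.seed(seed)
    return 20.0 + random.uniform(-2.0, 2.0)  # simulated run-to-run variance

def best_of_n(seeds):
    # Train once per seed and keep the run with the highest val AP.
    results = {s: train_and_eval(s) for s in seeds}
    best = max(results, key=results.get)
    return best, results[best]

best_seed, best_ap = best_of_n(range(5))
```

Note that this selection only makes sense when training on the train split alone; as discussed earlier in the thread, val AP is not a meaningful indicator once val is part of the training data.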

@ruiw0w

ruiw0w commented Aug 7, 2024

> In my experience, MonoDETR training is very unstable and needs to be repeated many times to get good results.

Hello, did you eventually get test-set results close to the paper's?
These are my results from training with this repo, including all three classes.
The Car-class results are close to the ones you got from your submission.
image

@ruiw0w

ruiw0w commented Aug 7, 2024

> 1. The training process is relatively unstable, so please use the same config and try several times. 2. Please refer to #51; the code was upgraded for better and more stable performance.

Hello, is the current version of the repo the most stable one? I am training on a single RTX 4090. Do I need to modify group_num as you mentioned before, and comment out lines 467-473 of https://github.com/ZrrSkywalker/MonoDETR/blob/main/lib/models/monodetr/depthaware_transformer.py?
I tried making those modifications and tested both your published weights and my own trained weights; the performance dropped further.
Should I make the modifications and then train again?

@Aangss

Aangss commented Dec 2, 2024

> The Car-class results are close to the ones you got from your submission.

Did you train on the 4090 with batch size 12, and submit the last-epoch results?

@ruiw0w

ruiw0w commented Jan 10, 2025

> Did you train on the 4090 with batch size 12, and submit the last-epoch results?

My batch size is 16. Yes, I submitted the last-epoch results. Similar works all show unstable results during training. Are there any training tricks, or should I just try training multiple times?
