Are you planning to try "model_main.py" again instead of "train.py"? #184

Open
anonym24 opened this issue Nov 6, 2018 · 14 comments

Comments

@anonym24

anonym24 commented Nov 6, 2018

I suppose model_main.py should work better, and maybe it even fixes some errors that can affect the speed or accuracy of models.

@anonym24
Author

anonym24 commented Nov 8, 2018

I see that model_main.py needs cocoapi (pycocotools).

On Windows we can install it from https://github.com/philferriere/cocoapi

@anonym24
Author

anonym24 commented Nov 8, 2018

But it fails with the following error: TypeError: can't pickle dict_values objects

tensorflow/models#5719

@anonym24
Author

anonym24 commented Nov 8, 2018

It seems this error can be solved by the changes described in:
guildai/models@d88f1fb
tensorflow/models#4780 (comment)
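
For context, here is a minimal sketch of why this error shows up on Python 3. It assumes the linked fix is the commonly reported one of wrapping `category_index.values()` in `list(...)` before it is passed on; the exact line is not reproduced here, and the `category_index` dictionary below is made up for illustration. In Python 3, `dict.values()` returns a view object that cannot be pickled (or deep-copied), while a plain list can:

```python
import pickle

# Hypothetical stand-in for the label map dictionary used by the API.
category_index = {1: {"id": 1, "name": "cat"}, 2: {"id": 2, "name": "dog"}}


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, False if it raises TypeError."""
    try:
        pickle.dumps(obj)
        return True
    except TypeError:
        return False


# A dict view is not picklable in Python 3 ...
view_ok = can_pickle(category_index.values())
# ... but the same data copied into a list is.
list_ok = can_pickle(list(category_index.values()))

print("dict_values picklable:", view_ok)
print("list picklable:", list_ok)
```

If the workaround in the links is indeed the `list(...)` wrap, it does not change what data is passed along, only the container type.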

@anonym24
Author

anonym24 commented Nov 8, 2018

It seems it finally started to work with model_main.py.

I start training like this (from the tensorflow/models/research/ directory):

python object_detection\model_main.py --pipeline_config_path=object_detection\training\ssd_mobilenet_v1_pets.config --model_dir=object_detection\images --num_train_steps=50000 --sample_1_of_n_eval_examples=1 --alsologtostderr

(more info: https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/running_locally.md)
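
Since the command above uses Windows backslashes, here is a rough sketch of building the same argument list in a way that also works on Ubuntu. The flags are the ones from the command above; the helper name `build_train_command` and its parameters are hypothetical, and the script is not actually launched here:

```python
import os
import sys


def build_train_command(research_dir, config_name, model_subdir,
                        num_train_steps=50000):
    """Build the model_main.py argument list with OS-appropriate paths.

    Hypothetical helper: it only assembles the flags quoted in this thread.
    """
    od_dir = os.path.join(research_dir, "object_detection")
    return [
        sys.executable,
        os.path.join(od_dir, "model_main.py"),
        "--pipeline_config_path=" + os.path.join(od_dir, "training", config_name),
        "--model_dir=" + os.path.join(od_dir, model_subdir),
        "--num_train_steps=%d" % num_train_steps,
        "--sample_1_of_n_eval_examples=1",
        "--alsologtostderr",
    ]


cmd = build_train_command(".", "ssd_mobilenet_v1_pets.config", "images")
print(" ".join(cmd))

# To actually start training (assumes the Object Detection API is set up):
#   import subprocess
#   subprocess.run(cmd, check=True)
```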

I see some checkpoints:

image

Though the console output isn't that useful: tensorflow/models#5719 (comment)

@anonym24
Author

anonym24 commented Nov 9, 2018

Solved, and it works OK after some configuration: tensorflow/models#5719 (comment)

So SSD models can be trained normally with model_main.py (it works on Windows and Ubuntu). With the legacy train.py you can get OOM (out-of-memory) errors.

@anonym24
Author

Also, by default model_main.py doesn't print any logs to your console.

Here's a way to fix it:

Add the line tf.logging.set_verbosity(tf.logging.INFO) after the imports in model_main.py.

image

image
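
To make the placement concrete, here is a small sketch that inserts the logging line after the last top-level import of a source string. It is only an illustration: the helper name `add_info_logging` is hypothetical, the example source is a made-up stand-in rather than the real file contents, and the simple line-based scan would not handle multi-line parenthesized imports. Note that `tf.logging` is the TensorFlow 1.x API; the call is only written into text here, not executed.

```python
LOG_LINE = "tf.logging.set_verbosity(tf.logging.INFO)"


def add_info_logging(source):
    """Return source with LOG_LINE inserted after the last top-level import.

    Hypothetical helper; leaves the text unchanged if the line is already there.
    """
    lines = source.splitlines()
    if LOG_LINE in lines:
        return source
    last_import = -1
    for i, line in enumerate(lines):
        # Only unindented import statements count as top-level imports.
        if line.startswith(("import ", "from ")):
            last_import = i
    lines.insert(last_import + 1, LOG_LINE)
    return "\n".join(lines) + "\n"


# Made-up stand-in for the top of the training script.
example = (
    "import tensorflow as tf\n"
    "from object_detection import model_lib\n"
    "\n"
    "FLAGS = None\n"
)
patched = add_info_logging(example)
print(patched)
```

In practice it is probably simpler to add the line by hand; the sketch just shows where it ends up.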

@dscha09

dscha09 commented Jan 8, 2019

@anonym24 How is model_main.py different from train.py?

@JanithT-Lboro

where is this model_train.py file?

@anonym24
Author

@dscha09 I don't know, but with the legacy train.py you can get OOM (out-of-memory) errors when training starts or at some point during training.

@JanithT-Lboro The file is model_main.py: https://github.com/tensorflow/models/blob/master/research/object_detection/model_main.py

@JanithT-Lboro

I was able to solve this problem by just using a PC with better specs.

@Futi7

Futi7 commented Feb 3, 2019

C:\tensorflow\models\research>python object_detection\model_main.py --pipeline_config_path=object_detection\training\ssd_mobilenet_v1_pets.config --model_dir=object_detection\images --num_train_steps=50000 --sample_1_of_n_eval_examples=1 --alsologtostderr
Traceback (most recent call last):
  File "object_detection\model_main.py", line 26, in <module>
    from object_detection import model_lib
  File "C:\Users\Fuat\AppData\Local\Programs\Python\Python35\lib\site-packages\object_detection\model_lib.py", line 27, in <module>
    from object_detection import eval_util
  File "C:\Users\Fuat\AppData\Local\Programs\Python\Python35\lib\site-packages\object_detection\eval_util.py", line 27, in <module>
    from object_detection.metrics import coco_evaluation
  File "C:\Users\Fuat\AppData\Local\Programs\Python\Python35\lib\site-packages\object_detection\metrics\coco_evaluation.py", line 22, in <module>
    from object_detection.utils import object_detection_evaluation
  File "C:\Users\Fuat\AppData\Local\Programs\Python\Python35\lib\site-packages\object_detection\utils\object_detection_evaluation.py", line 39, in <module>
    from object_detection.utils import label_map_util
  File "C:\Users\Fuat\AppData\Local\Programs\Python\Python35\lib\site-packages\object_detection\utils\label_map_util.py", line 21, in <module>
    from object_detection.protos import string_int_label_map_pb2
ImportError: cannot import name 'string_int_label_map_pb2'

I got this error. How can I fix it? I installed protobuf, compiled it, and added it to the path.
OS: Windows 10
CPU: AMD A9 7th Gen.
Python version: 3.6
Protobuf version: 3.4.0

@dienhoa

dienhoa commented Feb 7, 2019

@Futi7 I'm not sure about Windows, but I think you can solve it by running this from the research directory:
protoc object_detection/protos/*.proto --python_out=.
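
Depending on the shell and the protoc version, the `*.proto` wildcard may not be expanded on Windows. Below is a rough sketch that expands it in Python and builds one protoc call per file, plus a check for .proto files that have no generated `_pb2.py` next to them. The helper names `build_protoc_commands` and `missing_pb2` are hypothetical, and the demo runs against a temporary directory with empty placeholder files rather than a real checkout. Also note that the traceback above imports object_detection from site-packages, so the generated files may need to exist in that installed copy as well (a guess from the paths, not verified).

```python
import glob
import os
import tempfile


def build_protoc_commands(research_dir):
    """Return one protoc argument list per .proto file (wildcard expanded here)."""
    pattern = os.path.join(research_dir, "object_detection", "protos", "*.proto")
    commands = []
    for path in sorted(glob.glob(pattern)):
        # protoc is meant to be run from research_dir, so use relative paths.
        rel = os.path.relpath(path, research_dir).replace(os.sep, "/")
        commands.append(["protoc", rel, "--python_out=."])
    return commands


def missing_pb2(research_dir):
    """List .proto file names that have no generated *_pb2.py beside them."""
    protos_dir = os.path.join(research_dir, "object_detection", "protos")
    missing = []
    for path in sorted(glob.glob(os.path.join(protos_dir, "*.proto"))):
        generated = path[:-len(".proto")] + "_pb2.py"
        if not os.path.exists(generated):
            missing.append(os.path.basename(path))
    return missing


# Demo on a throwaway directory with empty placeholder .proto files.
with tempfile.TemporaryDirectory() as demo_dir:
    protos = os.path.join(demo_dir, "object_detection", "protos")
    os.makedirs(protos)
    for name in ("string_int_label_map.proto", "ssd.proto"):
        open(os.path.join(protos, name), "w").close()
    demo_commands = build_protoc_commands(demo_dir)
    demo_missing = missing_pb2(demo_dir)

print(demo_commands)
print(demo_missing)

# To compile for real (assumes protoc is on PATH):
#   import subprocess
#   for cmd in build_protoc_commands(research_dir):
#       subprocess.run(cmd, cwd=research_dir, check=True)
```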

@johnneijzen

This works great.

@EdjeElectronics
Owner

EdjeElectronics commented Sep 21, 2019

Hey @anonym24, I know it's been a year, but thanks for your work on this issue! I'm still considering whether to update my guide to use model_main.py rather than train.py. It's a bummer that it doesn't log training information in the console by default.

This issue (TensorFlow #6100) has some good information directly from a TensorFlow team member on the differences between train.py and model_main.py. From their comments, it sounds like there is no difference when using a normal PC for training.
