Implement generic object detector #601

Closed
wants to merge 33 commits into from

kloudkl
Contributor

@kloudkl kloudkl commented Jul 3, 2014

This provides a common interface for different candidate object region proposal algorithms. Together with #560, a pure C++ object detector consisting of a high-speed region selector, feature extractor, classifier, and region merger would become a reality (#548).
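
A minimal sketch of what such a common interface might look like. Every name here (`Region`, `Image`, `RegionProposer`, `SlidingWindowProposer`, `CountProposals`) is hypothetical and chosen for illustration; none is taken from this PR or from Caffe. The sliding-window class is only a stand-in for a real proposal algorithm such as BING.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical types for illustration; not the PR's actual classes.
struct Region {
  int x, y, width, height;  // top-left corner and size, in pixels
  float score;              // proposal confidence (higher is better)
};

struct Image {
  int width, height;  // pixel data omitted to keep the sketch short
};

// Common interface: every proposal algorithm maps an image to candidates.
class RegionProposer {
 public:
  virtual ~RegionProposer() = default;
  virtual std::vector<Region> Propose(const Image& image) const = 0;
};

// Stand-in implementation: a dense sliding window with a fixed size.
// A BING-based proposer would implement the same method.
class SlidingWindowProposer : public RegionProposer {
 public:
  SlidingWindowProposer(int window, int stride)
      : window_(window), stride_(stride) {
    assert(window > 0 && stride > 0);
  }

  std::vector<Region> Propose(const Image& image) const override {
    std::vector<Region> regions;
    for (int y = 0; y + window_ <= image.height; y += stride_) {
      for (int x = 0; x + window_ <= image.width; x += stride_) {
        regions.push_back({x, y, window_, window_, 1.0f});
      }
    }
    return regions;
  }

 private:
  int window_;
  int stride_;
};

// Downstream stages depend only on the abstract interface.
std::size_t CountProposals(const RegionProposer& proposer,
                           const Image& image) {
  return proposer.Propose(image).size();
}
```

With this shape, the feature extractor, classifier, and region merger would only ever see a `std::vector<Region>`, so swapping the proposal algorithm would not touch the rest of the pipeline.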

@bhack, please have a look. Where is the latest BING ported to OpenCV?

@bhack
Contributor

bhack commented Jul 3, 2014

@kloudkl https://github.com/fpuja/opencv_contrib/tree/saliencyModuleDevelop
There will also be video saliency region candidates from motion in the module before GSoC ends. It is a saliency API (static, objectness, motion).

cc: @fpuja @lenlen

@kloudkl
Contributor Author

kloudkl commented Jul 4, 2014

Thanks for your porting!

Are you confident that your module will be merged into the official OpenCV repository? If so, when at the earliest? Until the API appears in a release of OpenCV, I'm afraid it is only appropriate for use in one's internal applications. Anyway, this PR will finish a baseline object detector that can easily be extended to use more advanced algorithms in any phase of the pipeline.

@bhack
Contributor

bhack commented Jul 4, 2014

It is not so easy to answer your question. In OpenCV 3.0, opencv-contrib will become an official part of the project, but I don't know what the package maintainers' policy will be in Linux distros.

@vpisarev can you give us some feedback on this outlook?

@vpisarev

vpisarev commented Jul 4, 2014

You are welcome to contribute new functionality to OpenCV, though not to the main OpenCV repository but to the contrib repository, for which we now have more or less automatic testing: http://pullrequest.opencv.org/#/summary/contrib. The module must follow our (so far implicit) guidelines, i.e., have the same directory structure as other modules, use CMake as the build system, use RST/Sphinx for documentation, etc. There should also be some commitment to support the module for at least several months, i.e., you will be assigned all the bugs that users report. As soon as you submit a pull request, we can evaluate it.

@bhack
Contributor

bhack commented Jul 4, 2014

@vpisarev I think that @kloudkl is interested in knowing whether the OpenCV GSoC projects will be merged into opencv-contrib and whether opencv-contrib itself will have a new "official" distribution policy (tar.gz, distro packages, etc.).

@vpisarev

vpisarev commented Jul 4, 2014

All the GSoC results will be put into the opencv_contrib repository, following the same guidelines outlined above and at http://code.opencv.org/projects/opencv/wiki/How_to_contribute.
There is no distribution policy that contributors need to care about. The whole opencv_contrib can be downloaded as a .zip directly from GitHub and then built using the standard OpenCV build system. Binary packages for opencv_contrib will likely be prepared by the OpenCV 3.0 beta (around September this year). The Itseez team will take care of this, but we certainly appreciate any help. In any case, the opencv_contrib packaging issue will be solved as a whole; there is no need to invent something new for each contributed module.

@kloudkl
Contributor Author

kloudkl commented Jul 5, 2014

@vpisarev, thanks for your detailed answers! Your explanation made me think that there could also be a caffe_contrib repository accompanying the main Caffe project in the future. Functionality that depends on additional third-party libraries, such as the BING module in OpenCV, could be placed there.

@kloudkl
Contributor Author

kloudkl commented Jul 5, 2014

TODO:

  1. Test the non-maximum suppression regions merger
  2. Complete the generic CNN object detector including tests
  3. Train and test an object detection model on a public dataset with example training and testing network proto
  4. Performance evaluation
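
For reference on item 1, a minimal sketch of greedy non-maximum suppression, the usual way to merge overlapping detections. This is the textbook greedy variant, not necessarily what this PR implements; the `Box` struct, the function names, and the intersection-over-union threshold parameter are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical box type: corners with x1 <= x2 and y1 <= y2, plus a score.
struct Box {
  float x1, y1, x2, y2;
  float score;
};

// Intersection over union of two axis-aligned boxes.
float IoU(const Box& a, const Box& b) {
  const float iw = std::min(a.x2, b.x2) - std::max(a.x1, b.x1);
  const float ih = std::min(a.y2, b.y2) - std::max(a.y1, b.y1);
  if (iw <= 0.0f || ih <= 0.0f) return 0.0f;  // no overlap
  const float inter = iw * ih;
  const float area_a = (a.x2 - a.x1) * (a.y2 - a.y1);
  const float area_b = (b.x2 - b.x1) * (b.y2 - b.y1);
  return inter / (area_a + area_b - inter);
}

// Greedy NMS: visit boxes in descending score order and keep a box only if
// it does not overlap an already kept box by more than iou_threshold.
std::vector<Box> NonMaxSuppression(std::vector<Box> boxes,
                                   float iou_threshold) {
  assert(iou_threshold >= 0.0f && iou_threshold <= 1.0f);
  std::sort(boxes.begin(), boxes.end(),
            [](const Box& a, const Box& b) { return a.score > b.score; });
  std::vector<Box> kept;
  for (const Box& candidate : boxes) {
    bool suppressed = false;
    for (const Box& k : kept) {
      if (IoU(candidate, k) > iou_threshold) {
        suppressed = true;
        break;
      }
    }
    if (!suppressed) kept.push_back(candidate);
  }
  return kept;
}
```

A test for the merger could feed it two heavily overlapping boxes plus one distant box and check that only the higher-scoring overlapping box and the distant box survive.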

@ronghanghu
Member

@kloudkl It seems that this PR is aligned with Rectangular Pooling #614, so that we can implement a spatial pyramid pooling detector mentioned in http://arxiv.org/pdf/1406.4729v1.pdf

@kloudkl kloudkl changed the title from "Implement regions of interest generator for object detection" to "Implement generic object detector" Jul 5, 2014
@kloudkl
Contributor Author

kloudkl commented Jul 5, 2014

@ronghanghu, how are they aligned in your opinion? I would rather say Rectangular Pooling and #560 naturally work together.

@ronghanghu
Member

@kloudkl Yes, you are right. I mean rectangular pooling #614 can be used in spatial pyramid pooling #560, so that a spatial pyramid pooling detector can be implemented using this PR.

@kloudkl
Contributor Author

kloudkl commented Jul 12, 2014

It is really hard to manage a PR that depends on so many others. To build and test this PR, multiple PRs, including #560 and #558, have to be mixed together. After all the features are completely tested and benchmarked, the commits that belong to this PR will be cherry-picked.

@kloudkl
Contributor Author

kloudkl commented Jul 12, 2014

In order to speed up feature extraction for the thousands of regions that may contain objects, the spatial pyramid pooling (SPP) layer directly pools the non-square regions of the feature maps. The most difficult part of the problem is that the layers used by the SPP layer, i.e. the split, pooling, flatten and concat layers, must all support rectangular input blobs.
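
To illustrate the idea, here is a minimal sketch of pooling a rectangular region of a feature map into a fixed-length vector. It is not the PR's layer code: it works on a single-channel, row-major `std::vector<float>` rather than a Caffe blob, the function names are hypothetical, and the floor/ceil bin boundaries are one common choice assumed here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Max-pools the rectangle [x0, x1) x [y0, y1) of a row-major feature map
// into a bins x bins grid, whatever the rectangle's aspect ratio.
std::vector<float> PoolRegion(const std::vector<float>& map, int map_w,
                              int map_h, int x0, int y0, int x1, int y1,
                              int bins) {
  assert(bins > 0);
  assert(0 <= x0 && x0 < x1 && x1 <= map_w);
  assert(0 <= y0 && y0 < y1 && y1 <= map_h);
  assert(map.size() == static_cast<std::size_t>(map_w) * map_h);
  const int w = x1 - x0;
  const int h = y1 - y0;
  std::vector<float> out;
  out.reserve(static_cast<std::size_t>(bins) * bins);
  for (int by = 0; by < bins; ++by) {
    // Bin rows: floor for the start, ceil for the end, so no bin is empty.
    const int ys = y0 + (by * h) / bins;
    const int ye = y0 + ((by + 1) * h + bins - 1) / bins;
    for (int bx = 0; bx < bins; ++bx) {
      const int xs = x0 + (bx * w) / bins;
      const int xe = x0 + ((bx + 1) * w + bins - 1) / bins;
      float best = map[static_cast<std::size_t>(ys) * map_w + xs];
      for (int y = ys; y < ye; ++y) {
        for (int x = xs; x < xe; ++x) {
          best = std::max(best, map[static_cast<std::size_t>(y) * map_w + x]);
        }
      }
      out.push_back(best);
    }
  }
  return out;
}

// Concatenates the pooled grids of every pyramid level, so the output
// length depends only on the levels and not on the region size.
std::vector<float> SpatialPyramidPool(const std::vector<float>& map,
                                      int map_w, int map_h, int x0, int y0,
                                      int x1, int y1,
                                      const std::vector<int>& levels) {
  std::vector<float> out;
  for (int bins : levels) {
    const std::vector<float> level =
        PoolRegion(map, map_w, map_h, x0, y0, x1, y1, bins);
    out.insert(out.end(), level.begin(), level.end());
  }
  return out;
}
```

With levels `{1, 2, 4}`, every region yields 1 + 4 + 16 = 21 values per channel, which is why the convolutional feature maps can be computed once per image and then pooled per region.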

@bhack
Contributor

bhack commented Jul 16, 2014

You can follow the BING PR here: opencv/opencv_contrib#39

@bhack
Contributor

bhack commented Sep 6, 2014

@kloudkl BING objectness is merged now. You can find the docs here: http://docs.opencv.org/trunk/modules/saliency/doc/saliency.html

@shelhamer
Member

There are useful ideas and excerpts in this PR, but the agglomeration is an uncomfortable mix of changes and concerns. Scope is at issue too: there are myriad ways to fit Caffe into a detector pipeline in one's own fork, but for inclusion in the main project it should be unobtrusive with respect to other tasks and generally useful for detection itself. My worry is that this is a somewhat individual effort for different kinds of detectors and best handled locally or by scripting, but that's merely my impression. Don't let that discourage proposals for detection pipeline PRs!

Closing for these reasons and because the originating fork was cancelled and the contributions in this branch are mostly either in-progress in other PRs or out-of-date.

@shelhamer shelhamer closed this Oct 6, 2014
5 participants