Python Hierarchy #13
Conversation
Thanks! So I'm mainly wondering what the bottleneck is here and whether there is a way to circumvent it.
Also, Neal3 is surely the most computationally intensive algorithm. What happens with Neal2 or BlockedGibbs?
Python::State out;
// keep the C++ and Python RNGs in sync before calling into Python
synchronize_cpp_to_py_state(bayesmix::Rng::Instance().get(), py_gen);
py::list params_py = vector_to_list(params.generic_hypers);
py::list state_py = vector_to_list(state.generic_state);
// call the user-supplied "draw" function defined in fun.py
py::list draw_py = fun.attr("draw")(state_py, params_py, py_gen);
out.generic_state = list_to_vector(draw_py);
synchronize_py_to_cpp_state(bayesmix::Rng::Instance().get(), py_gen);
return out;
Actually, I would delegate the computation of the posterior hypers entirely to the "draw" Python function, which, for the moment, should just receive the data allocated to this cluster.
Honestly, we didn't understand this comment; could you please explain it in more detail?
Right now, you have two functions in fun.py: compute_posterior_hypers and draw.
In practice, users will hardly ever want to implement conjugate hierarchies in Python. Therefore, I suggest working directly as if the hierarchy were non-conjugate, and having users implement a draw_prior and a sample_full_conditional Python function. The second one could also call draw_prior (as in this case), but this logic would live within the Python code and therefore be handled only by the user.
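A rough sketch of what the C++ side of this suggestion could look like, in the style of the snippet quoted above (sample_full_conditional and the cluster_data argument are assumptions, not existing code):

// Sketch only: delegate full-conditional sampling entirely to Python,
// passing the data currently allocated to this cluster.
Python::State sample_full_cond(const std::vector<double> &cluster_data) {
  Python::State out;
  synchronize_cpp_to_py_state(bayesmix::Rng::Instance().get(), py_gen);
  py::list state_py = vector_to_list(state.generic_state);
  py::list data_py = vector_to_list(cluster_data);
  // fun.py owns the logic: sample_full_conditional may itself call draw_prior
  py::list new_state = fun.attr("sample_full_conditional")(state_py, data_py, py_gen);
  out.generic_state = list_to_vector(new_state);
  synchronize_py_to_cpp_state(bayesmix::Rng::Instance().get(), py_gen);
  return out;
}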
py::list L;
for (auto &elem : x) {
  L.append(elem);
}
return L;
I would think there is a faster way than appending, perhaps by reserving the correct size upfront?
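For instance, pybind11's py::list can be constructed with a given size and then filled by index, avoiding repeated appends (a sketch of the idea, not the code in this PR):

// Sketch: pre-size the Python list instead of appending element by element.
py::list vector_to_list(const std::vector<double> &x) {
  py::list L(x.size());   // allocate a list of the right length up front
  for (size_t i = 0; i < x.size(); ++i) {
    L[i] = x[i];          // item assignment, no reallocation
  }
  return L;
}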
This might help: https://pybind11.readthedocs.io/en/stable/advanced/cast/stl.html
Thank you. We had changed the same thing in another function but hadn't noticed it here. Apparently it mattered a lot: the code now runs much faster with both Neal2 and Neal3, e.g. 1100 iterations in 97 seconds for Neal3.
I think that the MAKE_OPAQUE functionality in the docs should really help with the performance, then.
Also, note that you should not need to convert between py::list and std::vector: just pass the vector of doubles to the function and pybind should take care of the conversion.
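Concretely, with the automatic STL conversions this could look roughly as follows (a sketch assuming pybind11/stl.h is included; not the code in the PR):

#include <pybind11/stl.h>  // enables automatic std::vector <-> Python list conversion

// Sketch: pass the std::vector directly; pybind11 converts it to a Python
// list on the way in and converts the returned list back on the way out.
py::object draw_py = fun.attr("draw")(state.generic_state, params.generic_hypers, py_gen);
out.generic_state = draw_py.cast<std::vector<double>>();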
Let us know when you have further updates. I would still like you to run the code under a profiler (compile in debug mode, though!) to see where optimization should be focused.
Thank you. We modified the code to use the pybind11 built-in type conversions.
We will run the profiler in the upcoming days and share the results.
As for MAKE_OPAQUE, we're still in the reading phase :)
Hi,
we tried experimenting with MAKE_OPAQUE to pass the state vector by reference.
We turned std::vector into an opaque type (called DoubleVector) and managed to implement the push_back method in the corresponding class declaration. When overloading the access operator [], however, we ran into the first problem: it also worked for indices outside the vector, returning garbage values.
Moreover, since MAKE_OPAQUE deactivates the automatic conversion between Python lists and std::vector, we changed the state data structure into a generic pybind11 object, which would then be passed a DoubleVector object. However, we ran into problems passing it to Python by reference.
Finally, it seems to us that most of the pybind11 documentation deals with calling C++ from Python, not vice versa (as in our case). Can you point us to some working examples of MAKE_OPAQUE related to what we're trying to do?
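For reference, the standard opaque-type pattern from pybind11's stl_bind.h looks roughly like this (a sketch; the DoubleVector name matches the one above, while the opaque_types module name is hypothetical):

#include <pybind11/embed.h>
#include <pybind11/stl_bind.h>

// Disable the automatic (copying) list <-> vector conversion for this type...
PYBIND11_MAKE_OPAQUE(std::vector<double>);

namespace py = pybind11;

// ...and expose it with reference semantics instead. Since here C++ embeds
// Python (not the other way around), the embedded-module macro is used.
// bind_vector generates bounds-checked __getitem__, append, __len__, etc.
PYBIND11_EMBEDDED_MODULE(opaque_types, m) {
  py::bind_vector<std::vector<double>>(m, "DoubleVector");
}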
I am not an expert on this. One option would be to open an issue on the pybind11 GitHub and see if a developer answers!
Hi, after receiving an answer on the pybind11 GitHub we were finally able to make it work. However, it doesn't make the code any faster (if anything, it seems to slow it down slightly). This is probably because the vectors we are passing are extremely small.
You can find the implementation in the py_hier_opaque branch. Is it better to keep this approach (which might make the code faster if someone ever needs to pass bigger vectors) or to use the automatic copy conversions?
Ok, that's interesting!
I don't have a clear answer to that; maybe the best thing is to benchmark the two approaches on a multivariate case. We're not interested in anything fancy.
I would do a mock-up where you don't code everything properly. E.g., the lpdf evaluator might simply return the sum of the data (even if this makes no sense) and the sampler might just sample randomly from a normal distribution, without any meaning. Just to benchmark the conversion speed!
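A minimal sketch of such a mock-up, meant only to time the C++ -> Python -> C++ conversion round trip (all names and sizes here are hypothetical scaffolding, not PR code):

#include <chrono>
#include <iostream>
#include <vector>
#include <pybind11/embed.h>
#include <pybind11/stl.h>

namespace py = pybind11;

int main() {
  py::scoped_interpreter guard{};
  // Dummy "lpdf": just sums the data, so only conversion cost is measured.
  py::exec("def fake_lpdf(data):\n    return sum(data)\n");
  py::object fake_lpdf = py::globals()["fake_lpdf"];

  std::vector<double> data(1000, 1.0);  // mock multivariate datum
  auto t0 = std::chrono::steady_clock::now();
  double acc = 0.0;
  for (int i = 0; i < 10000; ++i) {
    acc += fake_lpdf(data).cast<double>();  // list conversion happens here
  }
  auto t1 = std::chrono::steady_clock::now();
  std::cout << "acc=" << acc << ", elapsed: "
            << std::chrono::duration<double>(t1 - t0).count() << " s\n";
}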
…subscript in vector/list conversions
Hello, we started working on profiling our code. We tried different profilers, but in the end the one that seemed most interpretable to us was gprof (which is also the one prof. Formaggia showed us during the course). Attached are the results we obtained; we still have to think about them, but we thought you could give us some helpful tips on how to interpret them. Thanks
So it appears that 16% of the time is spent creating pybind arrays, which is more than twice the time taken by the second most expensive function.
For the moment, I would suspend optimization (maybe open the issue on pybind11's GitHub) and move on to getting the example working via the Python interface. In the end, users should be able to sample from the model using the …
We are currently trying to run the new hierarchy from Python. So far we have encountered a lot of problems with the installation of the library. In our fork we have updated bayesmix and we are using our fork of bayesmix as a submodule. To summarize, we are getting two types of errors: …
We have been kind of stuck on this for the last few days :'(
The first error is likely our fault. I pushed a patch on pybmix/master to fix the CMakeLists. Moreover, at line 93 of the bayesmix/CMakeLists.txt file, in place of … use … This should fix the include errors. Let me know if the ninja error persists; if so, I believe we can avoid using ninja.
Seems like it doesn't solve the problem. The error can be reproduced with the latest versions of our forks. Attached is the full error log.
This is likely because you need to delete the build folder and the egg-info one. Then try to build things using cmake in the usual way:

mkdir build
…

Let me know if this works; in that case we can set aside the setup.py file for a moment and use a hacky workaround. Also, it looks like you're missing some folders inside the bayesmix submodule. This might be due to the previous installation system (?). You might try deleting your pybmix folder locally and just cloning it back.
The error I attached was produced on a fresh clone of the repo and in a new environment, to avoid any unrelated errors. I'll try the build now and let you know.
What about the output of …?
No, it compiles without any warnings or errors. Totally fine.
Then a workaround for the moment is not to build with …
To obtain the path to the build dir in a smart way, assume that the build dir is always in …
The same should work in estimators.py, which is the only other file that directly calls pybmixcpp. The main disadvantage is that this does not install the package into the system library, so you will need to work only in the root folder of pybmix. I will try to have a look at your build issues in the next few days.
This approach doesn't seem to work, because it looks like the proto files were being generated ('converted') in setup.py.
You're right. You could achieve this in dozens of ways; the simplest one that comes to mind is to have a shell script that does it, and to call it by hand, since you should not be modifying the protos at any rate.
Making a shell script is straightforward. What I'm wondering is how that two_to_three_command is even working. I have two doubts:
So I'm guessing there is something else happening before it that I don't see. Maybe it's this?
What am I missing?
So, in the cmake file we first compile the proto files into Python classes, which are written to PROTO_OUT_DIR. Basically, in the lines of code you flagged, I am converting all the proto classes from Python 2 to Python 3. Since the input and output paths coincide, 2to3 will overwrite the old classes with the new ones. So before running 2to3, you should build things with cmake so that the protobuf messages get compiled into Python classes. Maybe this is the step you're missing?
Clear now, thanks. Hopefully next time I'll bother you just to say it works :)
Here comes the good news: the very first working example of a Python hierarchy! The example can be found in …
Great news indeed!! So far, it seems like the …
Another possibility is to have a meta-programming approach where we write the …
Hi. We get all kinds of problems: sometimes with protobuf, then we fix it; then with tbb, then we fix it; then with something else. Even when we managed to compile without errors, from inside Python we couldn't import the AlgorithmWrapper (not found): core dumped. We tried to play with paths, but that didn't seem to work either. We pushed this version to a new branch called 'Cmake_hell'.
Sorry, I'm not following: when you said it was working, had you not run the example? Or by some magic was the example working and now it's broken? I tried compiling your code and it compiles fine, but I get an ImportError:
What you can do is locate the exact source of the error: does everything work on master? Does it work if you comment out enough lines of code (like all the new stuff)? Does the C++ executable you had before still work? Another thing to try is to go back to a very simple example: start from pybind11's cmake example and add a simple function (computing, say, an addition) written in Python, called from C++, and then called from Python, just as we're doing now.
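A minimal sketch of such a sanity check, assuming a hypothetical fun.py containing def add(a, b): return a + b next to the executable (none of these names exist in the repo):

#include <pybind11/embed.h>

namespace py = pybind11;

int main() {
  py::scoped_interpreter guard{};                // start the embedded interpreter
  py::module_ sys = py::module_::import("sys");
  sys.attr("path").attr("append")(".");          // make ./fun.py importable
  py::module_ fun = py::module_::import("fun");  // hypothetical test module
  int r = fun.attr("add")(2, 3).cast<int>();     // call Python from C++
  return (r == 5) ? 0 : 1;                       // non-zero exit on failure
}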
Hi, to clarify: …
Do you have any suggestions?
Just add them to the …
Well, that didn't work either. It compiles, but it can't find the AlgorithmWrapper, and it doesn't seem to be a path issue.
Ok, great! I've opened PR #14 proposing a new structure for the package. It splits the package in two and has more manageable CMakeLists.txt files. If you try it out and it works for you, then we can convert this PR to that format.
Very good news! You have to clone that branch directly: since the folder structure is different, git doesn't allow a plain checkout (this can be solved after merging). To run the examples, just run the docs/examples/test_run.py file. We propose keeping this structure, if you don't mind.
Keep the package/file structure you're most comfortable with; there is always time to change it later in any case! Once everything works, I would close this PR and open a new one from the right branch. One last thing you should make sure to check is how to deal with multiple …
We have implemented a way to decide which …
Please refer to this branch for a working example. Let us know if you like the approach and whether we should extend it to the other methods.
I'm all in for that! Especially since we can hide this from the end users and handle it in an internal function (I hope; otherwise we'll just write very good docs!!)
Hi, we've extended the approach to all the other methods (for the non-conjugate case).
Yes, exactly what I had in mind!
This is a draft implementation of the Python NNIG hierarchy. We implemented the necessary methods in Python in fun.py; these are then called from within python_hierarchy.cc, located in pybmix/pybmix/core/pybmixcpp, as was done for serialized_collector.cpp by Mario. We migrated all the model-specific methods of a hierarchy to Python (except update_hypers, at the moment).

A couple of questions/observations:
- The speed of run_mcmc is dropping significantly (well, because Python is slow): running the Neal3 algorithm with Python hierarchies and DP mixing takes 25 minutes for 500 iterations. For non-conjugate models we suppose it will be even slower, considering that sampling from the full conditionals will also be done in Python.
- We implemented clear_summary_statistics in Python, since it is kind of a model-specific method by nature; we are not sure it is necessary, given that it slows down the runtime.

To try the new Python hierarchy, you need to build and compile inside the pybmix/pybmix/core folder, where the CMake file is; it can then be run with the run_python.sh file located in the same path.

Note that we also changed some things in the bayesmix submodule, which is our fork of bayesmix. In particular, we updated the proto files for the Python hierarchy and changed the Mersenne Twister engine to the 32-bit version.

Do you have any suggestions for a more efficient implementation? If you approve of the approach, we can go on implementing more generic structures.

It is crucial for us (due to scholarship reasons) to finish the project within the summer exam session; to estimate the time and work needed to finish it, your feedback at this stage is very important. Let us know what you think about our approach and what needs to be changed.

Cheers!

P.S. The pre-commit hooks and styling/naming conventions are not followed yet, since in this draft PR we still need to understand whether the approach is accepted in general.