Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

VROOM python wrapper #634

Closed
jonathf opened this issue Dec 26, 2021 · 18 comments
Closed

VROOM python wrapper #634

jonathf opened this issue Dec 26, 2021 · 18 comments

Comments

@jonathf
Copy link
Collaborator

jonathf commented Dec 26, 2021

Hello there!

I've built a basic wrapper to VROOM in Python:
https://github.com/jonathf/pyvroom

I made it partially because I wanted to get more familiar with the VROOM code base, partially to improve on my own pybind11 skills, but also because I want to make some statistical experiments with vroom, and I'm likely much faster to do so in Python.

Currently I am using a fork of this repo where I have added some minimal patches to improve the Python interface. Preferably from my point of view, these patches should live upstream here in this repo. Would you guys be open to discussing improving the VROOMS interface to Python?

@nilsnolde
Copy link
Contributor

having done similar bindings in the past for related projects (mostly routing engines), gotta say: kudos! really clean and nice! had a similar goal in mind for early this year, to be able to easily use vroom in a QGIS plugin, pretty sure I'd have chosen the lazy way and just expose one high-level function accepting basically the JSON API's parameters 😅

just skimmed for 5 mins but learnt more than in days of suffering through python packaging myself (and some neat pybind11 code!).. did it work cross-platform already?

@jonathf
Copy link
Collaborator Author

jonathf commented Dec 29, 2021

Thank you for the kind words. It helps a lot that the VROOM is clean and nice as well. :)

For this iteration I put the effort in on making Linux install work smooth with bundled binaries on Python 3.7-3.10. Getting from here to MacOS should be easy enough I think. I'm unsure how complicated it will be to get the builder to support Windows.

@nilsnolde
Copy link
Contributor

the manylinux images are definitely a godsend:) I'm really suprised though that you can easily build the vroom lib with the manylinux2014 tag, which seems to have gcc4.8 and thus zero c++17 compatibility (and as I remember vroom makes pretty heavy use of c++17 features like std::optional etc).. I only built with the manylinux_x_y tag which runs debian 9 and gcc 6, mostly for other dependency versions but also bcs of gcc.

anyways, sorry for the spamming of this issue, fairly unrelated stuff.. if you're up for it, I'll gladly help with win integration and support of the bindings in general, I'll make use of them as well soon and I'd need win compatibility. we can discuss more in your repo.

@jonathf
Copy link
Collaborator Author

jonathf commented Dec 29, 2021

Yes, indeed. I had to dig a bit, manylinux2014 is actually the only one variant of manylinux that fully supports c++17 currently:
pypa/manylinux#1012

And thanks for the offer, it would be much appriciated. I'll tag you in the appropriate issue.

@jcoupey
Copy link
Collaborator

jcoupey commented Dec 29, 2021

Thanks @jonathf for working on python bindings and sharing your work! Considering that a lot of people are more comfortable with python than with C++, I think that this brings value for the project overall. So yes I'd be inclined to discuss ways to help upstream.

I'm definitely not an expert with this kind of stuff but @nilsnolde somehow already endorsed your work and offered to contribute. So taking the thinking a step further, I think we should host the python bindings in the scope of this organization. We already did this with Docker images: instead of having images provided by downstream users under other organizations, we now have our own vroom-docker repo (which @nilsnolde is maintaining). This brings a lot of clarity and ease of use for all users.

If you're OK with the idea, I can setup a dedicated repo within the VROOM organization and provide you access so we can start working on this. Again, having everything under the same organization is IMHO the best way to join forces and build a wider community of users.

@nilsnolde
Copy link
Contributor

nilsnolde commented Dec 29, 2021

happy you suggest it, I was hesitant;) but no stress @jonathf in case you want to keep it in your space, plenty of reasons for that. imho and experience, libraries are better received if they're in the official org of the project. I personally trust those sources more than "some" third party lib. (well yeah, just repeated what julien said haha)

@jonathf
Copy link
Collaborator Author

jonathf commented Dec 29, 2021

Interesting proposal. I am not opposed to moving the project to have it under the VROOM-Project umbrella. Before I decide, I have a couple of questions:

  1. What is the current governance model for VROOM-Project and each of its repos?
  2. Will there be an expectation to keep the Python interface as close to the C++ source as possible? Larger deviation won't happen as it will result in maintainence hell, but smaller stuff that I think will work better might.

@nilsnolde
Copy link
Contributor

just my 2c, how I handle the docker repo:

I always check in with julien (assign him as reviewer) when I want to do non-trivial changes. and I try to document them well in commit messages & PRs, so that someone else wouldn't have a hard time to take over in worst-case, figuring out why I did what. I do keep very close to vroom's release schedule, which happens 3-4 times a year and literally takes 5 mins to upgrade the docker repo to a new version. probably it'll be a bit more work for the bindings, but I gladly help maintaining it. it's so clean, it just took a few minutes to understand what you're doing.

IMO the bindings can offer some more convenience that's just hard/verbose to handle in the c++ source . and as you said we'll have a high interest to not deviate too much. if some things are missing, like plan mode which requires another linked lib (hard on win), that's totally fine I think.

in any case, eventually we could have a sphinx documentation on readthedocs for the bindings. that'd be super helpful. in case you didn't try that before, it's pretty easy to set up (never tried with pybind11 but seems well integrated too).

@jonathf
Copy link
Collaborator Author

jonathf commented Dec 30, 2021

Thank you for your 2C.

I like your enthusiasm. :)
Yes, I also see the potential for where this is going.

I have a lot of experience with Sphinx and yes I plan to use it here for this project as well. This though is another topic for discussion. The Sphinx docs is likely going to be a lot bigger than the current documentation for vroom. Personally I am good at writing lots of docs quickly, but they can be quite rough in the edges. If we move this under the same umbrella you guys are either commiting to my quick-and-dirty docs, or more of the organisation needs to contribute to polishing.

As you said, this is a fairly clean wrapper code wise, so maintaining the code later if I later decide to move on should be trivial. And I highly doubt version updating will take much of my time as soon as we get to a place where we are feature complete.

Perhaps a detail, but somewhat important to me:
Is it possible create e.g. a VROOM-Project Github organization page which include a project overview, a little history outlined and list the maintainers and their responsibilities?

@nilsnolde
Copy link
Contributor

Is it possible create e.g. a VROOM-Project Github organization page

a good idea IMO, proof it's growing in community:)

@jcoupey
Copy link
Collaborator

jcoupey commented Dec 31, 2021

Thanks @jonathf for considering the idea. On your questions:

What is the current governance model for VROOM-Project and each of its repos?

I'm the most active user with full admin rights so TL;DR is I'm deciding what goes in, especially on the core stuff. Now in practice this is repo-dependent:

  • I'm maintaining the core vroom repo: answering questions, triaging issues, reviewing PR. I've been designing the solving approach together with @cvarnier and @nicodjeanmarc but I actually wrote most of the codebase and I'm still implementing most of the new features. I'm defining the priorities and roadmap based on the needs of our company and clients. We're lucky though to get (more and more) contributions on both features/enhancement and technical aspects like CI, build etc. Discussions on what changes are welcome usually happen in issues or PR and the process has been pretty smooth so far.
  • vroom-frontend is basically a static demo with not many changes over time
  • vroom-scripts is only interesting for those doing dev or benchmarking and does not get much traction either (it's mostly me making some dev scripts available)
  • vroom-express is slowly evolving to keep in sync with the upstream API and updating dependencies once in a while so there is not much to discuss there
  • vroom-docker is maintained by @nilsnolde. I have no specific knowledge of Docker and I'm not a user of the images myself so I leave it up to him to decide on how to best organize the repo and overall process.

Providing python bindings here would be quite similar to vroom-docker in the spirit. You could maintain the repo along with @nilsnolde since he also stepped forward and you two seem to have further ideas on the matter.

Will there be an expectation to keep the Python interface as close to the C++ source as possible?

I don't have any specific expectations since I would not be using the bindings myself. My only concern would be that things are done in a clean way, that is easy to use and maintain, including with upstream changes. From the above discussion, I'm not too worried on that.

Larger deviation won't happen as it will result in maintainence hell, but smaller stuff that I think will work better might.

That's close to what I call clean and easy to maintain. ;-)

The GitHub organization page is definitely a good idea, especially if we're adding yet another repo to the list...

@jonathf
Copy link
Collaborator Author

jonathf commented Jan 1, 2022

Sounds like I'm better of inside the organization than outside. I am therefore willing to move my repo.

BTW, since this company backed, does that mean the repos in the organization benefits from above free tier CI/CD?

@jcoupey
Copy link
Collaborator

jcoupey commented Jan 3, 2022

Great, I'll handle the repo setup in the next few days!

The company backing I mentioned is about allowing me to spend time developing VROOM as open-source and doing maintenance work on the project (and hosting the demo server). We use GitHub Actions for CI in the core repo (switched from Travis one year ago in #436) and this is free so far...

@jonathf
Copy link
Collaborator Author

jonathf commented Jan 4, 2022

Because of Python binaries has to be built for every supported Python version (3.7, 3.8, 3.9, 3.10), for every platform (Linux, MacOS, Window), every architechture (Typically 2 per platform), building Python wheels is going to be fairly resource demanding. Current full Github action pipeline (which still misses some targets) uses a little less tan 40 minutes (though only 15 min parallelized).

Github action budget is generous with 2000 min/month free, but in the begining while the codebase is being actively developed, I'll likely use over half of that. When the code becomes more mature, this will likely be much less of a problem of course.

If this is going to be an issue, perhaps waiting to move the codebase a bit is something to consider.

@jcoupey
Copy link
Collaborator

jcoupey commented Jan 4, 2022

Maybe it's possible to reduce the number of targets (even if only temporarily). Building across a huge variety of targets can result in a waste of resources in a dev period with many changes.

For example we don't even have CI builds for MacOs an Windows in our core GA workflow. Also we used to test two different versions for both GCC and clang and I don't recall we encountered any problem after downgrading to a single version per compiler (removing the oldest one).

Anyway no problem to wait a bit if this can be a bottleneck, you tell me!

@jonathf
Copy link
Collaborator Author

jonathf commented Jan 4, 2022

I think what you are suggesting is a good way to go i this intial faze. We'll reduce the number of build targets to perhaps 4-5 (or something) and scale up again afterwards.

@jcoupey
Copy link
Collaborator

jcoupey commented Jan 6, 2022

The rationale behind choosing the setup/versions in our core workflows is to match what you get on a typical Ubuntu LTS, because it's a widely used setup.

In our case, that would even bring the targets close to 1 (Linux with python 3.8). ;-)

@jcoupey
Copy link
Collaborator

jcoupey commented Jan 6, 2022

I just created the pyvroom repo and invited you both as maintainers. I'll close this ticket since we can now move all related topics there.

@jonathf no rush here, please do things at your own pace and only populate the repo when you feel it's a good time for you.

@jcoupey jcoupey closed this as completed Jan 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants