-
Notifications
You must be signed in to change notification settings - Fork 130
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement building with CMake #478
Conversation
…m for all of CompilerGym
Hi @KyleHerndon! Thank you for your pull request and welcome to our community. Action RequiredIn order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you. ProcessIn order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA. Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with If you have received this in error or have any questions, please contact us at cla@fb.com. Thanks! |
Hi @KyleHerndon, wow, this is quite the diff! :-) I'll have a quick skim through and leave any inline comments, though I should explain that I'm definitely no CMake expert and have limited experience in working with CMakeLists files. At this early stage I'd like to better understand what the motivation and end goals of this new CMake build system are. Here's a few scattered questions. Feel free to address any and ignore those that are not relevant :)
Cheers, |
################################################################################ | ||
# Autogenerated by build_tools/bazel_to_cmake/bazel_to_cmake.py from # | ||
# ../CompilerGym/compiler_gym/BUILD # | ||
# # | ||
# Use cmake_cmake_extra_content from iree/build_defs.oss.bzl to add arbitrary # | ||
# CMake-only content. # | ||
# # | ||
# To disable autogeneration for this file entirely, delete this header. # | ||
################################################################################ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the plan is to keep CMake as a parallel build to Bazel, then I suggest that these files not be checked-in. Of course, if replacing Bazel then that is a different matter :-)
Cheers,
Chris
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file was auto-generated, but further edits have been made on top of it.
Hi Chris, Is the goal of adding CMake to overcome a limitation of the current Bazel build system? I think our team would be willing to get CMake to feature parity if you wanted to use CMake exclusively, and we could definitely setup a script that could be used to automatically generate as much of a CMake file as possible from the BUILD files. Kyle |
Hi Kyle, Thanks for the quick response and extra context! :) That helps clear things up.
Ah I see, thanks for the explanation. So if we look at the main target of CompilerGym there are three classes of files: data, python, and binaries: (please excuse my crappy drawings 😄) At the moment, bazel is used to do the heavy lifting of coordinating, whenever any one of those bits changes, what needs rebuilding/re-testing. It looks like your goal is to move that responsibility from bazel to cmake. I am totally open to the idea of adopting a new build system if there is some significant advantages, such as making it easier to extend CompilerGym / integrate with third party libraries. However I just want to explore whether there is a lower-effort approach that you could try before putting too much work into your port :) Have you considered an alternate approach in which CMake is used to build your binary, handling all of the LLVM/MLIR side-of-things, which can then be treated as any other data file by the existing build system? Running the cmake build could be a totally ad-hoc step in the Makefile, or there is quite good support for integrating this type of use case directly in Bazel. I wonder if an approach like this could help you get something up and running without the effort having to achieve full feature parity with
If you want to purse then I think the goal should be to totally replace and remove all of the existing BUILD files. It would be worth us coming up with a couple of small example use cases that you could begin by implementing which would make it easier to check that it is on course for achieving the right goals. Cheers, |
This means that we can't use CMake tests. They must be build targets instead.
CMake is not a hermetic build system. |
Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Facebook open source project. Thanks! |
Hi Chris, Thanks for the lovely pictures, quite informative! I and others on my team have spent some time trying to get bazel working with LLVM/cmake but it's been a deep fruitless rabbit hole. I see from your bazel_llvm repo that you've had some trouble with it yourself too. I understand your concerns regarding hermiticity but I'm concerned it might not be viable while trying to get LLVM building properly from source, at least, not without some effort from LLVM to solidify their bazel build (which I don't see happening soon). As @sogartar mentioned, and I previously forgot, hermiticity is difficult to achieve in CMake, (though in practice I think you can get it looking pretty close). For this reason though, I don't think pursuing a complete replacement of bazel with CMake is likely to be desirable to you. It would be good to clarify though, would you object to having parallel build systems? A lot of compiler libraries, especially those based on C/C++ are more likely to use CMake instead of bazel, in my experience. I think parallel build systems could make the ease of extensions to CompilerGym easier in general regardless of just our use case. That said, a compromise that I think would probably match both our interests well would be to autogenerate the CMake from bazel for nearly everything, and just have some special annotations in the BUILD files or separate files that specify any additional information when absolutely necessary. I expect these cases are very small in number, besides at the top level directory and to define the special bazel to CMake analog functions. Also, I have a concrete idea of how to do this already. Regards, |
Codecov Report
@@ Coverage Diff @@
## development #478 +/- ##
===============================================
- Coverage 87.75% 87.49% -0.26%
===============================================
Files 111 111
Lines 6181 6181
===============================================
- Hits 5424 5408 -16
- Misses 757 773 +16
Continue to review full report at Codecov.
|
Hi @KyleHerndon and @sogartar, Sorry for the slow response!
No, I wouldn't object to parallel build systems. I was just suggesting whether there was a way to save you guys some effort. I will happily merge a cmake build if it enables you to do your work and makes it easier for others to extend CompilerGym. :-) Let's plan on parallel builds for now. Longer term I would be interested in seeing if we can address the hemeticity / testing issues so that cmake can fully replace bazel, that way there isn't the burden of maintaining two builds. Cheers, |
Commenting here because of iree-org/iree#7736
What issues have you run into, specifically? In https://github.com/google/iree we build LLVM with the upstream Bazel build. In fact, I was the one who went on quite a large digression from my normal work to unify the various incarnations of a Bazel bulid of LLVM upstream. I'd be happy to work with you on issues you're running into.
Again, in IREE, we generate our CMake build files from Bazel ones. This works well because we control both the Bazel and CMake side of things. I think that it has actually improved our build configurations because it adds constraints that prevent us from doing anything too creative 😛 Generating CMake from Bazel in full generality would be much trickier, I think. https://github.com/google/bazel-to-cmake started out trying to do this and didn't make it very far, which is why we went with our approach, which basically just translates BUILD files into CMakeLists.txt, where we have CMake functions that mirror Bazel rules and macros.
I think it would actually be quite difficult to do this fully in CMake. Perfect dependency tracking is really the main value proposition of Bazel (I also much prefer the language of BUILD files to CMake, but that's comparatively minor). If you have things nicely divided by directories where there are no cross dependencies, you can manually divide what gets built.
I would expect it to be much worse. Again this is Bazel's main value prop. Having used both, Bazel definitely wins on this one. CMake wins some other things that I think make it overall more practical to just build your damn project (like an actually-usable if statement), but it wasn't built with this goal and is not very good at achieving it.
Curious whether you've actually used this. I haven't tried it myself, but know some folks who did not have success with it. I'm not sure how much it's improved in recent years. Maybe we should take another look. More generally though, I've found that every time someone constructs a build-system-with-a-build-system it has ended up being something difficult to maintain that everyone hates. |
Thanks for weighing in on this @GMNGeoffrey!
I'd be fine with using a restricted subset of Bazel for CompilerGym. The important rules are
I have had some success with external CMake/Makefile rules for "simple" cases like: CompilerGym/compiler_gym/third_party/csmith/BUILD Lines 16 to 30 in 63acf99
It's far from perfect but often easier than writing a set of BUILD files for it. Before the LLVM upstream bazel support I tried (very) unsuccessfully to build LLVM as an external CMake target here. I think @kylescotshank makes a good point about CMake being widely adopted by existing compiler libraries, so I think there is a good case for CMake support aside from just the LLVM dependency. Cheers, |
If you'd like to fork our Bazel->CMake stuff and give it a try here, I'm all for that :-) Or if you want to try out making it more general. I think we don't want a general solution to be developed within the IREE repo, but perhaps if we have a few projects using similar approaches we could narrow in on a good general/maintainable trade-off. Right now we definitely do some things for expediency there that we'd probably like to continue doing without having to worry about affecting other projects (we're trying our best not to become a build-system project 😛 )
Ah yeah. I think we were looking specifically at LLVM. I suspect some less complicated projects wouldn't be too bad |
Deprecated by #498 |
Closing :-) |
For context, I believe Anush (@powderluv) has already been in discussion with you on subject of our proposed contributions of a CMake build system and implementing an MLIR environment, where CMake is a soft but extremely useful requisite for the MLIR environment.
I've put out this pull request before finishing implementation and clean up on this commit to solicit feedback and to hopefully save some effort on our part (Nod.ai). These changes are not completely ready as is.
In our development of the CMake build system for CompilerGym, we've borrowed a system from another library that can be used to automatically (partially) migrate bazel BUILD files into equivalent CMakeLists. So, one major reason we're looking for feedback before finishing is to make sure this format (copying Bazel style into CMake instead of using idiomatic CMake) will be fine as doing so will save us a lot of effort.
Additionally, if there's any patterns that you know that you'd like us to avoid, especially ones that you see us using in this pull request, please inform us so that we can implement your style requests and hopefully save ourselves from rewriting this PR once or twice.