
Thoughts on universal PKGBUILD generation #1

Open
stertingen opened this issue Jul 4, 2020 · 21 comments

@stertingen

stertingen commented Jul 4, 2020

Right now I'm pretty involved in my Diploma thesis, so I don't have much time to continue something I started a few months ago.

Based on the ideas of create_pkgbuild.py, I've played around a bit to improve the automatic PKGBUILD generation. Some of the ideas below can be found in this script: https://gist.github.com/stertingen/76b0d9e5078674eff1fa086bb91fe8dd

This script tries to generate PKGBUILD files for every package in a ROS distribution; only a few packages fail due to their own naming conventions for Git tags or other minor issues. (Example: https://github.com/stack-of-tasks/eigenpy)

Important features & changes from create_pkgbuild.py:

  • Python 3 only, for both the PKGBUILD generation and the packages themselves
  • Use the rosdistro and catkin_pkg Python tools provided by OSRF.
  • Use catkin_make_isolated instead of cmake + make to create split packages by building all packages in a repository at once.
  • Use the rosdistro cache to reduce the number of HTTP requests. The rosdistro cache contains information on all repositories and packages of a ROS distribution, including the package.xml file for each (release) package. This way, the package.xml files do not need to be downloaded individually for each package (see the sketch after this list).
  • Spawn multiple threads to generate PKGBUILD files in parallel.
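
To illustrate how those two tools fit together, reading the cache could look roughly like this (a sketch only; 'noetic' is just an example distro, and the attribute names should be double-checked against the installed rosdistro version):

# Sketch: read the rosdistro cache and parse each package.xml with catkin_pkg.
from catkin_pkg.package import parse_package_string
from rosdistro import get_cached_distribution, get_index, get_index_url

index = get_index(get_index_url())                 # rosdistro index.yaml
distro = get_cached_distribution(index, 'noetic')  # uses the pre-built distro cache

for pkg_name in sorted(distro.release_packages):
    xml = distro.get_release_package_xml(pkg_name)  # package.xml straight from the cache
    pkg = parse_package_string(xml)
    deps = sorted({d.name for d in pkg.build_depends + pkg.exec_depends})
    print(pkg.name, pkg.version, deps)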

The generation of split packages introduces some advantages over individual Arch packages per ROS package:

  • Fewer PKGBUILD files -> less maintenance effort for manual interventions (like applying patches)
  • Less build time when executing yay -S ros-noetic-desktop-full

Currently, my implementation also has some disadvantages:

  • Right now, it is ROS1-specific. To support both ROS1 and ROS2, one could consider migrating to colcon.
  • It relies on some conventions regarding version tags and packaging. As there are some repositories not following these conventions, ROS introduced release repositories (https://github.com/ros-gbp) as information/build source for the ROS build farm.

By using the ROS distro cache only, PKGBUILD generation could look like this:

  1. Use the archive specified by the release tag in the rosdistro cache.
  2. Use the version number from the rosdistro cache.
  3. Use the dependencies from the cached package.xml.
  4. Use either colcon or the build tool specified by the cached package.xml.

This PKGBUILD generator would create one PKGBUILD file for each ROS package and would rely on the distro cache only.
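
To make those steps concrete, here is a rough sketch of how the fields could be pulled out of the cache (pkgbuild_fields is a made-up helper name, and the release tag handling assumes the usual 'release/{distro}/{package}/{version}' template):

# Sketch: derive the main PKGBUILD fields for one package from the rosdistro cache only.
from catkin_pkg.package import parse_package_string

def pkgbuild_fields(distro, pkg_name, ros_distro='noetic'):
    repo_name = distro.release_packages[pkg_name].repository_name
    rel = distro.repositories[repo_name].release_repository
    upstream_version = rel.version.split('-')[0]          # step 2: '1.2.3-1' -> '1.2.3'
    # Step 1: fill the release tag template recorded in the distribution file.
    tag = rel.tags['release'].format(distro=ros_distro, package=pkg_name, version=rel.version)
    pkg = parse_package_string(distro.get_release_package_xml(pkg_name))
    depends = sorted({d.name for d in pkg.build_depends + pkg.exec_depends})  # step 3
    build_types = [e.content for e in pkg.exports if e.tagname == 'build_type']
    return {
        'pkgver': upstream_version,
        'source': f'{rel.url}#tag={tag}',                 # step 1: archive from the release tag
        'depends': depends,
        'build_type': build_types[0] if build_types else 'catkin',  # step 4
    }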

By using the ROS distro cache and the ROS release repository, split packages could become possible again:

  1. Fetch tracks.yaml from the release repository specified by the distro cache.
  2. Extract information about the source repository (URL and version tag) and use it for the PKGBUILD (see the sketch below). Alternatively, archives of the source repository are also available as releases in the release repositories.
  3. Use the dependencies/build tool from the cached package.xml.

This PKGBUILD generator would create fewer PKGBUILD files, but needs to fetch the tracks.yaml for each repository.
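
A rough sketch of steps 1-2, assuming a GitHub-hosted release repository and the field names seen in typical bloom tracks.yaml files (vcs_uri, last_version, release_tag); the URL rewriting and the tag handling are assumptions that would need checking per repository:

# Sketch: pull the source URL and version tag for one track out of a release repository.
import requests
import yaml

def source_info(release_repo_url, track='noetic'):
    base = release_repo_url.rstrip('/')
    if base.endswith('.git'):
        base = base[:-len('.git')]
    raw_url = base.replace('github.com', 'raw.githubusercontent.com') + '/master/tracks.yaml'
    tracks = yaml.safe_load(requests.get(raw_url).text)['tracks']
    entry = tracks[track]
    version = entry['last_version']                       # e.g. '1.2.3'
    # bloom records how upstream names its tags, e.g. ':{version}' or 'v:{version}'.
    tag = entry['release_tag'].replace(':{version}', version)
    return entry['vcs_uri'], tag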

As the ROS release repositories were introduced for automatic package generation, I would propose making use of them. I see the potential to provide automatically generated PKGBUILD files for all ROS 1 and ROS 2 distributions as long as they support Python 3 (no Kinetic, limited Melodic).

There are a few little things to consider:

  • In addition to rosdep, we might need to map rosdep keys to AUR packages, as the ROS maintainers would not add those to the official rosdep keys.
  • Automatic pkgrel generation/incrementing. The ROS release repositories also introduce a pkgrel-like release increment, which we might either omit or translate into something like 1.2.3r4. I would also propose some sort of two-step PKGBUILD generation, in which the actual PKGBUILD generator (step 1) is stateless and relies only on the distribution files provided by the ROS infrastructure. In step 2, the new PKGBUILD is compared to the old one and pkgrel is increased if needed (see the sketch below).
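
Step 2 could be as small as comparing the freshly generated PKGBUILD with the published one and bumping pkgrel only when something other than pkgrel changed. A minimal sketch (merge_pkgrel is a made-up name and the line-based comparison is only an illustration):

# Sketch: stateless generation (step 1) followed by a pkgrel comparison (step 2).
import re

def merge_pkgrel(old_pkgbuild, new_pkgbuild):
    def without_pkgrel(text):
        return re.sub(r'^pkgrel=.*$', '', text, flags=re.MULTILINE)

    if without_pkgrel(old_pkgbuild) == without_pkgrel(new_pkgbuild):
        # Only pkgrel could differ: keep the already published PKGBUILD.
        return old_pkgbuild

    old_ver = re.search(r'^pkgver=(.*)$', old_pkgbuild, flags=re.MULTILINE)
    new_ver = re.search(r'^pkgver=(.*)$', new_pkgbuild, flags=re.MULTILINE)
    old_rel = re.search(r'^pkgrel=(\d+)', old_pkgbuild, flags=re.MULTILINE)

    if old_ver and new_ver and old_ver.group(1) == new_ver.group(1):
        pkgrel = (int(old_rel.group(1)) if old_rel else 0) + 1  # same pkgver, changed content
    else:
        pkgrel = 1                                              # new upstream version
    return re.sub(r'^pkgrel=.*$', f'pkgrel={pkgrel}', new_pkgbuild, flags=re.MULTILINE)
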
@acxz
Contributor

acxz commented Jul 7, 2020

This topic is def something we need to work on, thanks for tackling this issue.

Python 3 only, for both the PKGBUILD generation and the packages themselves

This is not really an important change considering that the current create_pkgbuild.py works under python 3 and also uses the python3 executable.

Right now, it is ROS1-specific. To support both ROS1 and ROS2, one could consider migrating to colcon.

As of right now this is perfectly fine. While I do want to get ROS 2 working on Arch as soon as possible, most ROS users are still on ROS 1, and as such continued development on ROS1-Arch is much appreciated.

It relies on some conventions regarding version tags and packaging.

This is always gonna be the case, we will handle those as special cases.

Use the archive specified by the release tag in the rosdistro cache

I don't like this because the release tag usually points to ros-gbp. See later on why.

Use the version number from the rosdistro cache

We should use the latest one available on the source GitHub repository.

In addition to rosdep, we might need to map rosdep keys to AUR packages as the ROS maintainers would not add those to the official rosdep keys.

I think there were some ongoing conversations about this and an existing solution that hasn't been merged in yet: ros-infrastructure/rosdep#560

As the ROS release repositories were introduced for automatic package generation, I would propose making use of them.

@bionade24 @jwhendy and I went to some lengths to actually not use ros-gbp. The main reason for this was that the release of ros-gbp happens at a slower pace than upstream. This means that fixes/commits that address Arch Linux issues do not get patched into those releases. Not to mention that the patches that upstream uses have to be rewritten to work for ros-gbp releases due to the change in directory structure. Using the source releases gives us more flexibility as well as a faster turnaround in resolving issues. The closer we are to the source the better. I am pretty adamant on the decision to not use ros-gbp releases and continue using the packages' own releases.

@jwhendy

jwhendy commented Jul 7, 2020

@acxz

@bionade24 @jwhendy and I went to some lengths to actually not use ros-gbp.

No complaints from me there! Most of the scripting and ros.index integration I did was specifically to get all the PKGBUILDs updated to upstream instead of ros-gbp. Indeed, in my opinion there is no point to ros-gbp for a non-compiled release. It's waaay more of a pain to patch those outdated releases vs. dealing with the current foo-devel at the official repo.

@acxz
Contributor

acxz commented Jul 7, 2020

@jwhendy thanks for chiming in.

How have you been, man? Been a while since I've seen you around, haha.

@jwhendy

jwhendy commented Jul 7, 2020

Ha indeed! Had some mandatory furlough and vacation for COVID which I used to do a bedroom project, organize my garage, and do woodworking :) We have only recently started getting lab access again after being at home since ~mid March. I've done hardly anything with robots, which leads me to be less personally motivated on this project, unfortunately.

Also, my job has shifted more to using something we developed over the past couple years vs. getting a lot of time for development. Anyway, life is good, but indeed I'm not super plugged in here at the moment. How about you??

@acxz
Contributor

acxz commented Jul 7, 2020

Glad to hear you are staying positive in these times!

Luckily I still have my part-time job (switched completely remote), but my research lab had closed down. I moved into another apartment and tried to double down on open source stuff with the free time, specifically getting ROCm working on Arch (rocm-arch).

Our lab also recently opened back up and we are finally starting to do some hardware experiments, hopefully they go well, haha.

@bionade24
Contributor

Yes, I can only agree that using ros-gbp is a bad idea. I would rather use GH's API to get the release URLs and format them.

from github import Github

gh = Github("GH_OAUTH_KEY")          # personal access token
g = gh.get_organization("ORG_NAME")  # the organization name comes from distribution.yaml

# Releases don't seem to work for most packages somehow
for repo in g.get_repos():
    print(repo.name)
    for release in repo.get_releases():
        print(release)
        print(release.id)
        print(release.tarball_url)

# Getting tags & their URL seems to work fine, though.
for repo in g.get_repos():
    print(repo.name)
    for tag in repo.get_tags():
        print(tag)
        print(tag.tarball_url)

The organization is in the distribution.yaml, so looking up the tag by name via gh.get_organization(name).get_repo(name).get_tags() and taking its tarball_url should work.

@acxz Is rocm AMD's CUDA equivalent? Just curious.

@stertingen
Author

I see; I did not know about the issues related to GBP back then. Thanks for the feedback. This rules out using the tarballs provided by the GBP repositories.

Are there disadvantages to using the release repositories just to fetch information (i.e. tracks.yaml) on the source repositories?

@bionade24
Contributor

Are there disadvantages to using the release repositories just to fetch information (i.e. tracks.yaml) on the source repositories?

I don't see any information in there that is missing from the distribution.yaml provided by rosdistro (https://raw.githubusercontent.com/ros/rosdistro/master/melodic/distribution.yaml). And it's hard to predict the URL changes from new commits that would be needed to access the raw tracks.yaml files (I don't know if there is a non-raw way).

@acxz
Contributor

acxz commented Jul 8, 2020

@bionade24

@acxz Is rocm AMD's CUDA equivalent? Just curious.

Yep, it is AMD's high-performance GPU compute stack, equivalent to CUDA. It's also open source, and the API is very similar to CUDA's so that people already familiar with CUDA can easily pick up ROCm. There is a portability layer called HIP that mirrors the CUDA API (e.g. cudaMemcpy => hipMemcpy) and uses your native compiler, either rocclr (ROCm's) or nvcc (CUDA's), based on your system. This allows users to write code which works on both AMD and NVIDIA platforms.

Are there disadvantages to using the release repositories just to fetch information (i.e. tracks.yaml) on the source repositories?

Yeah as @bionade24 mentioned that would be the same as pulling info from rosdistro, although technically even rosdistro lags behind upstream releases right? I think we should use rosdistro to get the names (and URLs) of the ros packages for a particular ROS distro and then go to the github URL and get the latest release with github's API as @bionade24 suggested.

@stertingen
Author

I don't see any information in there that is missing from the distribution.yaml provided by rosdistro (https://raw.githubusercontent.com/ros/rosdistro/master/melodic/distribution.yaml).

The only information missing from the distribution file is the name of the version tag in the source repository, which, I admit, is usually just 1.2.3. In some cases the version tag might be something like v1.2.3, in which case reading tracks.yaml might come in handy. As mentioned below, the version number from this file might also be newer than the one from distribution.yaml.

And it's hard to predict the URL changes from new commits that would be needed to access the raw tracks.yaml files (I don't know if there is a non-raw way).

Why is that? The URL on the master branch is always https://raw.githubusercontent.com/<orga>/<repo>/master/tracks.yaml and won't change.

Yeah as @bionade24 mentioned that would be the same as pulling info from rosdistro, although technically even rosdistro lags behind upstream releases right? I think we should use rosdistro to get the names (and URLs) of the ros packages for a particular ROS distro and then go to the github URL and get the latest release with github's API as @bionade24 suggested.

Yes, rosdistro lags behind upstream releases as ROS maintainers are responsible for merging PRs with new version information. But the release repositories are maintained by the upstream package maintainers, so when there's a new upstream release available, the release repository is also updated. (http://wiki.ros.org/bloom/Tutorials/ReleaseCatkinPackage)

Does the GitHub API know which versions of each package belong to which ROS distribution? For example, the catkin package provides the 0.7.x versions to Melodic and the 0.8.x versions to Noetic, and other repositories have similar version policies. If we can use the GitHub API to query information such as 'get me the latest release for noetic', using it makes sense; otherwise I'd prefer to fall back to either the version specified by rosdistro or the tracks.yaml in the release repository.

@bionade24
Contributor

https://raw.githubusercontent.com/<orga>/<repo>/master/tracks.yaml and won't change.

Sorry, I thought there would be a commit id somewhere in it.

@jwhendy

jwhendy commented Jul 8, 2020

@bionade24 my previous experience is that tags/releases were... weird as well. I think @stertingen is alluding to this based on the comment about v1.2.3 as well. I recall running into some oddities when going to list a repo's releases/tags and not finding a universal way to go from version to the tarball.

I could be wrong, but this is my recollection. I know maintaining something like a prefix list per repo dawned on me at one point, as it's usually just something like v for the minority of repos that use a prefix.

@bionade24
Contributor

@bionade24 my previous experience is that tags/releases were... weird as well. I think @stertingen is alluding to this based on the comment about v1.2.3 as well. I recall running into some oddities when going to list a repo's releases/tags and not finding a universal way to go from version to the tarball.

I could be wrong, but this is my recollection. I know maintaining something like a prefix list per repo dawned on me at one point, as it's usually just something like v for the minority of repos that use a prefix.

If we use repo.get_tags()[0].name, we should always get the latest tag.

@stertingen
Author

If we use repo.get_tags()[0].name, we should always get the latest tag.

Which gets the latest, but might not get the right tag as the latest tag might refer to a commit for another ROS distribution (possibly even ROS 2). (See also ros-melodic-arch/ros-melodic-catkin#7)

@jwhendy

jwhendy commented Jul 9, 2020

@stertingen exactly. @bionade24 this was more like what I meant. I will try to find a specific example, but I recall extracting the version from distribution.yaml and then doing something like this (pseudo-code, not real):

for tag in repo.get_tags():
    if tag.name == version_from_distribution_yaml:
        pass  # do the magic

And that will fail, never finding it because 1.2.3 never equals v1.2.3. Again, this is from recollection, I'm ~85% sure that matching distribution.yaml to gh api for various repos was not 100% clean.

@bionade24
Contributor

Yes, rosdistro lags behind upstream releases as ROS maintainers are responsible for merging PRs with new version information. But the release repositories are maintained by the upstream package maintainers, so when there's a new upstream release available, the release repository is also updated.

My fault, I misread something there. If the release repos are up to date, then go for them.

And that will fail, never finding it because 1.2.3 never equals v1.2.3. Again, this is from recollection, I'm ~85% sure that matching distribution.yaml to gh api for various repos was not 100% clean.

We could just try the other option before failing completely if it's really never 100% exact. My skeptical thoughts were based on the wrong belief that the commit id is part of the URL to the files on GH, so there is no need to discuss this at all.
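
Such a fallback could be as simple as accepting both spellings before giving up; a sketch using PyGithub as in the snippet above (repo is a Repository object, version is the string from distribution.yaml, and find_release_tag is a made-up helper):

# Sketch: look up the tag for a given upstream version, tolerating a leading 'v'.
def find_release_tag(repo, version):
    candidates = {version, 'v' + version}
    for tag in repo.get_tags():
        if tag.name in candidates:
            return tag        # tag.tarball_url then points at the source archive
    return None               # fall back to the rosdistro / tracks.yaml information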

@bastinat0r

Hi, I also want to make something similar for ROS 2 packages. I changed the code of superflore to generate PKGBUILD files. I guess building on superflore is nice, as it provides the packages for all ROS distros (including ROS 2).
However, I am not confident enough to just push the stuff I generated to the AUR, so here comes my question:
How do you test those automatically generated packages before publishing them on the official AUR? I want to use some kind of dependency manager like yay, but I don't know which tool to use for resolving dependencies in locally built PKGBUILDs.

@acxz
Contributor

acxz commented Sep 16, 2020

You might be able to do so with aurutils, not sure tho.

@bionade24
Contributor

Hi, I also want to make something similar for ROS 2 packages. I changed the code of superflore to generate PKGBUILD files. I guess building on superflore is nice, as it provides the packages for all ROS distros (including ROS 2).
However, I am not confident enough to just push the stuff I generated to the AUR, so here comes my question:
How do you test those automatically generated packages before publishing them on the official AUR? I want to use some kind of dependency manager like yay, but I don't know which tool to use for resolving dependencies in locally built PKGBUILDs.

Just build them and check with namcap, Arch's official PKGBUILD and package checking tool. That's the same tool that aurutils uses.
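
If the generator already runs from Python, the namcap pass could be wired into the same script. A small sketch (the 'generated' directory is a placeholder for wherever the PKGBUILDs end up):

# Sketch: run namcap over every generated PKGBUILD and print its warnings.
import pathlib
import subprocess

for pkgbuild in pathlib.Path('generated').rglob('PKGBUILD'):
    result = subprocess.run(['namcap', str(pkgbuild)],
                            capture_output=True, text=True)
    if result.stdout.strip():
        print(f'--- {pkgbuild} ---')
        print(result.stdout, end='')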

@bastinat0r

bastinat0r commented Sep 22, 2020

I uploaded my modifications at https://github.com/bastinat0r/superflore in case you are interested. Currently it generates a PKGBUILD file with the correct sources for each package; however, the build/package instructions are not correct yet.
And the whole git management for the packages is not in place yet.

@bionade24
Contributor

I uploaded my modifications at https://github.com/bastinat0r/superflore in case you are interested. Currently it generates a PKGBUILD file with the correct sources for each package; however, the build/package instructions are not correct yet.
And the whole git management for the packages is not in place yet.

I looked over it. First of all, I believe superflore is a legacy monster that's hard to touch, doing weird things like using deprecated **kwargs instead of dicts and strings instead of enums. But your code is still quite large for the result and needs a bit of shrinking imo. Also, if it used a dict for the package information it would be easier to read and debug. I commented on the rest directly on the commit.

The program fails for me every time (tried superflore-gen-pkgbuilds --dry-run and with --ros-distro melodic):

Traceback (most recent call last):
  File "/home/oskar/.local/bin/superflore-gen-pkgbuilds", line 8, in <module>
    sys.exit(main())
  File "/home/oskar/.local/lib/python3.8/site-packages/superflore/generators/pkgbuild/run.py", line 173, in main
    generate_installers(
  File "/home/oskar/.local/lib/python3.8/site-packages/superflore/generate_installers.py", line 56, in generate_installers
    current, current_info, installer_name = gen_pkg_func(
  File "/home/oskar/.local/lib/python3.8/site-packages/superflore/generators/pkgbuild/gen_packages.py", line 86, in regenerate_pkg
    pkgbuild_text = current.pkgbuild_text()
  File "/home/oskar/.local/lib/python3.8/site-packages/superflore/generators/pkgbuild/gen_packages.py", line 189, in pkgbuild_text
    return self.pkgbuild.get_pkgbuild_text(org, org_license)
  File "/home/oskar/.local/lib/python3.8/site-packages/superflore/generators/pkgbuild/pkgbuild.py", line 151, in get_pkgbuild_text
    rc = self.version.split("-")[1][1:]
IndexError: list index out of range
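
For what it's worth, the failing line rc = self.version.split("-")[1][1:] assumes a version string of the form 1.2.3-r4; a guarded parse along these lines (just a sketch, not a patch against superflore) would avoid the IndexError:

# Sketch: split '1.2.3-r4' into ('1.2.3', '4'); a plain '1.2.3' yields ('1.2.3', '0').
def split_version(version):
    if '-' in version:
        upstream, suffix = version.split('-', 1)
        return upstream, suffix.lstrip('r') or '0'
    return version, '0'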

Additionally, it generates a lot of ebuilds instead of pkgbuilds.
