Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

External Python dependencies #1447

Closed
svartkanin opened this issue Aug 30, 2022 · 8 comments
Closed

External Python dependencies #1447

svartkanin opened this issue Aug 30, 2022 · 8 comments
Labels
POLL A change waiting for traction/discussion

Comments

@svartkanin
Copy link
Collaborator

I'm putting this here as a discussion point and/or exploration if it's possible at all.

Would there be a way for us to use external dependencies?
I don't know how the current release loop works or what the blockers/challenges are to accomplish this.

I'm not keen on pulling in heaps of external dependencies but things like pyparted might be extremely useful and would probably prevent a lot of bugs from reinventing the wheel :)

@Torxed
Copy link
Member

Torxed commented Aug 30, 2022

<background-rant>
I'm quite sure that if we dig through the archives of comments I have historically been quite against this.
The main reason for not doing it earlier was that archinstall was more or less a single-file library that could be downloaded with wget and be executed. I later transitioned to compiling it with nuitka3 instead. But this is no longer the state of archinstall.. It's now a official application/library that users don't have to source themselves.
</background-rant>

I have some reservations still, but they're minor:

  1. The library must be well maintained and have a large traction (pyparted is a good example, I think.. Even tho it was 2 years between the last two releases heh). This minimizes the risk of sudden abandonment and deprecation.
  2. It must be a Arch Linux package, this is to simplify packaging for upstream package
  3. Preferably the library should evict/replace large chunks of our own inventions (this is to avoid npm hell where we have large amount of dependencies for very small tasks)
  4. The library should be either extremely small (to avoid ISO bloating) or we should use a lot of the library's potential (to justify the somewhat large library)
  5. (Bonus point) it would be nice if the library's introduction was easy to revert in case of point 1 or 4

If we go down this route, I suspect it might be a perfect opportunity to do large scale rewrites. As a lot of what archinstall both suffers from, but does ok and has a lot of helper functions for - is disk management. pyparted does a lot of it better for sure. But that would mean a lot of rewriting code and thus perhaps allow for a v2.6.0 or even v3.0 release if we bunch up a lot of other rewriting (that would make quality of code a lot better.. We still have a lot of legacy code from like v1.0 that is far from best practice sadly).

Any other issues we can probably manage, either with build hooks or something else.

So yea, tl;dr I'm in favor these days with careful consideration on the introduction :)

@Torxed Torxed added the POLL A change waiting for traction/discussion label Aug 30, 2022
@svartkanin
Copy link
Collaborator Author

I do agree on that only libraries that give a lot of benefit should be included.

The library must be well maintained and have a large traction (pyparted is a good example, I think.. Even tho it was 2 years between the last two releases heh). This minimizes the risk of sudden abandonment and deprecation.

I think it's good to know that the library is still maintained, and probably should be in case of pyparted. I think it would be a case by case decision to make.

It must be a Arch Linux package, this is to simplify packaging for upstream package

Here's where my knowledge about packaging is quite blurry so any explanation would be appreciated :) Most python packages will be published in the pypi repository, how is that impacting things? I was under the impression that when the ISO is build it would pull the code from github and then just run the python setup.py install somehow? Meaning that dependencies in requirements.txt for example would just be pulled in during the build?

The library should be either extremely small (to avoid ISO bloating) or we should use a lot of the library's potential (to justify the somewhat large library)

Happy to do some analysis on this one to see the differences when including dependencies

Bonus point) it would be nice if the library's introduction was easy to revert in case of point 1 or 4

Essentially this means that we're implementing the library functionality (at the least the one relevant) ourselves, which can maybe even extracted from the library if shit hits the fan :D

@Torxed
Copy link
Member

Torxed commented Sep 2, 2022

I think it would be a case by case decision to make.

I agree, this is a reasonable effort from all parties.
My hope is that people don't take it personal when we eventually reject a PR introducing a dependency.

so any explanation would be appreciated :)

I am sure that I'm probably the wrong person to give a technical explanation of this as I've mostly made mistakes trying to package stuff in AUR. I was young and naive. But from my mistakes the things that I've picked up is:

There's depends that simply makes sure that the libraries we use will be installed on the system:

depends=(python)

The packages won't actually be contained within the package with this approach however.
You could surely solve the problem by running pip install --target=${target} during packaging making sure that the libraries gets installed to the appropriate "overlay" structure that the finished Arch package contains and then later unwrapped onto the user system during installation.

However, that would cause issues when users later try to do pacman -S python-pyparted as it will complain that files already exists on the target destination.
So the real solution is to simply state that this package needs our libraries and then let pacman internally install everything and keep track of conflicts by not having two packages update the same files.

Also, another benefit would be automatic updates when doing system upgrades.
In our case that would 99% of the time be irrelevant since it's mostly executed in an ISO environment.. But you know, there's a few of us that run this on live instances too ^^

Again, I'm a noob on this area and I'm sure @grazzolini or @dvzrv for instance (or really anyone packaging stuff) could explain way better, as I'm currently learning a lot from them myself.

Happy to do some analysis on this one to see the differences when including dependencies

I'd bet my cat, who I hold very dearly, that it's going to be a drop in the ocean 99% of the time. As long as we don't start pulling in TensorFlow stuff that is 500+MB we should be fine.

Essentially this means that we're implementing the library functionality (at the least the one relevant) ourselves, which can maybe even extracted from the library if shit hits the fan :D

We could also create wrapper functions that we call, that way we can just change the central places connecting to the libraries.
Kinda what we do already so yea, just keep it up and we should be fine ^^ Avoid using the libs inside Installer() for instance hehe.

@svartkanin
Copy link
Collaborator Author

@Torxed I had a look at this, and I wonder if it's actually simpler than doing things in the PKGBUILD file by simply adding it in the setup.cfg

[options]
packages = find:
python_requires = >= 3.8
install_requires =
    pyparted==3.12.0

this worked just fine for me when doing a python setup.py install or building the wheel first and then installing that one. The required dependencies will be included in the wheel already I believe.

As I understood the depends in the PKGBUILD is rather for actual pacman package dependencies rather than python packages, but I might have misinterpreted that.

So looking at the PKGBUILD there are two sections

build() {
  cd $pkgname-$pkgver
  python -m build --wheel --no-isolation
  PYTHONDONTWRITEBYTECODE=1 make man -C docs
}

package() {
  cd "$pkgname-$pkgver"
  python -m installer --destdir="$pkgdir" dist/*.whl
  install -vDm 644 docs/_build/man/archinstall.1 -t "$pkgdir/usr/share/man/man1/"
}

which indicate that the wheel is build and then installed. The python package dependencies if specified in the setup.cfg will then be included in the wheel.

So we might be good with simply adding the dependencies to the setup.cfg file :)

@Torxed
Copy link
Member

Torxed commented Sep 13, 2022

That will work but assume that someone does pacman -S python-pyparted after installing archinstall. Then pacman will fail I believe?

I haven't tested it myself so It's just a theory :)

@svartkanin
Copy link
Collaborator Author

@Torxed sorry for the late response, live got in the way :)
python-pyparted isn't an arch package, meaning it can't be installed with pacman. To other packages I don't think they will cause conflicts either, as they would be installed in two different locations.

If I run pip install tabulate it will be installed into ~/.local/bin/tabulate whereas if I'm installing it with pacman -S python-tabulate it will create an entry in /usr/bin/tabulate.

I've raised a first PR to introduce the external dependency support, it can be used as a starting point and to see if this would be right way to do it #1478 :)
looking forward for any feedback!

@Torxed
Copy link
Member

Torxed commented Feb 27, 2023

I've raised the question with a packager to see how easy it would be to adopt aur/python-pyparted into extra or something similar.

It would allow us to do #1478 but also introduce it in the PKGBUILD for proper packaging all the way :) It's mainly to avoid multiple packages owning the same files on disk without explicitly declaring it.

@svartkanin
Copy link
Collaborator Author

I would consider this as resolved as we're using dependencies now

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
POLL A change waiting for traction/discussion
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants