System level package managers #1420

Closed
7 tasks
andrew opened this issue May 10, 2017 · 7 comments

Comments

@andrew
Contributor

andrew commented May 10, 2017

Until now Libraries.io has focused primarily on supporting application level package managers, but we also want to be able to index what I've been calling "system level package managers": things that aren't usually described as direct dependencies of an application but are required for it to be installed and run on a given operating system.

There are quite a few of them, mostly Linux based (there's a good list on Wikipedia), but there are a couple on Windows and macOS too that we'll want to support: https://en.wikipedia.org/wiki/List_of_software_package_management_systems#Linux

We have some basic support for Homebrew already, but it needs improvement (support for formulas not in the standard Homebrew repo, and for Brewfiles).

N.B. we're still focused on package managers that install libraries rather than applications for now, e.g. openssl and libxml rather than desktop applications installed via package managers.

Initial things involved in making this happen:

  • Deciding on which system level package managers to aim to support initially, i.e. the most popular/biggest ones
  • Investigating how we can get the data out and keep track of updates for each system package manager
  • Making changes to support multiple registries for a given package manager, e.g. Debian stable, Debian unstable, Launchpad (this also needs to be done to improve Maven support)
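
As a rough illustration of the "multiple registries per package manager" item, here is a minimal sketch of one possible data model. The `Registry` and `PackageIndex` names, the registry labels, and the version strings are all hypothetical placeholders, not part of the Libraries.io codebase.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Registry:
    """One package source served by a package manager (hypothetical model)."""
    manager: str  # e.g. "apt"
    name: str     # e.g. "debian/stable"


@dataclass
class PackageIndex:
    """Maps (registry, package name) to the versions seen in that registry."""
    versions: dict = field(default_factory=dict)

    def add(self, registry: Registry, package: str, version: str) -> None:
        self.versions.setdefault((registry, package), set()).add(version)

    def registries_for(self, manager: str, package: str) -> list:
        """Return the names of every registry of `manager` carrying `package`."""
        return sorted(
            reg.name
            for (reg, pkg) in self.versions
            if reg.manager == manager and pkg == package
        )


index = PackageIndex()
# Version strings below are placeholders, not real Debian versions.
index.add(Registry("apt", "debian/stable"), "libxml2", "1.0-example")
index.add(Registry("apt", "debian/unstable"), "libxml2", "2.0-example")
print(index.registries_for("apt", "libxml2"))
```

Keying on the registry rather than the package manager means the same package name can carry different versions in stable and unstable without the entries colliding.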

Extra things that will make this really useful:

  • Supporting arbitrary git/svn/etc repositories, since most system level packages do not have their repositories hosted on GitHub
  • Adding support for indexing puppet/chef/ansible/salt configs that give us a connection between an application and its system packages
  • Bridging the gap between application level packages and system level packages by building a system that can codify the loose connections between them, e.g. nokogiri depends on the system level package libxml2 but doesn't communicate that via its gemspec
  • A meta repository of system level package virtual names and their specific names on each package manager/registry
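
The last two items could be sketched as a pair of lookup tables: one from a virtual name to the concrete package names on each package manager, and one recording the loose application-to-system links. Everything below (`VIRTUAL_PACKAGES`, `APPLICATION_LINKS`, `system_packages`) is a hypothetical illustration, and the entries are examples rather than an authoritative mapping.

```python
# Hypothetical "meta repository": one virtual name per system library,
# mapped to concrete package names on each package manager.
# The entries are illustrative examples, not an authoritative list.
VIRTUAL_PACKAGES = {
    "libxml2": {
        "apt": ["libxml2", "libxml2-dev"],
        "rpm": ["libxml2", "libxml2-devel"],
        "homebrew": ["libxml2"],
    },
}

# Loose application level -> system level links that a gemspec does not record.
APPLICATION_LINKS = {
    ("rubygems", "nokogiri"): ["libxml2"],
}


def system_packages(ecosystem, package, manager):
    """Resolve an application package to concrete system package names."""
    resolved = []
    for virtual in APPLICATION_LINKS.get((ecosystem, package), []):
        resolved.extend(VIRTUAL_PACKAGES.get(virtual, {}).get(manager, []))
    return resolved


print(system_packages("rubygems", "nokogiri", "apt"))
```

Keeping the virtual name as the join key means a new registry only needs one new entry per library, rather than one per application that uses it.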

Good package managers to start investigating:

  • apt
  • rpm/yum
  • nix/guix
  • dpkg
  • chocolatey
  • homebrew
@ncoghlan

Another potential cross-platform package manager to add to the list: conda (and the conda-forge community packages)

That one covers quite a few popular data science packages (and their dependencies) across Python and R.

One possible approach for getting started on finding out what the system dependencies actually are might be to look for PyPI/RubyGems/etc packages that are also available in the language agnostic package manager of interest, and see what system packages the latter versions depend on. Getting a comprehensive mapping for that is a pain (hence release-monitoring.org), but you could begin with packages that follow the conventional naming schemes for their distro.
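
A minimal sketch of the "conventional naming schemes" starting point, assuming the common patterns (`python3-<name>` on Debian and Fedora, `ruby-<name>` on Debian, `rubygem-<name>` on Fedora). The `NAMING_CONVENTIONS` table and `candidate_name` helper are hypothetical, and each pattern should be checked against the distro's own packaging policy before relying on it.

```python
# Conventional name patterns; assumptions to verify against each distro's policy.
NAMING_CONVENTIONS = {
    ("pypi", "debian"): "python3-{name}",
    ("pypi", "fedora"): "python3-{name}",
    ("rubygems", "debian"): "ruby-{name}",
    ("rubygems", "fedora"): "rubygem-{name}",
}


def candidate_name(ecosystem, distro, package):
    """Guess the distro package name for a language level package, or None."""
    pattern = NAMING_CONVENTIONS.get((ecosystem, distro))
    if pattern is None:
        return None
    # Simplified normalization: lowercase, underscores become hyphens.
    normalized = package.lower().replace("_", "-")
    return pattern.format(name=normalized)


print(candidate_name("rubygems", "debian", "nokogiri"))
```

The result is only a candidate: it still has to be looked up in the distro's package index, and packages that don't follow the convention would need a hand-maintained mapping.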

@regularfry

I can describe some of the boundaries around apt, to give a flavour of the problem here.

  1. Many different distributions use the apt toolset, and the same system dependency may be supplied by different packages on different distributions.
  2. Some runtime system dependency packages will install their own build dependencies, others will not. Sometimes within the same distro.
  3. Different distributions may supply the same dependency in different numbers of packages. A single package on one distro might map to many packages on another.
  4. Different releases of the same distro may rename, remove, merge, or split packages compared to earlier and later releases.
  5. Some apt-based distributions have rolling releases, so could potentially rename, remove, merge, or split packages from one day to the next.
  6. Different packages in a single release can be alternatives for the same system dependency. It is entirely legal (and in some cases inevitable) that a file at a given path on the filesystem is provided by more than one package available in the release.

All this being said, if you've got, say, a working gem install'd nokogiri on a Debian of some sort, it's relatively straightforward to figure out which runtime system dependencies it needs: you can go from the extension's .so to package names via ldd. Given that, rather than try to gather dependencies by scanning repositories, might it make sense to build a popcon-style gem which people could voluntarily install, which would Do The Right Thing at gem install-time to post back the system dependency tree along with an OS fingerprint? That way you'd be getting Real Live Data(tm) rather than a best guess.
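
For the ".so to package name" step, a rough sketch might look like the following. It assumes a Debian-family system where `ldd` and `dpkg -S` are available; the function names are hypothetical, and paths on merged-/usr systems may need extra handling that is not shown here.

```python
import re
import subprocess

# Matches lines such as:
#   libxml2.so.2 => /usr/lib/x86_64-linux-gnu/libxml2.so.2 (0x...)
LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(/\S+)")


def parse_ldd(output):
    """Return {soname: resolved path} for every resolved line of ldd output."""
    libs = {}
    for line in output.splitlines():
        match = LDD_LINE.match(line)
        if match:
            libs[match.group(1)] = match.group(2)
    return libs


def owning_package(path):
    """Ask dpkg which package owns `path` (Debian-family systems only)."""
    result = subprocess.run(["dpkg", "-S", path], capture_output=True, text=True)
    if result.returncode != 0:
        return None
    # dpkg -S prints "package[:arch]: /path"; keep the package name.
    return result.stdout.split(":", 1)[0]


def system_dependencies(extension_path):
    """Map each shared library an extension links against to its package."""
    output = subprocess.run(
        ["ldd", extension_path], capture_output=True, text=True, check=True
    ).stdout
    return {soname: owning_package(path) for soname, path in parse_ldd(output).items()}
```

Calling `system_dependencies` on the compiled nokogiri extension would give the kind of dependency map a popcon-style reporter could post back, with unresolved ("not found") libraries simply left out by the parser.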

@ncoghlan

@regularfry's post reminded me of a project I meant to mention in the Python context: https://github.com/pypa/auditwheel

That's part of the manylinux1 effort to build cross-distro Linux binaries for Python extension modules: auditwheel looks at shared libraries in the built wheel, and identifies external dependencies that aren't already covered by the manylinux1 ABI specification.
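
The core idea can be sketched as a set difference between the libraries a binary links against and the libraries the policy allows. This is not auditwheel's actual implementation; `MANYLINUX1_SUBSET` is only an illustrative subset of the PEP 513 list, and `external_dependencies` is a hypothetical helper.

```python
# Illustrative subset of the libraries the manylinux1 policy (PEP 513) allows
# a wheel to link against; the full list lives in the PEP.
MANYLINUX1_SUBSET = frozenset({
    "libc.so.6",
    "libm.so.6",
    "libdl.so.2",
    "libpthread.so.0",
    "libgcc_s.so.1",
    "libstdc++.so.6",
})


def external_dependencies(linked_sonames, allowed=MANYLINUX1_SUBSET):
    """Return the linked libraries that the policy does not cover, sorted."""
    return sorted(set(linked_sonames) - allowed)


print(external_dependencies(["libc.so.6", "libxml2.so.2", "libm.so.6"]))
```

Whatever is left after the subtraction is the interesting part for this issue: those are the system level libraries the extension actually depends on.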

Beyond that, the comments about apt also apply to rpm - while the underlying tech is the same, the way Fedora/SuSE/Mageia/etc name and structure their system packages isn't identical.

@ncoghlan

Another couple of potential data sources to add to the list:

  • Dockerfile commands in GitHub repos
  • CI configuration files in GitHub repos

The former might be too noisy to be useful (since the typical pattern I've seen is "install all your system level deps" then "install all your language level deps"), but if you can narrow them down to "here's a quick way to set up a local dev environment" containers that may be better.
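
To get a feel for how noisy the Dockerfile source is, a simple extractor along these lines could be a starting point. It is a deliberately naive sketch: the `apt_packages` helper is hypothetical, it only recognises plain `apt-get ... install` commands inside `RUN` instructions, and it ignores variants such as `apt install`, `sudo`, or shell variables.

```python
import re


def apt_packages(dockerfile_text):
    """Collect package names passed to `apt-get install` in RUN instructions."""
    # Join backslash-continued lines so each instruction is a single line.
    joined = re.sub(r"\\\s*\n", " ", dockerfile_text)
    packages = []
    for line in joined.splitlines():
        stripped = line.strip()
        if not stripped.upper().startswith("RUN "):
            continue
        # A RUN line may chain several shell commands with && or ;.
        for command in re.split(r"&&|;", stripped[4:]):
            words = command.split()
            if words[:1] == ["apt-get"] and "install" in words:
                args = words[words.index("install") + 1:]
                # Skip option flags such as -y or --no-install-recommends.
                packages.extend(w for w in args if not w.startswith("-"))
    return packages
```

The output is a flat list per Dockerfile, so separating system level dependencies of the application from general tooling would still need the kind of narrowing described above.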

The latter seems potentially more promising, but a quick look specifically at Nokogiri's setup (https://github.com/sparklemotion/nokogiri/blob/master/.travis.yml) suggests they don't rely on that to get the runtime dependencies installed.

andrew added the on hold label Jun 27, 2017
@andrew
Contributor Author

andrew commented Jul 13, 2017

Another potential meta package manager to lump in with the system tools: https://github.com/Fylhan/qompoter

@ncoghlan

I'm not sure if this belongs here or on a separate issue, but @mlouielu recently pointed me towards the package tracking on DistroWatch, where they monitor the versions of a couple of hundred different packages across the various Linux distros: https://distrowatch.com/packages.php

They don't seem to have any package-centric views that I can find (i.e. listing different versions of a package and which versions of which distros it appears in), but they do list breakdowns on each distro's page:

andrew added the roadmap label Oct 2, 2017
@andrew
Contributor Author

andrew commented Oct 9, 2017

Moving this to the Backlog as we'd still like to implement it but can't see that happening in the near future.

andrew closed this as completed Oct 9, 2017