Nimble improvements #398

Open
haxscramper opened this issue Jul 4, 2021 · 22 comments


haxscramper commented Jul 4, 2021

The RFC is extremely conservative about breakages and new features - it only strives to formalize existing behavior and improve the general experience of using nimble. I'm not going to propose any drastic measures such as dropping versions and using git commit hashes instead, or integrating package management with the module system. As a result, the majority of existing packages won't be affected in any way, and the common workflow of using nimble would stay largely the same.

Package registry with dependency metadata

Most package management solutions include a centralized package index that keeps track of all package versions, thus solving the problem of finding requirements for a particular version of a package. For example, cargo is centralized, and full information about each package version is stored in a git repo in the form of simple json files. When a new version of a package is published, a simple edit to the package file is made. Right now nim employs a similar solution via nim-lang/packages, but it requires someone to manually merge a PR with a new package. This is not really a scalable solution - sometimes you might need to wait several days before a package is added to the registry - but I believe it can be automated. A new version is recorded as a simple one-line diff that adds information about the new package version, its requirements etc.

  {"name":"package","vers":"0.4.2","deps":[{"name":"depname","req":"^1.4"}],"url": "https://gitlab.com/XXX/YYY.git"}
+ {"name":"package","vers":"0.4.3","deps":[{"name":"depname","req":"^1.4"}],"url": "https://gitlab.com/XXX/YYY.git"}

Having a package registry which records the full package+version -> dependencies mapping is important for full dependency resolution. It also allows projects like nim package directory to exist, which helps increase discoverability of different nimble projects and makes it possible to analyse the whole ecosystem at once (which was crucial when writing this RFC - without access to a comprehensive list of packages I would not be able to provide any concrete numbers).
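A minimal sketch of how tooling could consume such an index, assuming one JSON object per line with the name, vers and deps fields from the example above. The function name load_index is hypothetical and not part of nimble or nim-lang/packages.

```python
import json

# Two registry lines in the one-object-per-line format shown above.
REGISTRY_LINES = """\
{"name":"package","vers":"0.4.2","deps":[{"name":"depname","req":"^1.4"}],"url":"https://gitlab.com/XXX/YYY.git"}
{"name":"package","vers":"0.4.3","deps":[{"name":"depname","req":"^1.4"}],"url":"https://gitlab.com/XXX/YYY.git"}
"""


def load_index(text):
    """Build a (name, version) -> [(dependency name, requirement)] mapping."""
    index = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines between entries
        entry = json.loads(line)
        deps = [(dep["name"], dep["req"]) for dep in entry["deps"]]
        index[(entry["name"], entry["vers"])] = deps
    return index


index = load_index(REGISTRY_LINES)
print(index[("package", "0.4.3")])
```

With a mapping like this the requirements of any published version can be looked up without cloning the package repository.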

IDEA: make nimble publish put current package metadata in the nim-lang/packages index - by either creating new package (like it does now), or pushing new version.

Changes to package publishing workflow

  git commit -m "[VERSION] v1.2.3"          # Commit your changes
  git tag v1.2.3                            # tag changes
+ nimble publish                            # push new package version to registry
  git push --tags                           # upload tags to github
  git push origin master                    # upload code to github

NOTE: A convenience command like nimble newversion could be introduced to tag, commit and push the package all at once. This would also allow writing a before hook that runs full tests, docgen etc., for example before newversion: exec "nimble", "test"
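A rough sketch of what such a convenience command might do internally. The command itself is only proposed, so everything here is an assumption: the helper names bump_version and release_commands are hypothetical, and resetting the lower version components on a major or minor bump is my guess at the intended behavior. The sketch only builds the command list from the workflow above and does not execute anything.

```python
def bump_version(version, part="patch"):
    """Return the next "major.minor.patch" version string."""
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown version part: {part}")


def release_commands(version):
    """List the publishing workflow steps for the given version."""
    tag = f"v{version}"
    return [
        ["git", "commit", "-m", f"[VERSION] {tag}"],  # commit changes
        ["git", "tag", tag],                          # tag changes
        ["nimble", "publish"],                        # push version to registry
        ["git", "push", "--tags"],                    # upload tags
        ["git", "push", "origin", "master"],          # upload code
    ]


next_version = bump_version("1.2.3", "minor")
for command in release_commands(next_version):
    print(" ".join(command))
```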

Use explicit dependency graph in Nimble

The current implementation of dependency resolution does not construct an explicit dependency graph, and instead just loops through requirements, almost immediately installing them. I believe this to be the source of such bugs as "nimble loops infinitely trying to install codependent requirements" and "Dependency resolution depends on the order in requires" (both could be prevented with explicit dependency graph construction).

Related links:

IDEA: Improve the dependency resolution algorithm. Use full knowledge of the requirements of each version of each package, as described in the previous section.

NOTE: Right now nimble test and similar tasks are always interleaved with dependency resolution and potentially package downloads. With full access to existing package metadata these steps can be simplified as only unregistered packages would have to be downloaded.
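To illustrate why an explicit graph helps, here is a small sketch that orders packages before anything is installed and reports codependent requirements instead of looping. It uses Python's standard graphlib module; the function name install_order and the sample graph are made up for illustration, and this is not how nimble is implemented.

```python
from graphlib import CycleError, TopologicalSorter


def install_order(requirements):
    """Return packages ordered so every dependency precedes its dependents.

    `requirements` maps a package name to the names it requires.
    Raises ValueError when packages require each other in a cycle.
    """
    try:
        return list(TopologicalSorter(requirements).static_order())
    except CycleError as err:
        # The second element of args holds the nodes forming the cycle.
        cycle = " -> ".join(err.args[1])
        raise ValueError(f"codependent requirements: {cycle}") from err


graph = {"app": ["b", "a"], "a": ["c"], "b": ["c"], "c": []}
order = install_order(graph)
print(order)
```

Because the whole graph is known up front, a cycle such as a requiring b while b requires a is detected once and reported, rather than being rediscovered on every install attempt.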

Do not require full manifest evaluation

Right now it is necessary to fully evaluate the nimscript configuration in order to determine the list of dependencies, as it might contain code with complicated logic. It cannot be statically reasoned about and introduces a lot of complexity in tooling -- nimble has to generate a .nims file and evaluate it separately. The script processes nim code and modifies global variables like version, then prints the result at the end. The output of script execution is parsed by nimble, and only then does it get the list of dependencies.

Arbitrary nimscript might be considered a good solution for custom task targets, but ultimately this leads to issues where even author = <author name here> might require full compiler to evaluate.

Instead, small subset of the package manifest must be written in a declarative manner (still using nimscript syntax, but with more strict rules). Specifically this concerns requires and couple more metadata fields.

  • version = "version" and packageName = "name" must have string literal and be located at the toplevel.
  • Non-optional dependencies are written as requires "dependency1", "dependency2" and are located at the toplevel of the file as well.
  • Optional dependencies are written in a when defined(windows) section. The when section must contain a static list of requires, identical in form to the toplevel ones. This part is particularly important as it allows determining feature-based dependencies and avoids installing unnecessary packages, especially if they have an after install hook that might fail the whole installation.

Existing nimble packages almost universally comply with these requirements, and most of the ones that don't share a simple repeating pattern that violates the rule: import in the .nimble file. Optional dependencies are already handled this way in most packages - treeform/hottie, minefuto/qwertycd and several others.

IDEA: Write a small subset of important metadata, like requires, in a much stricter declarative manner.
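As a sketch of what the declarative subset buys, the reader below pulls version, packageName and requires out of a manifest with plain pattern matching, without evaluating any nimscript. It assumes the rules listed above (string literals, toplevel requires, when defined(...) blocks holding only requires); the sample manifest and the name read_manifest are hypothetical.

```python
import re

MANIFEST = '''\
version = "0.1.0"
author = "someone"
requires "nim >= 1.4.0", "depname >= 0.2"

when defined(windows):
  requires "winonly"

task docs, "Build docs":
  exec "nim doc src/main.nim"
'''

STRING = r'"([^"]*)"'
FIELD = re.compile(r'^(version|packageName)\s*=\s*' + STRING + r'\s*$')
REQUIRES = re.compile(r'^requires\s+(.*)$')
WHEN = re.compile(r'^when\s+defined\((\w+)\):\s*$')


def read_manifest(text):
    """Statically read the declarative fields of a .nimble-like manifest."""
    meta = {"requires": [], "optional": {}}
    feature = None  # name of the enclosing `when defined(...)` block, if any
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if line[0].isspace():
            # Indented line: only requires inside a when block are read,
            # so task bodies are never inspected.
            match = REQUIRES.match(stripped)
            if feature is not None and match:
                meta["optional"][feature] += re.findall(STRING, match.group(1))
            continue
        feature = None  # any toplevel line closes the previous block
        if match := WHEN.match(stripped):
            feature = match.group(1)
            meta["optional"].setdefault(feature, [])
        elif match := FIELD.match(stripped):
            meta[match.group(1)] = match.group(2)
        elif match := REQUIRES.match(stripped):
            meta["requires"] += re.findall(STRING, match.group(1))
    return meta


meta = read_manifest(MANIFEST)
print(meta)
```

Nothing in the task body is executed or even parsed, which is the point: reading the version or the requirement list has no side effects.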

NOTE: version and packageName information is redundant - first one is already stored in git tags (and nimble actually uses them to fetch required package versions), and second one is optional and "must match name specified inside <packagename>.nimble". The only advantage of having version in the .nimble configuration is that you don't need to shell out to git in order to find out package version.

NOTE: Additional restrictions might be placed on other metadata fields like foreignDep, author, description etc. Foreign dependencies might also be checked upon installation, but that can be implemented later. Almost all packages that we have today comply with the requirements:

  • scope:
    • toplevel means top level of the manifest - not inside of any task, when etc.
    • when shows number of times particular metadata field was encountered in when section
    • task shows number of encounters inside of task
  • type:
    • Canonical
      • For packages with seq values the "canonical" way of writing is value = @["string", "string"]
      • Canonical for single-value variables: value = "string"
      • Canonical for requires and foreignDep is requires "string", "string"
    • ident -- metadata was set from identifier (most likely with import common, followed by value = importedConst)
    • spec -- any other way of writing value

                |_____toplevel______||_______when________||_______task________|
                  canon  ident   spec  canon  ident   spec  canon  ident   spec
namedbin......        4      0      0      1      0      0      0      0      0
foreigndep....        9      0     10     92      0      0      0      0      0
backend.......       25      0      0      0      0      0      0      0      0
installfiles..       35      0      0      0      0      0      0      0      0
skipext.......       37      0      0      0      0      0      0      0      0
installdirs...       57      0      0      2      0      0      0      0      0
skipfiles.....       68      0      0      0      0      0      0      0      0
bindir........       71      0      0      3      0      0      0      0      1
installext....      112      0      0      1      0      0      0      0      0
packagename...      114      0      0      0      0      0      0      0      0
bin...........      362      0      0      0      0      0      0      0      2
skipdirs......      409      0      0      0      0      0      0      0      0
srcdir........     1145      0      0      0      0      0      0      0      1
version.......     1736      0      0      0     15      0      0      0      3
description...     1745      0      0      0      9      0      0      0      1
author........     1748      0      0      0      9      0      0      0      0
license.......     1753      0      0      0      2      0      0      0      0
requires......     2677      0     17     44      0      0      0      0     43
  • Also, it seems like namedbin is almost never used.
  • Almost all uses of foreignDep happen inside of when section

Streamline nimble-tooling interaction.

Package manager should be a separate tool that does not create two-way information flow. Instead we should adopt simple model [pm] -> [compiler] or [pm] -> [build tool] -> [compiler]. Package manager either runs compiler, or configures environment where compiler can run. Intermediate build tools might include something like testament or other tooling. We already have a pretty nice configuration format in form of nim.cfg that would allow nimble to inform compiler of all the necessary configuration values.

Having a volatile configuration file would make it easier to inspect how nimble called the compiler. Even though the linked issue suggests that "If the experience becomes seamless then the user really won't need to care about what Nimble does.", in practice it is quite hard to run the nim compiler the same way nimble does - the only option is to wait for compilation to fail and then copy the error message that contains the command itself. I don't believe we would be able to make this seamless enough that nobody would ever need to run the compiler manually or check how nimble communicates with it.

Another very important advantage of nim.cfg is support for external tooling that nimble can't interface with right now. For example testament - if someone wishes to use it for testing their projects, a simple exec("testament all") usually does the right thing in CI, but under the hood it knows nothing about actual project requirements and simply relies on --nimblePaths. I can't make nimble test use my own compiler build, nor is it possible to fully integrate external tooling in the project. For haxdoc I basically had to copy-paste the dependency resolution part of nimble, remove the package installation parts and then work based on that.

At the same time, having to run nimble setup each time after changing the .nimble file could become annoying pretty fast, so old commands like nimble build and nimble test should not be deprecated. The change mostly keeps the following workflows in sync with each other, and makes doing so as easy as occasionally running a single command.

Counter-argument to this approach is

Most Rust programmers don't invoke rustc directly, but instead do it through Cargo.

There's no explanation for why this approach is brilliant and greatly improves the user experience. Today I can do nim c hello.nim and it simply works. I don't need any cmake, configure or .nimble if I want to use nimble packages.

I tried using Rust and it was ridiculous that I couldn't do anything without learning Cargo. Same with many other languages where there's this whole ecosystem of random tools that hide away the compiler. Meanwhile in Go, the package manager and build system are first class citizens.

We can leave nimblePaths as they are now, and you would be able to use the nim compiler as it is now - out of sync with actual package requirements, but if you don't care that's fine. The fix is one nimble setup away, so once you need it, you can configure everything very quickly. Any other tool can emulate this behavior as well (like testament).

This approach enables a range of workflows

  • [nimble build] -> [nim c] -- nimble updates environment in which nim compiler would operate and then executes it. Simply shorthand for nimble sync followed by nim c src/main.nim
  • [nim c] -- nim compiler is launched as a standalone tool - it can read already existing environment configuration and work the same way as if it was launched directly by nimble.
  • <custom tool> -- custom tools have full access to nim.cfg and could easily work the same way as if they were launched by nimble without having to provide explicit support for that feature.
  • [other PM] -> [nim c] -- if someone wants to manage their environment using different tools, or even manually (for example using git submodules), it should be possible to write nim.cfg by hand (or using some helper tool). Submodule-based workflow is not really different from package-based one.

nim.cfg correctly sets the environment for the directory it is located in and all of its subdirectories, which means paths are correctly set up for subdirectories, tests, other projects that you might want to develop in the same repository, dependencies and so on.

The package manager is allowed to edit nim.cfg to modify volatile configuration elements like --path related to project dependencies. It is possible to provide additional paths in nim.cfg, for example when one of the dependencies is located in a git submodule. In that case the configuration file might take the following form:

# Section managed by nimble
--path:~/.nimble/pkg/foo-1.2.4
--path:~/.nimble/pkg/bar-0.1.12
# End of section managed by nimble

--path:submodule/src # Manually added to use submodule
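A minimal sketch of how a package manager could rewrite only its own section and leave manual entries alone. The marker comments are taken from the example above; the function name update_managed_section and the behavior of prepending the section when no markers exist are assumptions, not existing nimble behavior.

```python
BEGIN = "# Section managed by nimble"
END = "# End of section managed by nimble"


def update_managed_section(cfg_text, dep_paths):
    """Replace the managed block of a nim.cfg, keeping all other lines."""
    managed = [BEGIN] + [f"--path:{path}" for path in dep_paths] + [END]
    lines = cfg_text.splitlines()
    if BEGIN in lines and END in lines and lines.index(BEGIN) < lines.index(END):
        start, stop = lines.index(BEGIN), lines.index(END)
        lines[start:stop + 1] = managed  # swap the old block for the new one
    else:
        # No managed block yet: put it first, then the user's own lines.
        lines = managed + ([""] if lines else []) + lines
    return "\n".join(lines) + "\n"


OLD_CFG = """\
# Section managed by nimble
--path:~/.nimble/pkg/foo-1.2.4
--path:~/.nimble/pkg/bar-0.1.12
# End of section managed by nimble

--path:submodule/src # Manually added to use submodule
"""

new_cfg = update_managed_section(
    OLD_CFG, ["~/.nimble/pkg/foo-1.2.5", "~/.nimble/pkg/bar-0.1.12"]
)
print(new_cfg)
```

The manually added submodule path survives the update because only the lines between the two markers are ever touched.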

Dependency resolution

Dependency resolution for nimble has been discussed multiple times in different contexts, specifically in "modern techniques for dependency resolution". Possible options for solving this problem that were mentioned:

  • Use an existing dependency solver like libsolv -- provides low-level features to implement dependency resolution on top. Would require additional effort to design the integration with nimble.
  • Implement a custom solution using a SAT solver like z3 -- requires a lot of additional work and research to implement.
  • Allow multiple versions of the same package -- would require custom compiler support, and this must be a last resort approach, not a go-to fix that is enabled each time dependency resolution halts for some unknown reason.

All of these options are used in certain package managers, and with some effort they might be reimplemented for nim as well. But while looking for existing solutions that could be easily adopted I've found one that seems to be suited especially well for the problem at hand -- PubGrub: Next-Generation Version Solving by Natalie Weizenbaum. The article introduces a new dependency resolution algorithm called pubgrub. It has already been adopted by the dart and swift package managers, and has several reimplementations in other languages. The linked article suggested in #890, which was written in 2016, mentions both of these package managers and comments on their dependency solver implementations at the time: "Dart's pub includes a backtracking solver that often takes a long time.", "Swift's package manager uses a basic backtracking solver."

Basics of the algorithm are explained in the introduction article to the algorithm and much more detailed specification. Overview talk from Dart Conf by the algorithm author and introductory talk for Swift package manager. Implementations in different programming languages:

I've examined the dart implementation in close detail and it seems there is no need for any specialized knowledge (compared to the libsolv and especially z3 approaches) to adapt the implementation to nim's needs. The source code is extremely well documented, and paired with comprehensive documentation for both the algorithm and user-side behavior. The algorithm is designed to provide extremely concise and clear error messages about failure reasons.

Compared to the alternative approaches discussed in #890, and specifically libsolv, pubgrub has several important differences that make it especially well-suited for the task at hand:

  • Algorithm has several implementations, including two real-world uses swift (PR link) and dart (solver doc link) that show integration with features like lockfiles, semantic versioning, package features and more.
  • Has extensive documentation on usage and implementation, whereas libsolv has to be treated like a black box for the most part. There was no direct comparison between libsolv and pubgrub, so it is hard to provide a definitive answer which one is better.
  • Provides implementation of features like lockfiles, development dependencies, semantic versioning, feature-based dependency resolution (when defined(windows)) and special attention to error messages.
  • Allows each package to provide a list of features that can be either enabled or disabled. This can be reused for os-specific optional dependency resolution by treating the os as a feature that is automatically set or unset by the nimble installation.

IDEA: adopt pubgrub algorithm for solving nimble dependency graph.

NOTE: With support for full package dependency graph it might be possible to improve current implementation a little more, and introducing such major change in the implementation should be carefully considered. If full dependency graph is introduced as proposed in the previous sections it might become less of an issue.

Quality-of-life features

In addition to changes directly related to package management, some quality-of-life improvements can be made.

End user

  • Update all installed nimble packages
  • Option to download a package only if it is required (no version is installed). Current nimble install -n foo automatically declines the update, but downloads the files and builds all binaries anyway. If the current package version is not the newest, it installs things anyway since no prompt was created.
  • make noninteractive init possible
  • Current directory does not contain .nimble file - look up in the directory tree?
  • Disable warnings (by default) from external packages when installing dependencies - I usually can't fix the source of the warnings anyway, so they are almost useless (in context of installing dependency).
  • Add a special warning(), error() or hint() command for build tasks, similar to the cmake message. This is much better (can be configured, filtered out) than the semi-random echo calls that are placed all over build tasks in some cases.
  • Not possible to perform build/test without dependency resolution and installation interfering. This becomes a non-issue if nimble is used to manage volatile configuration file - nim c simply does the right thing. But for cases when custom build task exists it should be possible to do nimble build --skip-dep-scan.
  • Several CLI flags are undocumented in --help (especially --json)
  • Very noisy output that repeats information about package resolution each time a package is encountered in the dependency graph (for haxdoc I have one dependency printed 33 (!) times). Some of this might be important to understand why package installation failed, but some of this information can certainly be reduced.
  • Allow disabling binary build on dependency install - I don't always want to build binary for hybrid packages, and sometimes this has unintended side-effects as well (like overriding user-installed binaries).
  • Create LICENSE file in project when choosing license - otherwise github fails to show correct license number. We already know author's name, creation time, and copy pasting strformatted license texts would require almost zero maintenance.
  • Additional heuristics where git binary is used - if it fails for some reason (even unrelated to the original query), nimble falls back to hg and prints quite unhelpful error message 'hg' not in PATH.

Developers

Nimble already provides some api in form of nimblepkg package, but it does not provide full features of the nimble itself. For example I had to copy dependency resolution code and remove download/install parts for it in order to be able to correctly compile documentation for the whole project in haxdoc. Ideally most of the internal API for package handling should be available as a library. Also, this might help with testing internal implementation details such as package resolution.

This would also allow to freely experiment with alternative package managers that are fully interoperable with nimble - largely because they share the same core implementation and only differ in small details, like dependency resolution algorithm (we can postpone any changes in the nimble dependency resolution core, and if someone wants they can try out pubgrub in proof-of-concept package manager to see if this is really worth it).

Adopting changes

Most existing packages won't have any breakages. For the few that use certain patterns listed below, a much easier (and cleaner) solution is provided.

Note: before we get comments like this that start mentioning all possible ways of writing requires, I want to say that, as it turns out, people right now do indeed write requires "<string literal>" almost all the time, so it is not an issue that we are facing right now. Out of 2781 requires processed I've found that 2738 can already be considered 'canonical', and the ones that can't are written as requires: "str lit"

In some packages in order to retrieve version data following import must be resolved

when fileExists(thisModuleFile.parentDir / "src/faepkg/config.nim"):
  # In the git repository the Nimble sources are in a ``src`` directory.
  import src/faepkg/config
else:
  # When the package is installed, the ``src`` directory disappears.
  import faepkg/config

and then version is set as version = appVersion. This approach is used by very few packages, specifically:

There are a couple more packages that have non-standard property configurations, like installFiles = @[TzDbVersion & ".json" in timezones and installExt = @[when defined(windows): "dll" elif in pvim. In nwsync package bin field uses following code snippet

bin = listFiles(thisDir()).
  mapIt(it.extractFilename()).
  filterIt(it.startsWith("nwsync_") and it.endsWith(".nim")).
  mapIt(it.splitFile.name)

Since 0.11.0 nimble defines NimblePkgVersion flag - we can simply put it in the nim.cfg so other tools could pick it up.

New features

Task-level dependencies

First requested in the github issue by mratsim. Separate implementation.

A task section might contain requires or when .. requires sections. When the task is executed, nimble creates a new requirement list, writes the resolved dependencies to nim.cfg and executes the body. All calls to external tools via exec("nim doc") operate as expected. The before section is treated as part of the body itself. Special tasks like test are not different in any way.

Using minimal version for package resolution

Provide an option to consider the minimal allowed dependency version for a requires range rather than the maximal one. This would allow developers to keep package requirements in check by enforcing correct minimal version ranges. It mostly solves the "just works" problem: it keeps the devs honest about their requirements and the user happy. This would benefit the package ecosystem as a whole.

Example of the problem this would allow to solve:

  • For example, the developer has dep >= 0.1.0 as a version constraint.
  • Then the developer updated their software to rely on features from dep 0.1.3, but didn't update the requirement, since 0.1.3 >= 0.1.0 still holds.
  • The end user, whose own software pins dep to 0.1.1, can't use this package because it won't compile, and there is no way the package manager could've prevented this.
  • If the original developer had the package manager select the minimal matching version (in our case that would be 0.1.0), the problem wouldn't exist: their own build would have failed, and they would have simply fixed the requirement range.
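The scenario above can be sketched in a few lines. This is only an illustration of minimal versus maximal selection for a single >= constraint, assuming plain major.minor.patch versions; the helper names and the list of available versions are made up, and a real resolver has to combine many constraints at once.

```python
def parse_version(text):
    """Turn "0.1.3" into (0, 1, 3) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def select(available, minimum, strategy="max"):
    """Pick a version satisfying `>= minimum` using the given strategy."""
    floor = parse_version(minimum)
    candidates = sorted(
        (version for version in available if parse_version(version) >= floor),
        key=parse_version,
    )
    if not candidates:
        raise LookupError(f"no version satisfies >= {minimum}")
    return candidates[0] if strategy == "min" else candidates[-1]


AVAILABLE = ["0.0.9", "0.1.0", "0.1.1", "0.1.3"]

# Maximal selection hides the stale constraint: the developer builds
# against 0.1.3 and never notices that 0.1.0 no longer works.
print(select(AVAILABLE, "0.1.0", "max"))

# Minimal selection builds against 0.1.0, so the missing features
# surface on the developer's machine instead of the user's.
print(select(AVAILABLE, "0.1.0", "min"))
```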

This idea was suggested in the go article, and later considered by rust, zig, conan.

As of now almost all package requires are potentially subject to this issue (to some degree).

verLater        (   > V    ): 44
verEarlier      (   < V    ): 13
verEqLater      (   >= V   ): 2712
verEqEarlier    (   <= V   ): 3
verIntersect    (> V & < V ): 71
verEq           (    V     ): 45
verAny          (    *     ): 635
verSpecial      (  #head   ): 100

We already run CI for important packages to make sure there are no compiler regressions. A check like this could be added to further improve ecosystem health in the long run.

Since this RFC proposes to change manifest format, it would be appropriate to also include format version as well. This can also be used to gradually adopt MVS - v1 uses current simplified resolution algorithm, and v2 uses MVS selection. For now, this is just an idea that does not address how v1 vs v2 are going to interact with each other.

Third-party libraries and foreign dependencies

Due to the lack of a standard way of building external libraries (compiling C++ code, for example) and the resulting exec()- and doCmd()-based workarounds, things like Nim - [...] nimble shell command injection (specifically Remove usage of command string-based exec interfaces) are more likely to happen.

It is not possible to engineer a way to properly interface with every existing C++ build system, but we can provide a better and more secure (or at least easier to audit) convenience API for calling external programs in the form of runCmd(cmd: string{lit}, args: varargs[string]) and deprecate (remove?) usages of exec. I think that would help avoid various shell quoting issues as well.
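To show why an argument-list interface sidesteps quoting problems, here is a small Python analogue of the proposed runCmd. The wrapper name run_cmd is hypothetical; it relies only on the standard subprocess module, which starts the program directly when given a list and never hands the arguments to a shell.

```python
import subprocess
import sys


def run_cmd(cmd, *args):
    """Run `cmd` with each argument passed verbatim, without a shell."""
    result = subprocess.run(
        [cmd, *args], capture_output=True, text=True, check=True
    )
    return result.stdout


# An argument that would run a second command if it were pasted
# into a shell command string.
hostile = "docs; echo injected"

# The child process simply echoes back its first argument.
out = run_cmd(sys.executable, "-c", "import sys; print(sys.argv[1])", hostile)
print(out)
```

The semicolon arrives in the child process as ordinary text, so there is nothing to escape and nothing for a reviewer to second-guess when auditing the call.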

I'm pretty sure there are a lot of people who are not entirely thrilled by the idea of executing code with OS side effects just to get a version number or find out the package requirements list. A declarative subset of the manifest would allow skipping package evaluation and simply reading the configuration. Right now it is not possible to disable exec, staticExec or execCmd execution in nimscript.

List of most commonly used exec commands - approximately 16% contain some form of string concatenation, and quite a few others build string externally. Overall it seems like on average almost every package calls to exec in one way or another.

`exec` has been used 2345 times in 2013 packages
out of which 379 commands contained &
nim        1392
nimble     304
git        59
rm         55
$cmd       41 # let cmd = "some string"; exec(cmd)
mkdir      26
testament  26
cd         25
release-it 24
echo       22
mv         20
$fmt       19
node       19
cmd        14
make       14
true       13
demo       13
docker     13

Recap

  • changed
    • Define strict declarative subset of manifest file that would include name, version and requires metainformation.
    • Store full information about package dependencies and all versions in the nim package index.
    • Consider replacing current nimble dependency resolution algorithm with pubgrub.
    • Make nimble generate nim.cfg instead of calling nim directly. The configuration file contains all paths for dependencies and additional information.
    • Based on my analysis, the proposed changes would not affect most packages in any way - they are already fully compliant with the specification. The few packages that rely on import to store the version in a separate file do this only as a workaround.
  • added
    • add nimble setup command that would update package configuration file.
    • add nimble newversion [--major] [--minor] to automatically tag and commit changes.

Extra

Random facts

Processed 2013 packages via pnode in 5.329
Total commit count 203328
`exec` has been used 2345 times in 2013 packages
Total package release count 7706

Total number of commits per day in all packages at the time of analysis.

image


saem commented Jul 4, 2021

I'll have to read it again, but when trying to leverage nimble with the VSCode extension I ran into a few issues and learned some lessons:

  • the json output binaries map doesn't provide information of the original files so no project files can be determined from this (standard output format is useless)
  • dependencies are common to all artifacts of which there can be many, you mention intra task dependencies but I might have missed it for artifacts
  • tasks help output is rather easily broken for parsing... It's also not part of json output
  • there is no way to force nimble to run in dry run mode from the outside

You've covered a number of necessary improvements, awesome!

I think a problem that remains with nimble, even after the ones above are addressed, is that the format and data structure of the declarative manifest do not focus on the end product (artifacts, really), with declarations centered on those.

Rather than trying to do that now, one very simple thing could be done, require a nimble file format version for any nimble file. Files without one will have one inserted automatically and eventually fail in subsequent releases if one isn't present. Then down the road some amount of automatic format upgrades can be made such that it becomes much more workable.

Contributor

dom96 commented Jul 4, 2021

This was shared as a draft to me so I already read most of this and gave my feedback, most (if not all) of which was incorporated into this RFC (thanks @haxscramper!)

I would really love to see all this implemented, some of it is already a part of nim-lang/nimble#913 which we'll hopefully be able to merge soon. I hope we can move fast with these proposals, most seem quite nicely isolated so maybe we could create issues to outline the sub-tasks to implement them, that way we can also keep track of who's working on what and what the progress is :)

As we discussed, for the dependency resolution changes the prerequisite is to create some nice test infra to reproduce the issues that our users faced. I think this is one of the most important things we can work on so we can get insight into the best way to fix them.

Author

haxscramper commented Jul 6, 2021

Completely static toplevel

A little more aggressive extension of the ideas proposed in the manifest evaluation section. The idea is simple - make all package metadata static. Based on the analysis of existing package configurations, approximately 99.288% of all configuration values are already written in 'canonical' form.

To be more specific, 'canonical' value format for different metadata fields is

field        | description                                                     | canonical form
version      | Package version string                                          | "major.minor.patch"
author       | Author's name                                                   | "<arbitrary string>"
description  | Short package description                                       | "<arbitrary string>"
packagename  | Package name                                                    | "<string>"
license      | Package license                                                 | "license name"
srcdir       | Source directory to take installation files from               | "relative/path"
bindir       | Directory where nimble build will output binaries              | "path"
backend      | Compilation backend                                             | One of "c", "cc", "cpp", "objc", "js"
bin          | List of compiled binary files                                   | @["path1", "path2"]
skipdirs     | Directories to skip while installing                            | @["path1", "path2"]
skipext      | Extension to skip while installing                              | @["ext1", "ext2"]
installfiles | List of files which should be exclusively installed            | @["path1", "path2"]
installdirs  | List of directories which should be exclusively installed      | @["path1", "path2"]
skipfiles    | List of file names which should be skipped during installation | @["path1", "path2"]
installext   | Extension to use while installing                               | @["ext1", "ext2"]
namedbin     | Path-name mapping for binary files                              | {"path": "name"}

Note that namedbin is currently declared as Table[], but due to extremely rare usage (4 times directly assigned and two more in form of namedBin["XXXX"] = "YYYY") this can be replaced with {"string", "string"}, even though it would break all uses.

Aside from metadata, a package manifest might contain:

  • Helper procedure declarations used in task, after and before hooks.
  • when section for optional requires, foreign dependencies and assignments to some metadata fields
  • task, before and after hooks
  • imports or includes

Right now it is possible to put arbitrary nimscript code at the toplevel, and it would be executed each time I want to access some package metadata. For example, if I want to know the package version and the file contains code like mkDir(), it would make me a directory, somewhere, each time.

Instead, I suggest that .nimble is statically rewritten into executable nimscript by nimble and then executed. It is not really different from the current implementation, where the .nimble file is converted to nimscript that includes an API file in which task is defined as a template that modifies the global list of nimbleTasks only when the code runs.

template task*(name: untyped; description: string; body: untyped): untyped =
  proc `name Task`*() = body

  nimbleTasks.add (astToStr(name), description) # < This part is placed at nimscript toplevel
  
  if actionName.len == 0 or actionName == "help":
    success = true
  elif actionName == astToStr(name).normalize:
    success = true
    `name Task`()

nimble can perform file rewrite only based on AST - declare name Task procedure that would be executed when nimble Task is run.

Make binary build optional

This was first discussed in context of requires nimble potentially overwriting active nimble binary with a new one, even if that's undesirable. Hybrid packages were introduced to solve the problem "I have a binary package, but I also want my API to be reusable", but at the same time it is currently not possible to request only API part of a package.

IDEA: Allow requiring only the library or binary part of a package as requires nimble/lib or requires nimble/bin. The current requirement format stays the same to avoid breakages, but if a user needs better control over what's installed, it should be supported.

EXAMPLE:

  • There is a package nimble that provides a binary and an API. API is for code reuse. These are the "features" the package offers. By "features" I do not mean it in the context of dependency resolution but simply "what stuff I can do with it" (like I can run nimble install, or nimble dump or import nimblepkg/version).
  • I want to reuse API that developer provided, but not really interested in absolutely all features of nimble package. More specifically, I'm interested in the "library part"
  • In order to avoid unnecessary costs (building, disk storage, additional dependencies, other inconveniences like packages overriding each other) I opt to explicitly communicate my intentions by listing the subset of features I'm interested in, specifically library - i.e. nimble/lib, nimble/bin or both (simply nimble)

NOTE: That is all modeled as package features (already mentioned in context of optional dependencies). We can only do this for hybrid packages and os-specific dependencies. Latter one is automatically set by package manager - user can't set os = windows. Hybrid package choice is done using nimble/lib.
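A minimal Python sketch of how such requirement strings could be interpreted, assuming the nimble/lib and nimble/bin spelling from the IDEA above. The names Requirement, parse_requirement and parts_to_install are hypothetical, and version ranges are left out to keep it short.

```python
from typing import NamedTuple

# The two parts a hybrid package can offer, per the proposal above.
KNOWN_PARTS = frozenset({"lib", "bin"})


class Requirement(NamedTuple):
    package: str
    parts: frozenset


def parse_requirement(spec: str) -> Requirement:
    """Split `package/part`; a bare `package` keeps today's meaning (everything)."""
    package, _, part = spec.strip().partition("/")
    if not part:
        return Requirement(package, KNOWN_PARTS)
    if part not in KNOWN_PARTS:
        raise ValueError(f"unknown package part {part!r} in {spec!r}")
    return Requirement(package, frozenset({part}))


def parts_to_install(specs) -> dict:
    """Union the requested parts per package across all requires lines."""
    wanted = {}
    for spec in specs:
        requirement = parse_requirement(spec)
        already = wanted.get(requirement.package, frozenset())
        wanted[requirement.package] = already | requirement.parts
    return wanted


print(parts_to_install(["nimble/lib", "jester"]))
```

With this reading, requires nimble/lib alone would never trigger a binary build, while the plain requires nimble still installs both parts, so existing packages keep working.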

Provide better control over build configuration

Introduce build section that provides a way to give a better description of the dependencies related to particular build target. Right now, it is not possible to specify different backends for each build target, or different requirements.

build backend:
  requires "jester"
  backend = "c"
  # optionally you can override how this gets built
  # by writing exec("nim c blah"), otherwise you get this
  # by default

build frontend:
  requires "karax"
  backend = "js"

@haxscramper
Author

In anticipation of potential "scope creep" comments about this RFC - I personally would prefer to have a clear plan that is thought through and includes a description of most of the required changes, instead of going through multiple smaller RFCs that have to be coordinated with each other ("RFC for build section", "RFC for static toplevel" etc.)

When this RFC is discussed and finalized, I want to write a recap that includes a clear list of actionable TODO steps, or a specification similar to the final draft I did for #245 (comment)

@Araq
Member

Araq commented Jul 6, 2021

I am happy to contribute a .nimble file parser that can process a .nimble file without executing it, I have the code lying around somewhere. The Nimble tool itself should use this code so that the restrictions are enforced properly.

@Araq
Member

Araq commented Jul 15, 2021

The code is here now, nim-lang/Nim#18497 (parse_requires.nim module)

@haxscramper
Author

haxscramper commented Aug 3, 2021

I've found the "Statistics for standard library usage" thread and decided to generate updated stats. It is probably related more to #310 than to nimble itself, but considering that a good package manager almost always comes up in discussions related to stdlib evolution, I decided it is fine if I add the stats here (and #310 is closed anyway). Stat script - haxscramper/hack@90e7670

note: I update stats from time to time, so you might also be interested in comparing file/package counts, or popularity growth rate.

module               files (of 38051)     fraction of files    packages (of 2265)   fraction of packages
exitProcs            1                    0.0000               1                    0.0004
exceptions           1                    0.0000               1                    0.0004
prelude              1                    0.0000               1                    0.0004
jsheaders            2                    0.0001               2                    0.0009
vmutils              2                    0.0001               2                    0.0009
formatfloat          2                    0.0001               2                    0.0009
ansi_c               2                    0.0001               2                    0.0009
ssl_certs            2                    0.0001               2                    0.0009
distros              2                    0.0001               2                    0.0009
jsformdata           2                    0.0001               2                    0.0009
future               2                    0.0001               2                    0.0009
oswalkdir            2                    0.0001               2                    0.0009
jsfetch              2                    0.0001               1                    0.0004
memory               2                    0.0001               2                    0.0009
posix_utils          2                    0.0001               2                    0.0009
asyncstreams         2                    0.0001               2                    0.0009
selectors            3                    0.0001               3                    0.0013
smtp                 3                    0.0001               3                    0.0013
memfiles             3                    0.0001               3                    0.0013
async                3                    0.0001               3                    0.0013
jsbigints            3                    0.0001               3                    0.0013
channels             3                    0.0001               3                    0.0013
sqlite3              3                    0.0001               3                    0.0013
parsecsv             4                    0.0001               2                    0.0009
typeinfo             4                    0.0001               4                    0.0018
punycode             4                    0.0001               2                    0.0009
threadpool           4                    0.0001               4                    0.0018
tempfiles            4                    0.0001               2                    0.0009
cstrutils            4                    0.0001               2                    0.0009
dynlib               4                    0.0001               4                    0.0018
rlocks               4                    0.0001               4                    0.0018
socketstreams        4                    0.0001               2                    0.0009
time_t               5                    0.0001               5                    0.0022
tasks                5                    0.0001               4                    0.0018
lexbase              5                    0.0001               3                    0.0013
effecttraits         5                    0.0001               3                    0.0013
parsesql             5                    0.0001               3                    0.0013
sysrand              5                    0.0001               5                    0.0022
volatile             5                    0.0001               3                    0.0013
jscore               5                    0.0001               3                    0.0013
unidecode            6                    0.0002               2                    0.0009
segfaults            6                    0.0002               3                    0.0013
mersenne             6                    0.0002               2                    0.0009
asyncfile            6                    0.0002               6                    0.0026
asyncfutures         6                    0.0002               3                    0.0013
stackframes          6                    0.0002               2                    0.0009
sharedlist           6                    0.0002               2                    0.0009
cpuinfo              6                    0.0002               6                    0.0026
rationals            6                    0.0002               2                    0.0009
strmisc              6                    0.0002               4                    0.0018
rdstdin              6                    0.0002               6                    0.0026
xmlparser            7                    0.0002               4                    0.0018
browsers             7                    0.0002               7                    0.0031
db_common            8                    0.0002               6                    0.0026
encodings            8                    0.0002               8                    0.0035
parsexml             8                    0.0002               6                    0.0026
setutils             8                    0.0002               2                    0.0009
db_sqlite            8                    0.0002               4                    0.0018
marshal              8                    0.0002               8                    0.0035
strbasics            8                    0.0002               2                    0.0009
packedsets           8                    0.0002               2                    0.0009
wrapnils             8                    0.0002               2                    0.0009
nre                  8                    0.0002               8                    0.0035
logic                8                    0.0002               4                    0.0018
varints              9                    0.0002               3                    0.0013
cookies              9                    0.0002               6                    0.0026
importutils          9                    0.0002               3                    0.0013
macrocache           9                    0.0002               8                    0.0035
ropes                9                    0.0002               4                    0.0018
enumutils            10                   0.0003               2                    0.0009
sums                 10                   0.0003               4                    0.0018
pegs                 10                   0.0003               5                    0.0022
htmlgen              10                   0.0003               6                    0.0026
dom                  10                   0.0003               6                    0.0026
parsejson            10                   0.0003               7                    0.0031
re                   10                   0.0003               7                    0.0031
htmlparser           11                   0.0003               9                    0.0040
genasts              12                   0.0003               4                    0.0018
jsconsole            13                   0.0003               3                    0.0013
fenv                 13                   0.0003               5                    0.0022
critbits             13                   0.0003               7                    0.0031
lenientops           14                   0.0004               7                    0.0031
mimetypes            15                   0.0004               7                    0.0031
lists                15                   0.0004               11                   0.0049
decls                16                   0.0004               7                    0.0031
parsecfg             16                   0.0004               11                   0.0049
jsonutils            16                   0.0004               7                    0.0031
asyncjs              17                   0.0004               5                    0.0022
asyncnet             17                   0.0004               15                   0.0066
asynchttpserver      17                   0.0004               12                   0.0053
editdistance         17                   0.0004               8                    0.0035
heapqueue            18                   0.0005               9                    0.0040
endians              19                   0.0005               15                   0.0066
md5                  19                   0.0005               17                   0.0075
jsffi                21                   0.0006               5                    0.0022
compilesettings      22                   0.0006               6                    0.0026
stats                22                   0.0006               12                   0.0053
xmltree              23                   0.0006               14                   0.0062
complex              23                   0.0006               5                    0.0022
strscans             24                   0.0006               14                   0.0062
enumerate            25                   0.0007               13                   0.0057
logging              25                   0.0007               15                   0.0066
isolation            28                   0.0007               6                    0.0026
cgi                  29                   0.0008               20                   0.0088
atomics              29                   0.0008               7                    0.0031
colors               30                   0.0008               18                   0.0079
intsets              31                   0.0008               11                   0.0049
exitprocs            32                   0.0008               20                   0.0088
base64               32                   0.0008               20                   0.0088
oids                 33                   0.0009               9                    0.0040
wordwrap             35                   0.0009               23                   0.0102
locks                36                   0.0009               12                   0.0053
httpcore             37                   0.0010               21                   0.0093
httpclient           47                   0.0012               33                   0.0146
parseopt             49                   0.0013               23                   0.0102
nativesockets        56                   0.0015               29                   0.0128
terminal             56                   0.0015               40                   0.0177
deques               62                   0.0016               25                   0.0110
bitops               65                   0.0017               27                   0.0119
posix                65                   0.0017               32                   0.0141
with                 73                   0.0019               23                   0.0102
net                  78                   0.0020               35                   0.0155
osproc               87                   0.0023               56                   0.0247
monotimes            89                   0.0023               55                   0.0243
strtabs              91                   0.0024               32                   0.0141
sha1                 92                   0.0024               53                   0.0234
uri                  99                   0.0026               44                   0.0194
asyncdispatch        102                  0.0027               39                   0.0172
parseutils           107                  0.0028               53                   0.0234
typetraits           109                  0.0029               48                   0.0212
unicode              127                  0.0033               54                   0.0238
random               142                  0.0037               65                   0.0287
sugar                149                  0.0039               48                   0.0212
streams              154                  0.0040               65                   0.0287
hashes               158                  0.0042               57                   0.0252
json                 193                  0.0051               80                   0.0353
algorithm            194                  0.0051               85                   0.0375
sets                 213                  0.0056               62                   0.0274
unittest             250                  0.0066               60                   0.0265
math                 285                  0.0075               117                  0.0517
times                325                  0.0085               123                  0.0543
macros               344                  0.0090               98                   0.0433
sequtils             389                  0.0102               83                   0.0366
options              412                  0.0108               82                   0.0362
strformat            442                  0.0116               125                  0.0552
tables               518                  0.0136               119                  0.0525
os                   730                  0.0192               225                  0.0993
strutils             1283                 0.0337               321                  0.1417

@haxscramper
Author

haxscramper commented Sep 7, 2021

It seems like I've missed this in my original comment, but nimble develop does add something similar to nim.cfg. It is not entirely clear from the description what is the main purpose (intended workflow, how it replaces old nimble develop functionality), but it seems like this file can be used by other tools, maybe? It also adds another configuration format that other tools would have to read, which is less than ideal (we already have .cfg and .nims), but this RFC would have to be updated to account for this.

@haxscramper
Author

haxscramper commented Sep 7, 2021

And again, because a HUGE portion of the PR for "lock files" (but-actually-a-lot-more-than-just-lock-files) was discussed elsewhere ("have to implement two additional features for which zah specifically insisted."), some things might need to be re-evaluated to account for the changes in workflows, general expectations and so on.

@haxscramper
Author

haxscramper commented Sep 19, 2021

Related - nim-lang/nimble#921, for OS-level dependencies. Supporting other build systems somehow might also be nice. Providing full dependency resolution is an overkill IMO, but since we already have foreignDeps, distros etc. this could be integrated.

Also partially related - #414, though package manager should generally be concerned with packages and libraries, but some functionality is shared (for include files etc). Building various artefacts via nimble would benefit from being able to at least explicitly search for dependency availability instead of failing somewhere mid-way when running cmake for the dependency.

@haxscramper
Author

haxscramper commented Nov 3, 2021

I'm no longer interested in positioning myself as a main driving force behind the effort to steer nimble design. If the nim core development team considers my original goals worth pursuing, they can reopen the RFC. I want to reiterate that I am just no longer willing to aim for a role in the nim package management discussion, but at the same time I was told the RFC itself has many good ideas, so I'm not opposed in any way to someone keeping it open. Only that I no longer have the incentive to push for the ideas.

I'm moving to https://github.com/disruptek/nimph instead since it already supports all of my necessary workflows, including issues outlined above (such as simple sequential order of [pm] -> [compiler], support for external tools (I no longer have to teach haxdoc how to resolve nimble packages)).

@dom96
Contributor

dom96 commented Nov 4, 2021

There are a lot of great ideas in this so I will reopen this.

CC @Araq, we discussed the need to have an explicit dependency graph, see the "Use explicit dependency graph in Nimble" section above for examples where this is necessary and what bugs it will resolve.

@dom96 dom96 reopened this Nov 4, 2021
haxscramper added a commit to haxscramper/hnimast that referenced this issue Nov 23, 2021
- CHANGED ::
  - Implementation improvements for the code pretty printer - more input
    nodes supported.
  - Clean up the implementation of the tree-sitter wrapper generator.
  - More predictable `treeRepr` output - nodes with comments are no longer
    split apart at random places.
  - Make a ton of `func` nodes into `proc`, because accessing comment field
    is no longer a side-effect-free operations, most likely due to the
    nim-lang/Nim#18760
- ADDED ::
  - Tree-sitter wrapper generator now can produce library wrappers that do
    not depend on hmisc for operation.
  - `addPragma` for enum declarations
- REMOVED ::
  - `nimble_aux.nim` and dependency on the nimble - I no longer work on the
    nim-lang/RFCs#398 and I see no reason to try
    and revese-engineer the dependency management solutions, `nimph`
    provides much better approach in this case (edit `nim.cfg`, then user
    can simply dump it as needed).
@Araq Araq mentioned this issue Dec 1, 2021
33 tasks
@Q-Master

I think that using MVS is not that good an idea. If I set some restrictions on the version of a library, I almost always want to get the maximum available version inside the range. This can be illustrated by Python's pip versioning.
E.g. I want any version of library XXX <= 3.0 because I know that 3.1 breaks compatibility and I don't want to fix the entire project for now, but (!!!) for example, later on Linux has 3.0, Windows only 2.8 and BSD only 2.7.14.
So in the case of MVS I must set 2 limits, the lower one (which I might not know at all for some rare OSes) and the higher one. I will then almost always get the lowest possible version of the library, including in cases where a newer one has fixes and runs better. It would take a very complex set of dependencies in my requirements to ever reach the upper bound of the limit.

@Araq
Member

Araq commented Dec 20, 2021

the lower one (which I might not know at all for some rare OSes) and the higher one

The idea is that you use as the minimal version the version that you actually tested your package with. I don't see how you could not "know it at all". Most Nimble packages are not OS specific either and if you don't support an OS, document it as such.

@Q-Master

Q-Master commented Dec 20, 2021

The idea is that you use as the minimal version the version that you actually tested your package with. I don't see how you could not "know it at all".

The idea is that, e.g., I've tested my code with 3.0, but later the author released 3.0.1 with bugfixes. With a <3.1 constraint it will work, but with >3.0 <3.1 I will always get the original 3.0 and need to manually tweak the requirements every time the lower bound needs to be pushed up.
Once a lower bound is set, the upper bound becomes redundant, because only the lowest required package is ever used, and only in very rare cases will the selected version get anywhere near the upper bound.
In real production I have never seen such weird behaviour in 20 years of development. Every time, either an exact version is pinned or only the upper bound is set. This makes it easy to update code on deploy without pushing anything anywhere when only bugfixes were released for the requirements: just rebuild and put it on staging for final QA.

@Araq
Member

Araq commented Dec 21, 2021

Well so you don't like the "minimal version" scheme, that's fine, but if we decide to follow it, you then would have to adapt your workflows. I'm not saying that we should adopt "minimal version", but it does make some sense. If you prefer to use version 3.0.1 because of the bugfixes it contains, version 3.0.1 is your minimal version. Either you depend on these bugfixes or you don't -- if you don't know if you do it seems safer to use the higher version number but that's nothing new, minimal version simply forces you to be honest about it: You tested it against 3.0.1 happily, so that's what your requirements really are, you don't want to go back to 3.0.0 then, it's risky and untested.

The problem with "pick the highest available version automatically (except for the 'breaking' versions)" is that nobody knows what it means as it keeps changing.
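A small Python sketch of what "it keeps changing" means in practice. The function highest_below and the two registry snapshots are made up for illustration; the version numbers are borrowed from the discussion above.

```python
def parse_version(text):
    """Turn "3.0.1" into (3, 0, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def highest_below(available, upper_bound):
    """Pick the newest published version strictly below upper_bound."""
    limit = parse_version(upper_bound)
    candidates = [v for v in available if parse_version(v) < limit]
    return max(candidates, key=parse_version)


# Hypothetical registry contents at two points in time.
snapshot_before = ["2.8", "3.0"]
snapshot_after = ["2.8", "3.0", "3.0.1", "3.1"]

# The same "< 3.1" requirement resolves differently as the registry grows.
print(highest_below(snapshot_before, "3.1"))
print(highest_below(snapshot_after, "3.1"))
```

The requirement text is identical in both calls, yet the build gets 3.0 in the first case and 3.0.1 in the second. Whether that counts as a free bugfix or as an untested change is exactly the disagreement in this thread.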

@Q-Master

Well, the "minimal version" will be a very weird decision and it makes no sense for real workflows, because:

  1. The code base will always be stuck on a minimal (buggy or vulnerable) version, because nobody in their right mind will check on every deployment (nowadays done by automation, not manually) whether there was an update for the dependencies, and for the dependencies of dependencies and so on recursively, and then update everything and push everyone to update their code to pick up the fixes.
  2. The load on the packaging system will be very heavy because of constant and continuous version updates of almost all packages whenever a common dependency of theirs gets a new version (chain reactions on every update).
  3. There is no sense in setting any range constraints, because in most cases all those constraints will be simplified to equality with the lower bound.

No other package manager used in production today follows the "minimal version" procedure. There might be a "higher than" constraint, but it always gets the latest version, not the minimal one. I think that those who wrote them and use them massively are not dumb and understand what they're doing.

@Araq
Member

Araq commented Dec 22, 2021

The "chain reaction" effect might not be as bad as you think it would be for the reason that "pick minimal version" is an incomplete way to describe the situation: The algorithm actually picks the maximum of the provided minima: If package A depends on B version 2.1 and package C depends on B version 2.2 and your package depends on A and C the version 2.2 will be picked up.
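The "maximum of the provided minima" rule can be sketched in a few lines of Python. This is a simplification under stated assumptions: requirements arrive as a flat, already collected list, whereas a real resolver has to walk the dependency graph because the requirements themselves depend on which versions get selected. The function name select_versions and the 1.0 versions of A and C are invented for the example.

```python
def parse_version(text):
    """Turn "2.2" into (2, 2) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def select_versions(requirements):
    """requirements: iterable of (dependent, dependency, minimal_version).

    For every dependency keep the highest of all stated minimal versions.
    """
    selected = {}
    for _dependent, dependency, minimum in requirements:
        candidate = parse_version(minimum)
        if dependency not in selected or candidate > selected[dependency]:
            selected[dependency] = candidate
    return {name: ".".join(map(str, version)) for name, version in selected.items()}


# The example from the comment: A needs B 2.1, C needs B 2.2, app needs A and C.
requirements = [
    ("A", "B", "2.1"),
    ("C", "B", "2.2"),
    ("app", "A", "1.0"),  # hypothetical version
    ("app", "C", "1.0"),  # hypothetical version
]

print(select_versions(requirements))
```

B resolves to 2.2 even though A only asked for 2.1, so a package is pulled forward as soon as anything in the graph states a higher minimum.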

@amaank404

Hey, I am new to nim and anything nimble related. Now here are a few things I would like you guys to think of:

  • Should overriding builtin task be allowed?
  • Should there be a separate target for dynamic libraries? (Currently we have to build dynamic libraries manually with the nim command.)
  • Should we also look at what Python does (setup.py) for separate files such as setup.nims/setup.nim instead of setup.nimble (Keeping compatibility with old nimble files).
    Please explain why/why not above feature should be implemented. :)

@CyberTailor

Package registry with dependency metadata

I believe centralized package registry should be optional, especially if it requires non-free software like GitHub.

Do not require full manifest evaluation
version <...> information is redundant - first one is already stored in git tags

It's actually useful when you download archive tarballs instead of cloning the repository.

Foreign dependencies might also be checked upon installation, but that can be implemented later

I like Meson's dependency() that checks both pkg-config and cmake (with internal implementation for some special cases like threads).

Dependency resolution

Bundler, CocoaPods and Shards use Molinillo algorithm, it's doing backtracking too. I think it could be ported for Nim.

Although I'd prefer to focus on building and delegate dependency resolution to general-purpose package managers.

@Varriount

I believe centralized package registry should be optional

Yes please. Speaking from experience with Python, requiring a central repository makes using package management in the private space a headache, as you end up having to host your own central repository.

@mratsim
Collaborator

mratsim commented May 31, 2023

Discussion on Discord on package resolution regarding the new Atlas: https://discord.com/channels/371759389889003530/768367394547957761/1113215118709366835

Use existing dependency solver like libsolv -- provides low-level features to implement dependency resolution on top. Would require additional effort to design integration of the nimble

See OpenSuse presentation at Fosdem2008 - https://en.opensuse.org/images/b/b9/Fosdem2008-solver.pdf
