Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Refactor the pipfile API into a functional one #57

Closed
wants to merge 1 commit into from

Conversation

dstufft
Copy link
Member

@dstufft dstufft commented Jan 27, 2017

This is a WIP, it does not currently actually do all of the things that it would need to do and some of the things it currently does are just flat out wrong. However, it's close enough that it gives a reasonable idea (I think) of what the API and usage would be.

The core idea here is that we have two objects, Pipfile and PipfileLock and these two objects are completely immutable. This makes it a ton easier to reason about in code (because nothing can ever change the object out from underneath you) and I think it's a good direction for these packaging libraries to head towards in general.

You don't directly create these objects in the public API, instead you're given a few functions that will parse either a Pipfile or a Pipfile.lock and give you back already created instances of them. In the case that you want to modify them you simply use the pyristant transformers API or the evolvers API and you can safely do that without having to worry about who else might hold a reference to that object. In addition, Pipfile objects have a lock() method which allow you to create a PipfileLock object given an existing Pipfile and a resolved set of sources and packages.

Basically, an example of what this API would look like in use is:

import pipfile


# Load our Pipfile and our Pipfile.lock from the examples directory, if the
# Pipfile.lock does not exist, it will be a `None`.
p, pl = pipfile.find("examples/")


if pl is not None or p.digests() != pl.digests:
    # In this case we need to do something to resolve the entire dependeny tree
    # since we do not have an existing lockfile or that lockfile is stale, this
    # also gives us a chance to provide any sources that may not be baked into
    # the Pipfile but that should be baked into the Pipfile.lock.

    # In this example, we'll just reuse the sources given to us in the Pipfile.
    sources = p.sources

    # This is where we might resolve our dependencies, against since this is
    # just an example, we're going to just mirror what was given to us in the
    # Pipfile.
    # Note: This is actually wrong, since it should be pinning exact versions
    #       here. Ideally we would have an invariant that would protect against
    #       what we're doing here, but for now this is a good enough demo.
    packages = p.packages

    # Actually generate our lockfile, this is done by calling the Pipfile.lock()
    # method, which will ensure that we populate the lock file with computed
    # data such as the digests automatically.
    pl = p.lock(sources=sources, packages=packages)


# Here is where a hypothetical project could then consume the now created
# lockfile to actually do the install of the projects that we need.
do_the_install(pl)


# Now that the install is done, we can either exit, or if we were supposed to
# write out our files we could go ahead and do that as well. In this example
# we will rewrite out the files to where they need to go.
pipfile.save(p, pl, root="examples/")

While this PR currently does handle reading and saving files (and this works well enough for the lock files) it needs to get a lot better at how it handles the saving out a TOML file. In particular, it currently just completely overwrites the existing TOML file which blows away any comments and possibly changes the formatting of the file itself. This is largely a shortcoming of the TOML library that we're using, but ideally we'd be able to pass in an existing file to dumps() in the case of attempting to dump a Pipfile and it would instead merge our changes into that file instead of overwriting it. This merging could happen automatically in the case of pipfile.save().

Here are a few things to think about in terms of this PR:

  • Currently we have a singular set of functions like dumps(), dump(), loads(), and load() that handle both Pipfile and Pipfile.lock. In the case of the load functions it asks for the Pipfile and optionally you can pass in a Pipfile.lock if they exist, but the dump functions it only accepts one at a time, either a Pipfile or a PipfileLock. Should we instead have separate functions for loading/dumping a Pipfile and a PipfileLock?
  • The pipfile.save() function will (eventually) re-do the recursive search up for Pipfile and Pipfile.lock, and failing to find one it will just save them in the root directory (defaults to .). This does cause a race condition where someone could place a Pipfile in the directory tree between the root directory and where we originally loaded the files from. If this is a case that we care about, then perhaps the ideal thing to do is include the file path that we actually loaded the files from in the pipfile.find() API.
    • One problem with this is the Pipfile and PipfileLock objects are intentionally completely divorced from any sort of filesystem or even filesystem representation so we can't just attach the filenames/locations we loaded them from to those objects themselves, we could return it as another value in the tuple returned by pipfile.find(), however we need to either return a single value which is the directory we loaded them from (and we possibly get the same race condition if someone writes a Pipfile after we read from a pipfile... though I don't think this is a case we need to care about) or we need to return 2 additional values that point to the specific files which starts to get a lot more unwieldy. Personally I would just change the pipfile.find() API to return (directory, Pipfile, PipfileLock or None).
  • We currently do not do the digest checking for a stale Pipfile.lock file, leaving that up to the calling project. I think that this is the right thing to do, because different projects may want to do different things with it (for example, some may want to just install from the lockfile anyways, or some might want to ignore it and regenerate the lockfile). I could also see a use case for the same project to do different things depending on it's configuration.
  • Given Pipfile and PipfileLock are immutable, modifying them requires using the pyrisistent transformers API. This is a powerful API that allows a lot of flexibility in modifying an immutable object, you can see below for an example of modifying a Pipfile object. This is a bit more work than just using mutable objects, but there are nice benefits to using immutable objects, and generally the primary unit of work is going to be taking an existing Pipfile, resolving it's dependencies, and then writing it out to a Pipfile.lock (aka, what Pipfile.lock() does). If there are other things that are common we can also add additional shortcuts to them as well.
import re

import pipfile
from pipfile._structures import Requirement
from pyrsistent import ny, rex

p, pl = pipfile.find("examples/")

# Let's modify the source, say to rewrite it from using legacy PyPI to using
# Warehouse instead.
# Note, the original p is *not* modified, we only get the new value as `p`
# because we're assigning it again. If anything else held a reference to the
# original `p` they would still hold it and it would not be modified.
p = p.transform(
    # This translates to doing p.sources[ANY]["url"], it is defining a "path"
    # to traverse down the object to find the thing you want to modify.
    ["sources", ny, "url"],
    # This defines the action, since we're just passing in a function here, it
    # will be called with the value and the return value will be set.
    lambda u: u if not re.search(r"^https?://pypi\.python\.org/simple/?$", u) else "https://pypi.org/simple/",
)

# Let's say we want to add a new dependency as well, we can go ahead and do
# that too.
# TODO: In doing this I discovered this case is kind of crummy and requires
#       importing from a private module. I assuming that appending a new
#       source will too. We should probably add some kind of shortcut to doing
#       this to make it easier. Possibly something like
#        Pipfile().packages["default"].add_project()?
p = p.transform(
    ["packages", "default"],
    p.packages["default"].append(Requirement.create({"name": "foobar"})),
)

That is a whole lot of words. I want to see what others think before I push too much further down this path, but overall I think this creates a pretty decent setup that gives nice, immutable objects while also providing good convenience APIs.

@kennethreitz
Copy link
Contributor

in order to get this workable, let's get the searching working again.

@dstufft
Copy link
Member Author

dstufft commented Feb 1, 2017

Yea, Weds and Thurs are my OSS days, so I'm working on this today.

@kennethreitz
Copy link
Contributor

looking forward to it :)

@kennethreitz
Copy link
Contributor

@dstufft any update on this? should i try to take it over from here?

@kennethreitz
Copy link
Contributor

ping

@pradyunsg
Copy link
Member

@dstufft Ping?

@kennethreitz
Copy link
Contributor

P.S. I'm not against our current API  — I think it's quite a sane approach, but after looking at the Pip codebase, I understand that it takes immutability into consideration quite often.

A part of me thinks that this is overkill for this project, as these are quite simple functions.

@kennethreitz
Copy link
Contributor

But I defer to you @dstufft :)

@kennethreitz
Copy link
Contributor

P.S. I think it would possibly also reduce the maintainability of this library.

@jmcarp
Copy link

jmcarp commented Nov 4, 2017

@kennethreitz @dstufft: is this still in progress, or did you all decide to keep the API as-is? I'm looking forward to pyupio/safety#47, but it looks like it's blocked on this for now.

@audiolion
Copy link

Agreed, it would be really nice for people looking to start supporting Pipfile if the API is stable

@dstufft dstufft closed this Jun 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants