[WIP] Refactor the pipfile API into a functional one #57

dstufft · 2017-01-27T05:16:10Z

This is a WIP, it does not currently actually do all of the things that it would need to do and some of the things it currently does are just flat out wrong. However, it's close enough that it gives a reasonable idea (I think) of what the API and usage would be.

The core idea here is that we have two objects, Pipfile and PipfileLock and these two objects are completely immutable. This makes it a ton easier to reason about in code (because nothing can ever change the object out from underneath you) and I think it's a good direction for these packaging libraries to head towards in general.

You don't directly create these objects in the public API, instead you're given a few functions that will parse either a Pipfile or a Pipfile.lock and give you back already created instances of them. In the case that you want to modify them you simply use the pyristant transformers API or the evolvers API and you can safely do that without having to worry about who else might hold a reference to that object. In addition, Pipfile objects have a lock() method which allow you to create a PipfileLock object given an existing Pipfile and a resolved set of sources and packages.

Basically, an example of what this API would look like in use is:

import pipfile


# Load our Pipfile and our Pipfile.lock from the examples directory, if the
# Pipfile.lock does not exist, it will be a `None`.
p, pl = pipfile.find("examples/")


if pl is not None or p.digests() != pl.digests:
    # In this case we need to do something to resolve the entire dependeny tree
    # since we do not have an existing lockfile or that lockfile is stale, this
    # also gives us a chance to provide any sources that may not be baked into
    # the Pipfile but that should be baked into the Pipfile.lock.

    # In this example, we'll just reuse the sources given to us in the Pipfile.
    sources = p.sources

    # This is where we might resolve our dependencies, against since this is
    # just an example, we're going to just mirror what was given to us in the
    # Pipfile.
    # Note: This is actually wrong, since it should be pinning exact versions
    #       here. Ideally we would have an invariant that would protect against
    #       what we're doing here, but for now this is a good enough demo.
    packages = p.packages

    # Actually generate our lockfile, this is done by calling the Pipfile.lock()
    # method, which will ensure that we populate the lock file with computed
    # data such as the digests automatically.
    pl = p.lock(sources=sources, packages=packages)


# Here is where a hypothetical project could then consume the now created
# lockfile to actually do the install of the projects that we need.
do_the_install(pl)


# Now that the install is done, we can either exit, or if we were supposed to
# write out our files we could go ahead and do that as well. In this example
# we will rewrite out the files to where they need to go.
pipfile.save(p, pl, root="examples/")

While this PR currently does handle reading and saving files (and this works well enough for the lock files) it needs to get a lot better at how it handles the saving out a TOML file. In particular, it currently just completely overwrites the existing TOML file which blows away any comments and possibly changes the formatting of the file itself. This is largely a shortcoming of the TOML library that we're using, but ideally we'd be able to pass in an existing file to dumps() in the case of attempting to dump a Pipfile and it would instead merge our changes into that file instead of overwriting it. This merging could happen automatically in the case of pipfile.save().

Here are a few things to think about in terms of this PR:

Currently we have a singular set of functions like dumps(), dump(), loads(), and load() that handle both Pipfile and Pipfile.lock. In the case of the load functions it asks for the Pipfile and optionally you can pass in a Pipfile.lock if they exist, but the dump functions it only accepts one at a time, either a Pipfile or a PipfileLock. Should we instead have separate functions for loading/dumping a Pipfile and a PipfileLock?
The pipfile.save() function will (eventually) re-do the recursive search up for Pipfile and Pipfile.lock, and failing to find one it will just save them in the root directory (defaults to .). This does cause a race condition where someone could place a Pipfile in the directory tree between the root directory and where we originally loaded the files from. If this is a case that we care about, then perhaps the ideal thing to do is include the file path that we actually loaded the files from in the pipfile.find() API.
- One problem with this is the Pipfile and PipfileLock objects are intentionally completely divorced from any sort of filesystem or even filesystem representation so we can't just attach the filenames/locations we loaded them from to those objects themselves, we could return it as another value in the tuple returned by pipfile.find(), however we need to either return a single value which is the directory we loaded them from (and we possibly get the same race condition if someone writes a Pipfile after we read from a pipfile... though I don't think this is a case we need to care about) or we need to return 2 additional values that point to the specific files which starts to get a lot more unwieldy. Personally I would just change the pipfile.find() API to return (directory, Pipfile, PipfileLock or None).
We currently do not do the digest checking for a stale Pipfile.lock file, leaving that up to the calling project. I think that this is the right thing to do, because different projects may want to do different things with it (for example, some may want to just install from the lockfile anyways, or some might want to ignore it and regenerate the lockfile). I could also see a use case for the same project to do different things depending on it's configuration.
Given Pipfile and PipfileLock are immutable, modifying them requires using the pyrisistent transformers API. This is a powerful API that allows a lot of flexibility in modifying an immutable object, you can see below for an example of modifying a Pipfile object. This is a bit more work than just using mutable objects, but there are nice benefits to using immutable objects, and generally the primary unit of work is going to be taking an existing Pipfile, resolving it's dependencies, and then writing it out to a Pipfile.lock (aka, what Pipfile.lock() does). If there are other things that are common we can also add additional shortcuts to them as well.

import re

import pipfile
from pipfile._structures import Requirement
from pyrsistent import ny, rex

p, pl = pipfile.find("examples/")

# Let's modify the source, say to rewrite it from using legacy PyPI to using
# Warehouse instead.
# Note, the original p is *not* modified, we only get the new value as `p`
# because we're assigning it again. If anything else held a reference to the
# original `p` they would still hold it and it would not be modified.
p = p.transform(
    # This translates to doing p.sources[ANY]["url"], it is defining a "path"
    # to traverse down the object to find the thing you want to modify.
    ["sources", ny, "url"],
    # This defines the action, since we're just passing in a function here, it
    # will be called with the value and the return value will be set.
    lambda u: u if not re.search(r"^https?://pypi\.python\.org/simple/?$", u) else "https://pypi.org/simple/",
)

# Let's say we want to add a new dependency as well, we can go ahead and do
# that too.
# TODO: In doing this I discovered this case is kind of crummy and requires
#       importing from a private module. I assuming that appending a new
#       source will too. We should probably add some kind of shortcut to doing
#       this to make it easier. Possibly something like
#        Pipfile().packages["default"].add_project()?
p = p.transform(
    ["packages", "default"],
    p.packages["default"].append(Requirement.create({"name": "foobar"})),
)

That is a whole lot of words. I want to see what others think before I push too much further down this path, but overall I think this creates a pretty decent setup that gives nice, immutable objects while also providing good convenience APIs.

kennethreitz · 2017-02-01T21:27:29Z

in order to get this workable, let's get the searching working again.

dstufft · 2017-02-01T21:28:24Z

Yea, Weds and Thurs are my OSS days, so I'm working on this today.

kennethreitz · 2017-02-01T23:29:31Z

looking forward to it :)

kennethreitz · 2017-05-25T19:17:57Z

@dstufft any update on this? should i try to take it over from here?

kennethreitz · 2017-08-24T21:43:46Z

ping

pradyunsg · 2017-09-02T17:08:54Z

@dstufft Ping?

kennethreitz · 2017-09-02T17:29:37Z

P.S. I'm not against our current API — I think it's quite a sane approach, but after looking at the Pip codebase, I understand that it takes immutability into consideration quite often.

A part of me thinks that this is overkill for this project, as these are quite simple functions.

kennethreitz · 2017-09-02T17:30:09Z

But I defer to you @dstufft :)

kennethreitz · 2017-09-02T17:30:34Z

P.S. I think it would possibly also reduce the maintainability of this library.

jmcarp · 2017-11-04T22:27:02Z

@kennethreitz @dstufft: is this still in progress, or did you all decide to keep the API as-is? I'm looking forward to pyupio/safety#47, but it looks like it's blocked on this for now.

audiolion · 2017-12-19T15:04:59Z

Agreed, it would be really nice for people looking to start supporting Pipfile if the API is stable

Refactor the pipfile API into a functional one

4ef0247

jayfk mentioned this pull request Feb 1, 2017

Support Pipfiles pyupio/pyup#197

Open

kennethreitz mentioned this pull request Feb 1, 2017

find's max_depth off by one? #65

Closed

jayfk mentioned this pull request Mar 24, 2017

Add Pipfile support pyupio/safety#47

Closed

dstufft closed this Jun 30, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[WIP] Refactor the pipfile API into a functional one #57

[WIP] Refactor the pipfile API into a functional one #57

dstufft commented Jan 27, 2017

kennethreitz commented Feb 1, 2017

dstufft commented Feb 1, 2017

kennethreitz commented Feb 1, 2017

kennethreitz commented May 25, 2017

kennethreitz commented Aug 24, 2017

pradyunsg commented Sep 2, 2017

kennethreitz commented Sep 2, 2017

kennethreitz commented Sep 2, 2017

kennethreitz commented Sep 2, 2017

jmcarp commented Nov 4, 2017

audiolion commented Dec 19, 2017

[WIP] Refactor the pipfile API into a functional one #57

[WIP] Refactor the pipfile API into a functional one #57

Conversation

dstufft commented Jan 27, 2017

kennethreitz commented Feb 1, 2017

dstufft commented Feb 1, 2017

kennethreitz commented Feb 1, 2017

kennethreitz commented May 25, 2017

kennethreitz commented Aug 24, 2017

pradyunsg commented Sep 2, 2017

kennethreitz commented Sep 2, 2017

kennethreitz commented Sep 2, 2017

kennethreitz commented Sep 2, 2017

jmcarp commented Nov 4, 2017

audiolion commented Dec 19, 2017