Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The case for removing poetry.lock? #2835

Open
ashleysommer opened this issue Jul 24, 2024 · 3 comments
Open

The case for removing poetry.lock? #2835

ashleysommer opened this issue Jul 24, 2024 · 3 comments
Assignees

Comments

@ashleysommer
Copy link
Contributor

We've been seeing increasingly often situations where CI Jobs for PRs are failing because the poetry.lock file is out of sync with the pyproject.toml file, or where there is a hash conflict in the poetry.lock file. These are usually caused by dependabot updating only the pyproject.toml and not the poetry.lock file (though it is supposed to do both), or dependabot generating an invalid poetry.lock file, that happens sometimes.

There are lots of other ways it can get out of sync, including:

  • Merging many dependabot PRs in a row without rebasing subsequent ones
  • contributors running poetry update to bump the lockfile before creating their PR
  • merging main into an old PR before merging the PR
  • or merging an old PR without updating the lockfile.

Documentation about this on from Poetry:

Committing this file to VC is important because it will cause anyone who sets up the project to use the exact same versions of the dependencies that you are using. Your CI server, production machines, other developers in your team, everything and everyone runs on the same dependencies, which mitigates the potential for bugs affecting only some parts of the deployments. Even if you develop alone, in six months when reinstalling the project you can feel confident the dependencies installed are still working even if your dependencies released many new versions since then. (See note below about using the update command.)

And also from the Poetry docs:

For libraries it is not necessary to commit the lock file.

Again this comes down to the old question, is RDFLib an application or a library? If it is an application, we want everyone who contributes to RDFlib to be using exactly the same dependency versions, and producing exactly the same builds, and we need to build RDFLib in a reproducible way.

If RDFlib is a library, then we can simply include a compatibility-range of dependency versions in the pyproject.toml file, and it is up to the end-user's package management tool to install the right versions that work with the final application.

Personally I only ever use RDFLib as a library. All of my applications, even small scripts, have their own dependency list and start with import rdflib. To me that makes RDFLib a library. But I know there are others who use RDFLib only for the built-in cli tools. So to them, RDFLib is an application.

See this Stackoverflow Answer for a well thought out response to this same issue: https://stackoverflow.com/a/61076546

And also this comment from the stackoverflow answer:

I help maintain a number of closed and open source projects, and they all commit lockfiles, partly because I advocated in favor of it. By now I regret that choice, because it occurs quite often that someone's build is not working and the solution is to delete and re-build the lockfile, after which all of us end up having merge conflicts. –
Arne Commented Apr 7, 2020 at 12:02

Its now 2024 and it seems people (inlcuding us) are facing the same issue.

This is an argument from that thread in favor of keeping the lockfile:

Poetry's lock file is an universal lock file.

This means that Poetry doesn't care about the current environment, neither the Python version in use, nor the platform. Instead it makes sure that dependencies are resolvable within the given Python version range in pyproject.toml. This results in a lock file that is valid on any platform with a Python version within the range given in the pyproject.toml.

This difference to other tools, that produces lock file, is also the reason why Poetry is slower in resolving dependencies. This is also the reason why it is recommended to check in the poetry.lock in your vcs. Doing so, it speed up setting up your development environment and you make sure your environment is reproducible.

So we need to either
a) find a way to always keep everyone using the same lockfile, keep the lockfile up to date in main for every PR including dependabot PRs, and probably even ensure all contributors and all CI environments are using the same version of poetry.
or
b) remove the lockfile from the repo.

@ashleysommer
Copy link
Contributor Author

Note, I found this thread on the poetry issue tracker that describes the issue with content-hash, describing why it happens, and potential work-arounds.
python-poetry/poetry#496

@edmondchuc
Copy link
Contributor

What benefits do we currently gain from having the lockfile, besides ensuring a successful poetry install after cloning the repo?

Option b) does sound appealing due to less maintenance overhead.

For removing lockfile
Even with the application use case of RDFLib, as long as the dependency version constraints are specified in pyproject.toml, they should be supported and guaranteed to work. If there are any runtime issues, then we should update the dependency version constraints or have code paths that handle the dependency breaking changes in the application itself.

Any other considerations for option b)?

@nicholascar
Copy link
Member

Keep in 7.x, remove in 8.x after tidy up of pyproject.toml min versions

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants