Update roadmap with a clearly articulated security model & strategy #5718

glyph · 2019-04-18T23:30:32Z

What's the problem this feature will solve?
Right now, PyPI has a way to report a security issue, but no clear description of what a "security issue" might be. Efforts like #5567 will improve the security of the site, but to what end?

Meanwhile, attacks against the open source supply chain are escalating, and more typo-squatting malware gets posted to PyPI every day.

Describe the solution you'd like

I'd like https://pypi.org/security/ to describe the threat model of PyPI and what properties it attempts to provide. In particular, what constitutes a security issue that should be reported?
I'd like https://warehouse.readthedocs.io/security/ to describe what properties it would like to provide in the long term. Particularly, where do efforts like the TOTP work fit into a long-term vision for the security of the site and for its users?

brainwane · 2019-11-19T00:17:04Z

This evening I gave a talk to some students in an application security class, and figured my notes could be used to start addressing this issue.

The section headings are borrowed from the textbook The Art of Software Security Assessment: Identifying and Preventing Software Vulnerabilities by Mark Dowd et al. (Chapter 4, Application Review Process):

General application purpose—What is the application supposed to do?
Fundamental security expectations—What security expectations do legitimate users of this application have?
Assets and entry points—How does data get into the system, and what value does the system have that an attacker might be interested in?
Components and modules—What are the major divisions between the application’s components and modules?
Intermodule relationships—At a high level, how do different modules in the application (within Warehouse) communicate?
Major trust boundaries—What are the major boundaries that enforce security expectations?

General application purpose: What is PyPI/Warehouse?

Glossary.

language-specific platform for sharing packages -- both libraries and applications
part of a toolchain; https://packaging.python.org/ covers the official open source tools for uploading and downloading (most people use PyPI by downloading via pip)
Since reads are much more common than writes (much more goes out than goes in), we try to cache as much as possible.
sdists and wheels -- we are indeed hosting binaries that we haven't inspected -- more at https://packaging.python.org/
History

Fundamental security expectations: Users and what they can do
Reuse user classes from docs and owners vs maintainers.

How do you become one of these kinds of users? This is defined per project namespace. The initial project Owner is the first person to upload a project to PyPI with that project name.

What can these different owners do? See #5863.

But also! ALL users, including people who are not logged in, can read the records of package activity.

Assets and entry points
How does data get into the system, and what value does the system have that an attacker might be interested in?

API: Packages and projects get into the system via the API (users use Twine).
Web browser: Initial user creation, a lot of privilege creation/change/deletion, and the administrative interface

Components and modules

https://warehouse.readthedocs.io/application/ goes over this a bit.

Pyramid, our web application framework
Database access (we use SQLAlchemy and Postgres)
Auth
Token generation (Macaroons)

Major trust boundaries
What are the major boundaries that enforce security expectations?

Login: API and browser-based
User privileges as defined in the database

brainwane · 2019-11-19T00:21:44Z

There are a few items in #2794 (comment) that should also be in such a document, such as release immutability.

brainwane · 2020-01-17T19:10:02Z

In this discussion thread, @tiran says:

I would like to see a general and user-oriented PEP about PyPI security to answer these questions:

How is a package owner/maintainer able to verify that PyPI is serving correct and unmodified files?
As a user of PyPI how can I make sure that pip installs correct and unmodified packages?
As a user of PyPI how can I protect myself against typo-squatting attacks or compromised versions of a package?

and Donald Stufft notes,

this feels to me more like something that should be documented either on PyPI or as part of packaging.python.org.

I think documentation of the answers to those questions ought to be incorporated into the documentation push @glyph is suggesting.
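As a starting point for the second question, here is a minimal sketch of checking a downloaded file against a digest obtained out of band. The function names `sha256_of` and `matches_expected` are made up for illustration, and the sketch assumes the expected digest itself comes from a source the user trusts.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected_sha256: str) -> bool:
    """Return True if the file's digest equals the expected hex digest."""
    return hmac.compare_digest(sha256_of(path), expected_sha256.lower())
```

pip's hash-checking mode (`--require-hashes`, with `--hash=sha256:...` entries in a requirements file) performs this kind of comparison at install time. It shows that the file matches the recorded digest; it says nothing about whether the publisher is trustworthy.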

tiran · 2020-01-17T22:33:00Z

Thanks @brainwane

My thought-provoking, inconvenient, and brutally honest opinion is: PyPI won't be able to deliver this in its current shape and design. Sooner or later we have to consider a different model that works more like current app stores or Linux distributions. I'm talking about curated content.

I have been thinking about the matter for a while. All I have so far is a half-baked, handwavy proposal of a three-layered index:

1. Standard PyPI as it works today.
2. A filtered subset of PyPI that offers only projects that have gone through a review process.
3. A subset of (2) that requires each upload, release, and uploader to go through a vetting and verification process.

Layer (2) should get rid of typo-squatting. Layer (3) requires considerable effort but might be a way to generate revenue to support maintenance of PyPI and its tooling.
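One check a review process for layer (2) might include is flagging new names that closely resemble well-known ones. The sketch below is only an illustration: `possible_typosquat`, the list of popular names, and the 0.8 threshold are all assumptions, and string similarity from the standard library's `difflib` is a stand-in for whatever a real reviewer or tool would use.

```python
from difflib import SequenceMatcher


def possible_typosquat(candidate, popular_names, threshold=0.8):
    """Return the popular name that `candidate` most resembles, or None.

    An exact match is not flagged, since it is the real project.
    """
    candidate = candidate.lower()
    best_name, best_score = None, 0.0
    for name in popular_names:
        if candidate == name.lower():
            return None
        score = SequenceMatcher(None, candidate, name.lower()).ratio()
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None


# Hypothetical list of well-known project names.
popular = ["requests", "numpy", "django"]
suspect = possible_typosquat("reqeusts", popular)
```

A similarity check like this yields false positives and misses lookalikes that are not close in spelling, so it would at most feed a human review rather than replace one.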

ncoghlan · 2020-02-06T12:13:12Z

PyPI is a publishing platform, not a curation platform, and building a language specific curation service doesn't make sense. It's unfortunate that Red Hat chose not to fund further work on https://fedoraproject.org/wiki/Env_and_Stacks/Projects/SoftwareComponentPipeline, but that's still well outside the scope of PyPI, and it's honestly well outside the scope of the PSF as well.

PyPI's job is to make sure that users can verify that what they installed is what the publisher uploaded.

Determining whether or not a particular publisher is trustworthy is a whole different story, and the onus for that will always remain primarily on consumers.

di added the documentation label Apr 30, 2019

brainwane modified the milestone: Package signing & detection/verification Jun 19, 2019

brainwane added the developer experience and needs discussion labels Jun 21, 2019
