
PEP: 594
Title: Removing dead batteries from the standard library
Author: Christian Heimes <christian@python.org>
Status: Active
Type: Process
Content-Type: text/x-rst
Created: 20-May-2019
Post-History:

Abstract

This PEP proposes a list of standard library modules to be removed from the standard library. The modules are mostly historic data formats and APIs that were superseded a long time ago, e.g. those for Mac OS 9 and Commodore.

Rationale

Back in the early days of Python, the interpreter came with a large set of useful modules. This was often referred to as the "batteries included" philosophy and was one of the cornerstones of Python's success story. Users didn't have to figure out how to download and install separate packages in order to write a simple web server or parse email.

Times have changed. With the introduction of the cheese shop (PyPI), setuptools, and later pip, it became simple and straightforward to download and install packages. Nowadays Python has a rich and vibrant ecosystem of third-party packages. It's pretty much standard to either install packages from PyPI or use one of the many Python or Linux distributions.

On the other hand, Python's standard library is piling up cruft, unnecessary duplication of functionality, and dispensable features. This is undesirable for several reasons.

  • Any additional module increases the maintenance cost for the Python core development team. The team has limited resources, and reduced maintenance cost frees development time for other improvements.
  • Modules in the standard library are generally favored and seen as the de facto solution for a problem. A majority of users only pick third-party modules to replace a stdlib module when they have a compelling reason, e.g. lxml instead of xml. The removal of an unmaintained stdlib module increases the chances of a community-contributed module becoming widely used.
  • A lean and mean standard library benefits platforms with limited resources, like devices with just a few hundred kilobytes of storage (e.g. BBC Micro:bit). Python on mobile platforms like BeeWare or on WebAssembly (e.g. pyodide) also benefits from reduced download size.

The modules in this PEP have been selected for deprecation because their removal is either least controversial or most beneficial. For example, the least controversial are 30-year-old multimedia formats like the sunau audio format, which was used on SPARC and NeXT workstations in the late 1980s. The crypt module has fundamental flaws that are better solved outside the standard library.

This PEP also designates some modules as not scheduled for removal. Some modules have been deprecated for several releases or seem unnecessary at first glance. However, it is beneficial to keep these modules in the standard library, mostly for environments where installing a package from PyPI is not an option. These can be corporate environments or classrooms where external code is not permitted without legal approval.

  • The usage of FTP is declining, but some files are still provided over the FTP protocol, and some hosting providers offer FTP to upload content. Therefore ftplib is going to stay.
  • The optparse and getopt modules are widely used. They are mature modules with very low maintenance overhead.
  • According to David Beazley [5] the wave module is easy to teach to kids and can make crazy sounds. Making a computer generate crazy sounds is a powerful and highly motivating exercise for a 9yo aspiring developer. It's a fun battery to keep.

Deprecation schedule

3.8

This PEP targets Python 3.8. Version 3.8.0 final is scheduled to be released a few months before Python 2.7 reaches its end of life. We expect that Python 3.8 will be targeted by users that migrate to Python 3 in 2019 and 2020. To reduce churn and to allow a smooth transition from Python 2, Python 3.8 will neither raise DeprecationWarning nor remove any modules that have been scheduled for removal. Instead, deprecated modules will just be documented as deprecated. Optionally, modules may emit a PendingDeprecationWarning.

All deprecated modules will also undergo a feature freeze. No additional features should be added. Bugs should still be fixed.

3.9

Starting with Python 3.9, deprecated modules will start issuing DeprecationWarning.

3.10

In 3.10 all deprecated modules will be removed from the CPython repository together with tests, documentation, and autoconf rules.

PEP acceptance process

3.8.0b1 is scheduled to be released shortly after the PEP is officially submitted. Since it's improbable that the PEP will pass all stages of the PEP process in time, I propose a two-step acceptance process that is analogous to Python's two-release deprecation process.

The first, provisionally accepted phase targets Python 3.8.0b1. In the first phase no code is changed or removed. Modules are only documented as deprecated.

The final decision on which modules will be removed and how the removed code is preserved can be delayed for another year.

Deprecated modules

The modules are grouped as data encoding, multimedia, network, OS interface, and misc modules. The majority of modules are for old data formats or old APIs. Some others are rarely useful and have better replacements on PyPI, e.g. Pillow for image processing or NumPy-based projects to deal with audio processing.

Table 1: Proposed module deprecations

============  =============  =============  ==================
Module        Deprecated in  To be removed  Replacement
============  =============  =============  ==================
aifc          3.8            3.10           -
asynchat      3.6            3.10           asyncio
asyncore      3.6            3.10           asyncio
audioop       3.8            3.10           -
binhex        3.8            3.10           -
cgi           3.8            3.10           -
cgitb         3.8            3.10           -
chunk         3.8            3.10           -
colorsys      3.8?           3.10?          -
crypt         3.8            3.10           -
fileinput     3.8            3.10           argparse
formatter     3.4            3.10           -
fpectl        3.7            3.7            -
getopt        3.2            keep           argparse, optparse
imghdr        3.8            3.10           -
imp           3.4            3.10           importlib
lib2to3       -              keep
macpath       3.7            3.8            -
msilib        3.8            3.10           -
nntplib       3.8            3.10           -
nis           3.8            3.10           -
optparse      -              keep           argparse
ossaudiodev   3.8            3.10           -
pipes         3.8            3.10           subprocess
smtpd         3.7            3.10           aiosmtpd
sndhdr        3.8            3.10           -
spwd          3.8            3.10           -
sunau         3.8            3.10           -
uu            3.8            3.10           -
wave          -              keep
xdrlib        3.8            3.10           -
============  =============  =============  ==================

Data encoding modules

binhex

The binhex module encodes and decodes Apple Macintosh binhex4 data. It was originally developed for the TRS-80. In the 1980s and early 1990s it was used on classic Mac OS to encode binary email attachments.

Module type
pure Python
Deprecated in
3.8
To be removed in
3.10
Substitute
none

uu

The uu module provides support for the uuencode format, an old binary encoding format for email from 1980. The uu format has been replaced by MIME. The uu codec itself is provided by the binascii module.
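
Since the codec lives in binascii, a round trip can be sketched without the uu module at all. The function names below are illustrative assumptions, and the sketch deliberately leaves out the "begin"/"end" framing lines that a complete uuencoded file carries.

```python
import binascii


def uu_encode_lines(data):
    """Encode bytes as a list of uuencoded lines (no begin/end framing)."""
    # binascii.b2a_uu accepts at most 45 bytes per call and returns one
    # encoded line terminated by a newline.
    return [binascii.b2a_uu(data[i:i + 45]) for i in range(0, len(data), 45)]


def uu_decode_lines(lines):
    """Decode lines produced by uu_encode_lines back into bytes."""
    return b"".join(binascii.a2b_uu(line) for line in lines)
```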

Module type
pure Python
Deprecated in
3.8
To be removed in
3.10
Substitute
none

xdrlib

The xdrlib module supports the Sun External Data Representation Standard. XDR is an old binary serialization format from 1987. These days it's rarely used outside specialized domains like NFS.
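
For the simple cases, XDR data can also be handled with the struct module. The following is a minimal sketch covering only signed 32-bit integers and variable-length byte strings (big-endian, padded to a multiple of four bytes). The function names are illustrative assumptions and this is not a replacement for the full xdrlib API.

```python
import struct


def xdr_pack_int(value):
    """Pack a signed 32-bit integer as 4 big-endian bytes."""
    return struct.pack(">i", value)


def xdr_pack_string(data):
    """Pack bytes as a 4-byte length, the data, and zero padding to 4 bytes."""
    padding = (4 - len(data) % 4) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * padding


def xdr_unpack_string(buf, offset=0):
    """Return (data, next_offset) for a string packed by xdr_pack_string."""
    (length,) = struct.unpack_from(">I", buf, offset)
    start = offset + 4
    data = buf[start:start + length]
    # Skip the padding so the next item starts on a 4-byte boundary.
    next_offset = start + length + (4 - length % 4) % 4
    return data, next_offset
```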

Module type
pure Python
Deprecated in
3.8
To be removed in
3.10
Substitute
none

Multimedia modules

aifc

The aifc module provides support for reading and writing AIFF and AIFF-C files. The Audio Interchange File Format is an old audio format from 1988 based on Amiga IFF. It was most commonly used on the Apple Macintosh. These days only a few specialized applications use AIFF.

Module type
pure Python (depends on audioop C extension)
Deprecated in
3.8
To be removed in
3.10
Substitute
none

audioop

The audioop module contains helper functions to manipulate raw audio data and adaptive differential pulse-code modulated audio data. The module is implemented in C without any additional dependencies. The aifc, sunau, and wave modules depend on audioop for some operations.

Module type
C extension
Deprecated in
3.8
To be removed in
3.10
Substitute
none

colorsys

The colorsys module defines color conversion functions between RGB, YIQ, HSL, and HSV coordinate systems. The Pillow library provides much faster conversion between color systems.
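
For reference, this is the kind of conversion the module offers; a short sketch using pure red, with every channel expressed in the range 0.0 to 1.0:

```python
import colorsys

# Convert pure red from RGB to HSV and back again.
h, s, v = colorsys.rgb_to_hsv(1.0, 0.0, 0.0)
r, g, b = colorsys.hsv_to_rgb(h, s, v)
```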

Module type
pure Python
Deprecated in
3.8
To be removed in
3.10
Substitute
Pillow, colorspacious

chunk

The chunk module provides support for reading and writing Electronic Arts' Interchange File Format. IFF is an old audio file format originally introduced for Commodore and Amiga. The format is no longer relevant.

Module type
pure Python
Deprecated in
3.8
To be removed in
3.10
Substitute
none

imghdr

The imghdr module is a simple tool to guess the image file format from the first 32 bytes of a file or buffer. It supports only a limited number of formats and returns neither resolution nor color depth.
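
The underlying technique, matching well-known signatures at the start of the data, is small enough to sketch directly. This is not imghdr's implementation; the function name is an illustrative assumption and only three common signatures are checked.

```python
def guess_image_format(header):
    """Return a format name based on leading magic bytes, or None."""
    if header.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if header[:6] in (b"GIF87a", b"GIF89a"):
        return "gif"
    if header.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    return None
```

Callers would pass the first few bytes read from a file opened in binary mode.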

Module type
pure Python
Deprecated in
3.8
To be removed in
3.10
Substitute
n/a

ossaudiodev

The ossaudiodev module provides support for Open Sound System, an interface to sound playback and capture devices. OSS was initially free software, but later support for newer sound devices and improvements were proprietary. The Linux community abandoned OSS in favor of ALSA [1]. Some operating systems like OpenBSD and NetBSD provide an incomplete [2] emulation of OSS.

Module type
C extension
Deprecated in
3.8
To be removed in
3.10
Substitute
none

sndhdr

The sndhdr module is similar to the imghdr module but for audio formats. It guesses file format, channels, frame rate, and sample widths from the first 512 bytes of a file or buffer. The module only supports AU, AIFF, HCOM, VOC, WAV, and other ancient formats.

Module type
pure Python (depends on audioop C extension for some operations)
Deprecated in
3.8
To be removed in
3.10
Substitute
n/a

sunau

The sunau module provides support for the Sun AU sound format. It's yet another old, obsolete file format.

Module type
pure Python (depends on audioop C extension for some operations)
Deprecated in
3.8
To be removed in
3.10
Substitute
none

Networking modules

asynchat

The asynchat module is built on top of asyncore and has been deprecated since Python 3.6.

Module type
pure Python
Deprecated in
3.6
To be removed in
3.10
Substitute
asyncio

asyncore

The asyncore module was the first module for asynchronous socket service clients and servers. It has been replaced by asyncio and has been deprecated since Python 3.6.

The asyncore module is also used in stdlib tests. The tests for ftplib, logging, smtpd, smtplib, and ssl are partly based on asyncore. These tests must be updated to use asyncio or threading.
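
As a rough illustration of what an asyncio-based replacement for an asyncore-style socket server looks like, here is a minimal line-echo sketch using asyncio streams. The names handle_echo and echo_once are illustrative assumptions, and the server binds to an ephemeral port on the loopback interface.

```python
import asyncio


async def handle_echo(reader, writer):
    """Echo one line back to the client, then close the connection."""
    data = await reader.readline()
    writer.write(data)
    await writer.drain()
    writer.close()
    await writer.wait_closed()


async def echo_once(message):
    """Start a throwaway echo server, send one line, and return the reply."""
    server = await asyncio.start_server(handle_echo, "127.0.0.1", 0)
    host, port = server.sockets[0].getsockname()[:2]
    async with server:
        reader, writer = await asyncio.open_connection(host, port)
        writer.write(message)
        await writer.drain()
        reply = await reader.readline()
        writer.close()
        await writer.wait_closed()
    return reply
```

A caller would run it with asyncio.run(echo_once(b"ping\n")).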

Module type
pure Python
Deprecated in
3.6
To be removed in
3.10
Substitute
asyncio

cgi

The cgi module is a support module for Common Gateway Interface (CGI) scripts. CGI is considered inefficient because every incoming request is handled in a new process. PEP 206 considers the module to be poorly designed and near-impossible to fix.

Several people proposed to either keep the cgi module for features like cgi.parse_qs() or move cgi.escape() to a different module. The functions cgi.parse_qs and cgi.parse_qsl have been deprecated for a while and are actually aliases for urllib.parse.parse_qs and urllib.parse.parse_qsl. The function cgi.escape has been deprecated in favor of html.escape, which has secure default values.
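
The standard library functions that cover these use cases are urllib.parse.parse_qs, urllib.parse.parse_qsl, and html.escape. A short sketch with made-up sample input:

```python
from html import escape
from urllib.parse import parse_qs, parse_qsl

query = "name=Ada+Lovelace&lang=py&lang=c"

# parse_qs groups repeated keys into lists; parse_qsl keeps the order of pairs.
params = parse_qs(query)
pairs = parse_qsl(query)

# html.escape also escapes quotes by default (quote=True).
safe = escape('<a href="x">Tom & Jerry</a>')
```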

Module type
pure Python
Deprecated in
3.8
To be removed in
3.10
Substitute
none

cgitb

The cgitb module is a helper for the cgi module for configurable tracebacks.

Module type
pure Python
Deprecated in
3.8
To be removed in
3.10
Substitute
none

smtpd

The smtpd module provides a simple implementation of an SMTP mail server. The module documentation recommends aiosmtpd.

Module type
pure Python
Deprecated in
3.7
To be removed in
3.10
Substitute
aiosmtpd

nntplib

The nntplib module implements the client side of the Network News Transfer Protocol (NNTP). News groups used to be a dominant platform for online discussions. Over the last two decades, news has been slowly but steadily replaced with mailing lists and web-based discussion platforms.

The nntplib tests have been the cause of additional work in the recent past. Python only contains the client side of NNTP, so the test cases depend on external news servers. These servers were unstable in the past.

Module type
pure Python
Deprecated in
3.8
To be removed in
3.10
Substitute
none

Operating system interface

crypt

The crypt module implements password hashing based on the crypt(3) function from libcrypt or libxcrypt on Unix-like platforms. The algorithms are mostly old, of poor quality, and insecure. Users are discouraged from using them.

  • The module is not available on Windows. Cross-platform applications need an alternative implementation anyway.
  • Only DES encryption is guaranteed to be available. DES has an extremely limited key space of 2**56.
  • MD5, salted SHA256, salted SHA512, and Blowfish are optional extensions. SSHA256 and SSHA512 are glibc extensions. Blowfish (bcrypt) is the only algorithm that is still secure. However, it's not in glibc and therefore not commonly available on Linux.
  • Depending on the platform, the crypt module is not thread safe. Only implementations with crypt_r(3) are thread safe.
Module type
C extension + Python module
Deprecated in
3.8
To be removed in
3.10
Substitute
bcrypt, passlib, argon2cffi, hashlib module (PBKDF2, scrypt)
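
Of the substitutes listed, hashlib's PBKDF2 is available without extra dependencies. A minimal sketch follows; the function names, the 16-byte salt, and the default iteration count are illustrative assumptions rather than recommendations from this PEP.

```python
import hashlib
import hmac
import os


def hash_password(password, *, iterations=600_000):
    """Return (salt, iterations, digest) for a password using PBKDF2-HMAC-SHA256."""
    salt = os.urandom(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return salt, iterations, digest


def verify_password(password, salt, iterations, expected):
    """Recompute the digest and compare it in constant time."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return hmac.compare_digest(candidate, expected)
```

Storing the salt and iteration count next to the digest allows the work factor to be raised later without invalidating existing hashes.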

macpath

The macpath module provides a Mac OS 9 implementation of os.path routines. Mac OS 9 is no longer supported.

Module type
pure Python
Deprecated in
3.7
Removed in
3.8
Substitute
none

nis

The nis module provides NIS/YP support. Network Information Service / Yellow Pages is an old and deprecated directory service protocol developed by Sun Microsystems. Its designated successor NIS+ from 1992 never took off. For a long time, libc's Name Service Switch, LDAP, and Kerberos/GSSAPI have been considered a more powerful and more secure replacement for NIS.

Module type
C extension
Deprecated in
3.8
To be removed in
3.10
Substitute
none

spwd

The spwd module provides direct access to the Unix shadow password database using non-standard APIs. In general it's a bad idea to use spwd. It circumvents system security policies, does not use the PAM stack, and is only compatible with local user accounts.

Module type
C extension
Deprecated in
3.8
To be removed in
3.10
Substitute
none

Misc modules

fileinput

The fileinput module implements helpers to iterate over a list of files from sys.argv. The module predates the optparse and argparse modules. The same functionality can be implemented with the argparse module.
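
A rough sketch of that argparse-based approach, reading lines from the files named on the command line and falling back to standard input when none are given. The function name iter_lines and the UTF-8 encoding are illustrative assumptions.

```python
import argparse
import sys


def iter_lines(argv=None):
    """Yield lines from the files named in argv, or from stdin if there are none."""
    parser = argparse.ArgumentParser(description="Iterate over input lines.")
    parser.add_argument("files", nargs="*", help="input files; stdin if omitted")
    args = parser.parse_args(argv)
    if not args.files:
        yield from sys.stdin
        return
    for name in args.files:
        with open(name, encoding="utf-8") as handle:
            yield from handle
```

Passing argv=None makes argparse read sys.argv[1:], so a script can simply loop over iter_lines().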

Module type
pure Python
Deprecated in
3.8
To be removed in
3.10
Substitute
argparse

formatter

The formatter module is an old text formatting module which has been deprecated since Python 3.4.

Module type
pure Python
Deprecated in
3.4
To be removed in
3.10
Substitute
n/a

imp

The imp module is the predecessor of the importlib module. Most functions have been deprecated since Python 3.3 and the module since Python 3.4.
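
Two tasks imp was commonly used for, importing a module by name and loading a source file by path, can be sketched with importlib as follows. The helper name load_module_from_path is an illustrative assumption.

```python
import importlib
import importlib.util


def load_module_from_path(name, path):
    """Load a Python source file as a module object without touching sys.modules."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


# Importing by dotted name is a single call.
json_module = importlib.import_module("json")
```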

Module type
C extension
Deprecated in
3.4
To be removed in
3.10
Substitute
importlib

msilib

The msilib package is a Windows-only package. It supports the creation of Microsoft Installers (MSI). The package also exposes additional APIs to create cabinet files (CAB). The module is used by distutils to create MSI installers with the bdist_msi command. In the past it was used to create CPython's official Windows installer, too.

Microsoft is slowly moving away from MSI in favor of Windows 10 Apps (AppX) as the new deployment model [3].

Module type
C extension + Python code
Deprecated in
3.8
To be removed in
3.10
Substitute
none

pipes

The pipes module provides helpers to pipe the output of one command into the input of another command. The module is built on top of os.popen. Users are encouraged to use the subprocess module instead.
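
A two-stage pipeline built with subprocess might look like the sketch below. To stay portable it uses the running Python interpreter for both stages; the function name run_pipeline and the two commands are illustrative assumptions.

```python
import subprocess
import sys


def run_pipeline():
    """Connect a producer's stdout to a consumer's stdin; return the output."""
    producer = subprocess.Popen(
        [sys.executable, "-c", "print('hello')"],
        stdout=subprocess.PIPE,
    )
    consumer = subprocess.Popen(
        [sys.executable, "-c",
         "import sys; print(sys.stdin.read().upper(), end='')"],
        stdin=producer.stdout,
        stdout=subprocess.PIPE,
    )
    # Close our copy of the pipe so the producer sees a broken pipe
    # if the consumer exits early.
    producer.stdout.close()
    output, _ = consumer.communicate()
    producer.wait()
    return output
```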

Module type
pure Python
Deprecated in
3.8
To be removed in
3.10
Substitute
subprocess module

Removed modules

fpectl

The fpectl module was never built by default. Its usage was discouraged and considered dangerous. It also required a configure flag that caused an ABI incompatibility. The module was removed in 3.7 by Nathaniel J. Smith in bpo-29137.

Module type
C extension + CAPI
Deprecated in
3.7
Removed in
3.7
Substitute
none

Modules to keep

Some modules were originally proposed for deprecation but will be kept.

lib2to3

The lib2to3 package provides the 2to3 command to transpile Python 2 code to Python 3 code.

The package is useful for other tasks besides porting code from Python 2 to 3. For example, black uses it for code reformatting.

Module type
pure Python

getopt

The getopt module mimics C's getopt() option parser. Although users are encouraged to use argparse instead, the getopt module is still widely used.

Module type
pure Python

optparse

The optparse module is the predecessor of the argparse module. Although it has been deprecated for many years, it's still widely used.

Module type
pure Python
Deprecated in
3.2
Substitute
argparse

wave

The wave module provides support for the WAV sound format. The module uses one simple function from the audioop module to perform byte swapping between little and big endian formats. Before 24 bit WAV support was added, byte swap used to be implemented with the array module. To remove wave's dependency on audioop, the byte swap function could either be moved to another module (e.g. operator) or the array module could gain support for 24 bit (3 byte) arrays.
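
The byte swap itself is small. A pure Python sketch that works for any sample width, including 3-byte (24 bit) samples, could look like this; the function is an illustration and not the audioop implementation.

```python
def byteswap(fragment, width):
    """Reverse the byte order of every width-byte sample in fragment."""
    if width < 1 or len(fragment) % width:
        raise ValueError("fragment length must be a multiple of the sample width")
    swapped = bytearray(len(fragment))
    for i in range(width):
        # Byte i of each output sample comes from byte (width - 1 - i)
        # of the corresponding input sample.
        swapped[i::width] = fragment[width - 1 - i::width]
    return bytes(swapped)
```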

Module type
pure Python (depends on byteswap from audioop C extension)

Future maintenance of removed modules

The main goal of the PEP is to reduce the burden and workload on the Python core developer team. Therefore removed modules will not be maintained by the core team as separate PyPI packages. However the removed code, tests and documentation may be moved into a new git repository, so community members have a place from which they can pick up and fork code.

A first draft of a legacylib repository is available on my private GitHub account.

It's my hope that some of the deprecated modules will be picked up and adopted by users that actually care about them. For example, colorsys and imghdr are useful modules, but have a limited feature set. A fork of imghdr can add new features and support for more image formats, without being constrained by Python's release cycle.

Most of the modules are in pure Python and can be easily packaged. Some depend on a simple C module, e.g. audioop and crypt. Since audioop does not depend on any external libraries, it can be shipped as binary wheels with some effort. Other C modules can be replaced with ctypes or cffi. For example, I created legacycrypt with the _crypt extension reimplemented with a few lines of ctypes code.

Discussions

  • Elana Hashman and Nick Coghlan suggested to keep the getopt module.
  • Berker Peksag proposed to deprecate and remove msilib.
  • Brett Cannon recommended to delay active deprecation warnings and removal of modules like imp until Python 3.10. Version 3.8 will be released shortly before Python 2 reaches end of life. A delay reduces churn for users that migrate from Python 2 to 3.8.
  • Brett also came up with the idea to keep lib2to3. The package is useful for other purposes, e.g. black uses it to reformat Python code.
  • At one point, distutils was mentioned in the same sentence as this PEP. To avoid lengthy discussion and delay of the PEP, I decided against dealing with distutils. Deprecation of the distutils package will be handled by another PEP.
  • Multiple people (Gregory P. Smith, David Beazley, Nick Coghlan, ...) convinced me to keep the wave module. [4]
  • Gregory P. Smith proposed to deprecate nntplib. [4]

References

[1] https://en.wikipedia.org/wiki/Open_Sound_System#Free,_proprietary,_free
[2] https://man.openbsd.org/ossaudio
[3] https://blogs.msmvps.com/installsite/blog/2015/05/03/the-future-of-windows-installer-msi-in-the-light-of-windows-10-and-the-universal-windows-platform/
[4] https://twitter.com/ChristianHeimes/status/1130257799475335169
[5] https://twitter.com/dabeaz/status/1130278844479545351

Copyright

This document has been placed in the public domain.