A completely incomplete guide to packaging a Python module and sharing it with the world on PyPI.

Abstract

Even with the end of 2019 looming, getting your first Python package up to PyPI[2] can be a daunting task that occasionally leaves people keeping their code to themselves.

Therefore I will be using my own project attrs as a realistic yet simple example of how to get a pure-Python 2/3 module packaged up, tested, and uploaded to PyPI, including the binary wheel format that's fast and allows for pre-compiled binary extensions!

Note on the October 2019 update: This is another huge update after its initial release in 2013 and catches up with the latest developments (a lot happened!) since the last big update in 2017. Additionally, I have removed the parts on keyring because I stopped using it myself: it's sort of nice to double-check before uploading anything. If you want to automate the retrieval of your PyPI credentials, check out glyph’s blog post Careful With That PyPI.

Tools Used

This is not a history lesson, therefore we will use:

  • pip to install packages,
  • setuptools, wheel, and pep517 to build your packages in isolated build environments,
  • and twine to upload your package securely to PyPI.

It's most straightforward to install them into the virtual environment in which you develop your project:

$ pip install -U pip pep517 twine

Ideally run this command before each release to ensure all your release tools are up-to-date. The remaining build tools are installed into your isolated build environment by pep517 automagically.

A Minimal Glimpse Into The Past

Forget that there ever was distribute (cordially merged into setuptools), easy_install (part of setuptools, supplanted by pip), or distutils2 aka packaging (was supposed to be the official thing from Python 3.3 on, didn't get done in time due to lack of helping hands, got ripped out by a heart-broken Éric and is abandoned now).

Be just vaguely aware that there are distutils and distlib somewhere underneath but ideally it shouldn't matter to you at all for now.

Acknowledge that the world is still full of guides that contain a step that invokes setup.py (e.g. python setup.py sdist), but be aware that it's considered an anti-pattern by parts of the Python Packaging Authority[5] nowadays.

setup.py

Nowadays, the most common, most flexible, and best documented way to package a setuptools-based project is (still) having a setup.py file. It is executed whenever you build a distribution and – unless you're installing a wheel – on each installation[1].

For better or for worse, the Python community largely embraced copying bits and pieces of it from one project to another. Since the average setup.py consists only of metadata with some boilerplate code, it even makes sense.

Let’s have a look at a minimal, yet functional setup.py:

import codecs
import os
import re

from setuptools import setup, find_packages


###################################################################

NAME = "attrs"
PACKAGES = find_packages(where="src")
META_PATH = os.path.join("src", "attr", "__init__.py")
KEYWORDS = ["class", "attribute", "boilerplate"]
CLASSIFIERS = [
    "Development Status :: 5 - Production/Stable",
    "Intended Audience :: Developers",
    "Natural Language :: English",
    "License :: OSI Approved :: MIT License",
    "Operating System :: OS Independent",
    "Programming Language :: Python",
    "Programming Language :: Python :: 2",
    "Programming Language :: Python :: 2.7",
    "Programming Language :: Python :: 3",
    "Programming Language :: Python :: 3.4",
    "Programming Language :: Python :: 3.5",
    "Programming Language :: Python :: 3.6",
    "Programming Language :: Python :: 3.7",
    "Programming Language :: Python :: 3.8",
    "Programming Language :: Python :: Implementation :: CPython",
    "Programming Language :: Python :: Implementation :: PyPy",
    "Topic :: Software Development :: Libraries :: Python Modules",
]
INSTALL_REQUIRES = []

###################################################################

HERE = os.path.abspath(os.path.dirname(__file__))


def read(*parts):
    """
    Build an absolute path from *parts* and return the contents of the
    resulting file.  Assume UTF-8 encoding.
    """
    with codecs.open(os.path.join(HERE, *parts), "rb", "utf-8") as f:
        return f.read()


META_FILE = read(META_PATH)


def find_meta(meta):
    """
    Extract __*meta*__ from META_FILE.
    """
    meta_match = re.search(
        r"^__{meta}__ = ['\"]([^'\"]*)['\"]".format(meta=meta),
        META_FILE, re.M
    )
    if meta_match:
        return meta_match.group(1)
    raise RuntimeError("Unable to find __{meta}__ string.".format(meta=meta))


if __name__ == "__main__":
    setup(
        name=NAME,
        description=find_meta("description"),
        license=find_meta("license"),
        url=find_meta("uri"),
        version=find_meta("version"),
        author=find_meta("author"),
        author_email=find_meta("email"),
        maintainer=find_meta("author"),
        maintainer_email=find_meta("email"),
        keywords=KEYWORDS,
        long_description=read("README.rst"),
        long_description_content_type="text/x-rst",
        packages=PACKAGES,
        package_dir={"": "src"},
        zip_safe=False,
        classifiers=CLASSIFIERS,
        install_requires=INSTALL_REQUIRES,
        options={"bdist_wheel": {"universal": "1"}},
    )

As you can see, I’ve accepted that most of setup.py is boilerplate and put the metadata into a separate block (enclosed by #s, lines 8 thru 34). Thus I can copy and paste everything beneath the # block between my projects as it pleases me.

I’ve also accepted that I’m but a fallible bag of meat and therefore all my metadata is saved in my __init__.py files and extracted using regular expressions. Another approach is to put this data into a special module and parse that file using Python like PyCA’s cryptography does. Which approach you take is a matter of personal preference. Importing your actual __init__.py is a bad idea though because you will run into dependency problems.

This is why I’m still using setuptools: the Python-based setup.py allows me tricks like this. None of the alternatives currently have a good way to avoid duplicate metadata between the package and the code (flit shares the __version__ string at least).
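
To make the regular-expression trick a bit more tangible, here is a minimal sketch that runs the same pattern against an in-memory string instead of a real file. The contents of SAMPLE_INIT are made up for illustration, and unlike the setup.py above, this find_meta takes the source text as a second argument so it can run on its own.

```python
import re

# Made-up stand-in for the contents of src/attr/__init__.py.
SAMPLE_INIT = '''\
"""An example package."""

__version__ = "19.4.0.dev0"
__description__ = "An example package"
__author__ = 'Jane Doe'
'''


def find_meta(meta, source):
    """
    Extract __*meta*__ from *source* without importing anything.
    """
    meta_match = re.search(
        r"^__{meta}__ = ['\"]([^'\"]*)['\"]".format(meta=meta),
        source,
        re.M,  # let ^ match at the start of every line
    )
    if meta_match:
        return meta_match.group(1)
    raise RuntimeError("Unable to find __{meta}__ string.".format(meta=meta))


version = find_meta("version", SAMPLE_INIT)
author = find_meta("author", SAMPLE_INIT)
```

Because the file is only read as text, nothing in the package gets imported, so missing dependencies can't break the build.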


As you can see, I’m putting my packages into an un-importable src directory. I wrote down my reasons elsewhere and I encourage you to follow suit if you value correctness and clean project directories over typing four characters.

The packages field uses setuptools’s find_packages() to detect packages underneath src and the package_dir field explains where the root package is found. Please note that find_packages() without an src directory will package and install anything that has an __init__.py underneath the current directory. This may include examples or test directories, which is almost certainly not something that you want but it happened before. Just use an src directory and avoid traps like this.


How to set and keep a project’s version is a matter of taste[3], and there are different solutions. After using a few tools in the past, I’ve resigned myself to making the in-development version the next version with a .dev0 suffix (e.g. "19.4.0.dev0"). Whenever I publish a new version, I strip the suffix, push the package to PyPI, increment the version number, and add the suffix to the new version. This way in-development versions are easily discernible from both the predecessor and the final release.
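
The version dance described above could be sketched roughly like this. Both helper names are hypothetical, and which component gets incremented after a release is an assumption on my part (the second one by default here); adapt it to whatever versioning scheme you use.

```python
DEV_SUFFIX = ".dev0"


def release_version(dev_version):
    """
    Strip the .dev0 suffix to get the version that is about to be published.
    """
    if not dev_version.endswith(DEV_SUFFIX):
        raise ValueError("Not an in-development version: " + dev_version)
    return dev_version[: -len(DEV_SUFFIX)]


def next_dev_version(released, part=1):
    """
    Increment component *part* of *released* (0-based, assumed to be purely
    numeric), reset the components after it, and re-add the suffix.
    """
    numbers = [int(n) for n in released.split(".")]
    numbers[part] += 1
    for i in range(part + 1, len(numbers)):
        numbers[i] = 0
    return ".".join(str(n) for n in numbers) + DEV_SUFFIX


released = release_version("19.4.0.dev0")
upcoming = next_dev_version(released)
```

With the example from above, released is "19.4.0" and upcoming is "19.5.0.dev0".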

The classifiers field’s usefulness is openly disputed. Nevertheless pick them from here. PyPI will refuse to accept packages with unknown classifiers. Therefore I like to use "Private :: Do Not Upload" for private packages to protect myself from my own stupidity.


The final issue is dependencies: unless you really know what you’re doing, don’t pin them to fixed version numbers or your users won’t be able to install security updates of your dependencies without your intervention.

Rule of thumb: requirements.txt should contain only ==, while setup.py should contain everything except ==.
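
If you'd like to check that rule of thumb mechanically, a crude sketch could look like the following. The function name and the example requirements are hypothetical, and the check is plain string matching, not a real requirements parser.

```python
def pinned_requirements(requirements):
    """
    Return the entries of *requirements* that pin an exact version.

    Environment markers (everything after a semicolon) are ignored because
    they may legitimately contain "==".
    """
    return [
        req for req in requirements
        if "==" in req.split(";", 1)[0]
    ]


# Hypothetical install_requires list for a setup.py.
INSTALL_REQUIRES = [
    "six>=1.10",
    "idna==2.8",
    'enum34; python_version == "2.7"',
]

offenders = pinned_requirements(INSTALL_REQUIRES)
```

Here only "idna==2.8" ends up in offenders; in a setup.py that's the entry I'd loosen to something like a lower bound.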

Non-Code Files

Every Python project has a “Fix MANIFEST.in” commit. Look it up, it’s true.

You have to add all files and directories that are not already packaged due to the packages keyword (or py_modules if your project is not a package) of your setup() call.

For attrs it’s something like this:

include LICENSE *.rst *.toml *.yml *.yaml
graft .github

# Stubs
include src/attr/py.typed
recursive-include src *.pyi

# Tests
include tox.ini .coveragerc conftest.py
recursive-include tests *.py

# Documentation
include docs/Makefile docs/docutils.conf
recursive-include docs *.png
recursive-include docs *.svg
recursive-include docs *.py
recursive-include docs *.rst
prune docs/_build

For more commands, have a look at the MANIFEST.in docs. If you would like to avoid the aforementioned commit, start your projects with check-manifest in your CI that will also give you helpful hints on how to fix the errors it reports.

Important: If you want the files and directories from MANIFEST.in to also be installed (e.g. if it’s runtime-relevant data), you will have to set include_package_data=True in your setup() call.

Configuration

setup.cfg

You don't need a setup.cfg anymore! At least not for packaging. wheel nowadays collects license files automatically, and we tell setuptools to build universal wheels (e.g. attrs-19.3.0-py2.py3-none-any.whl instead of one wheel per Python version) in the setup.py using the options keyword argument.
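
The file name of a wheel tells you what it's compatible with: name, version, an optional build tag, then the Python, ABI, and platform tags. As a rough sketch (the helper names are hypothetical, and real tools should use a maintained library instead of string splitting), this is how you could tell a universal wheel from a version-specific one:

```python
from collections import namedtuple

WheelInfo = namedtuple(
    "WheelInfo", "name version build python_tags abi_tag platform_tag"
)


def parse_wheel_filename(filename):
    """
    Split a wheel file name into its dash-separated components.
    """
    if not filename.endswith(".whl"):
        raise ValueError("Not a wheel: " + filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python, abi, platform = parts
        build = None
    elif len(parts) == 6:
        name, version, build, python, abi, platform = parts
    else:
        raise ValueError("Unexpected wheel file name: " + filename)
    # A wheel can carry several Python tags separated by dots.
    return WheelInfo(name, version, build, python.split("."), abi, platform)


def is_universal(info):
    """
    True for pure-Python wheels that claim to run on both Python 2 and 3.
    """
    return (
        {"py2", "py3"} <= set(info.python_tags)
        and info.abi_tag == "none"
        and info.platform_tag == "any"
    )


universal = parse_wheel_filename("attrs-19.3.0-py2.py3-none-any.whl")
specific = parse_wheel_filename("attrs-19.3.0-cp38-none-any.whl")
```

is_universal(universal) is true, while is_universal(specific) is false because that wheel is tagged for CPython 3.8 only.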

pyproject.toml

The biggest visible change since the first publishing of this article in 2013 is the implementation of PEP 517 and PEP 518 that gave us the pyproject.toml file and the concepts of pluggable build backends and isolated builds.

Since we’re using setuptools here, it should look like this:

[build-system]
requires = ["setuptools>=40.6.0", "wheel"]
build-backend = "setuptools.build_meta"

# Not necessary for packaging but every self-respecting Python
# package should a) use black and b) fix the WRONG default.
[tool.black]
line-length = 79

Documentation

Firstly, every open source project needs a license. If you want your package to be as widely used as possible, MIT or one of the BSD variants are a good choice. Apache License Version 2 is even better due to its patent protection, but it’s incompatible with GPLv2. As sad as it sounds, in the end choose the license you’re able and willing to enforce.

Secondly, even the simplest package needs a README that tells potential users what they’re looking at. Make it reStructuredText (reST) so you can easily convert it into proper Sphinx documentation later[4].

As a courtesy to your users, also keep an easily discoverable changelog so they know what to expect from your releases. I like to put them into the project root directory and call them README.rst and CHANGELOG.rst respectively. The changelog is also included as part of my Sphinx documentation in docs/changelog.rst using a simple

.. include:: ../CHANGELOG.rst

My long project description and thus PyPI text is also the README.rst.

If you host your project on GitHub, you may want to add a CONTRIBUTING.rst that gets displayed when someone wants to open a pull request. Have a look at attrs’s if you need inspiration.

Let’s Build Already!

Now that you put everything into place, building the packages is just a matter of:

$ rm -rf build dist
$ python -m pep517.build .

The first line ensures you get a clean build with no lingering artifacts from previous builds. The second one builds a source distribution (sdist) that ends with .tar.gz and a wheel ending in .whl in the dist directory.

For attrs, it could look like this:

dist
├── attrs-19.3.0-py2.py3-none-any.whl
└── attrs-19.3.0.tar.gz

Using pep517.build to build packages is somewhat controversial as of October 2019. The consensus seems to be that this build functionality ought to be merged into either pip or twine, but there is no agreement on when or how. Currently it appears to be the cleanest way to build a package that uses modern PEP 517 tooling and it is definitely superior to invoking your setup.py. I will update this article once this gets resolved.
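
If you'd rather not type the clean-up and build commands by hand before every release, they can be wrapped in a few lines of Python. This is only a sketch: the function names are hypothetical, and build() assumes that pep517 is installed in the environment that runs the script.

```python
import shutil
import subprocess
import sys
from pathlib import Path


def clean_artifacts(project_dir):
    """
    Remove build/ and dist/ below *project_dir*, like `rm -rf build dist`.

    Return the names of the directories that were actually removed.
    """
    removed = []
    for name in ("build", "dist"):
        target = Path(project_dir) / name
        if target.exists():
            shutil.rmtree(target)
            removed.append(name)
    return removed


def build_command():
    """
    The equivalent of `python -m pep517.build .` for the running interpreter.
    """
    return [sys.executable, "-m", "pep517.build", "."]


def build(project_dir="."):
    """
    Clean up first, then build the sdist and the wheel into dist/.
    """
    clean_artifacts(project_dir)
    subprocess.run(build_command(), cwd=str(project_dir), check=True)
```

Calling build() from your project root should then leave you with a fresh dist directory; check=True makes the script fail loudly if the build does.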


You can test whether both install properly before we move on to uploading (the examples use UNIX commands; Windows would be similar):

$ rm -rf venv-sdist  # ensure clean state if run repeatedly
$ virtualenv venv-sdist
...
$ venv-sdist/bin/pip install dist/attrs-19.3.0.tar.gz
Looking in indexes:  https://pypi.org/simple
Processing ./dist/attrs-19.3.0.tar.gz
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
    Preparing wheel metadata ... done
Building wheels for collected packages: attrs
  Building wheel for attrs (PEP 517) ... done
  Created wheel for attrs: filename=attrs-19.3.0-cp38-none-any.whl size=39472 sha256=3f1118356488a1b249bd707cfa398ab57a40eed870c12515d853a4dafac75246
  Stored in directory: /Users/hynek/Library/Caches/pip/wheels/d8/28/f6/ef381f356e240015951029acc09be24a30991527dd554db0ee
Successfully built attrs
Installing collected packages: attrs
Successfully installed attrs-19.3.0
$ venv-sdist/bin/python
Python 3.8.0 (v3.8.0:fa919fdf25, Oct 14 2019, 10:23:27)
[Clang 6.0 (clang-600.0.57)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import attr; attr.__version__
'19.3.0'

and

$ rm -rf venv-wheel  # ensure clean state if run repeatedly
$ virtualenv venv-wheel
...
$ venv-wheel/bin/pip install dist/attrs-19.3.0-py2.py3-none-any.whl
Looking in indexes:  https://pypi.org/simple
Processing ./dist/attrs-19.3.0-py2.py3-none-any.whl
Installing collected packages: attrs
Successfully installed attrs-19.3.0
$ venv-wheel/bin/python
Python 3.8.0 (v3.8.0:fa919fdf25, Oct 14 2019, 10:23:27)
[Clang 6.0 (clang-600.0.57)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import attr; attr.__version__
'19.3.0'

Note that in the first code block, pip installs the source distribution by first building a wheel (which involves executing arbitrary code), and then installing that (which doesn't). By providing a pre-built wheel, you make it possible for your users to install your package without executing arbitrary code!

So you’re confident that your package is perfect? Let’s use the test PyPI server to find out!

The PyPI Staging Server

Again, I’ll be using attrs as the example project name to avoid <your project name> everywhere.

First, sign up on the test server; you will receive a user name and a password. Please note that this is independent from the live servers. Hence you’ll have to register separately. It also gets cleaned from time to time so don’t be surprised if it suddenly doesn’t know about you or your projects anymore. Just re-register.

Next, create a ~/.pypirc consisting of:

[distutils]
index-servers=
    test

[test]
repository = https://test.pypi.org/legacy/
username = <your user name goes here>

Finally, let’s use twine to safely upload our previously built distributions:

$ twine upload -r test --sign dist/attrs-19.3.0*

The --sign option tells twine to sign the package with your GnuPG key. Omit it if you do not want to do that.

Now try to install your package again:

$ pip install -i https://test.pypi.org/simple/ attrs

Everything dandy? Does the project description on the test PyPI page look correct? Then it's time for the last step: putting it on the real PyPI!

The Final Step

First, register at production PyPI and make sure you activate multi-factor authentication. Your packages are security liabilities to all of their users, hence do everything you can to keep them safe.

Then complete your ~/.pypirc:

[distutils]
index-servers=
    pypi
    test

[test]
repository = https://test.pypi.org/legacy/
username = <your test user name goes here>

[pypi]
username = __token__

Yes, we're not using our actual username/password for uploading packages but an upload token! They were introduced in July 2019 and are another important step made by the PyPI maintainers to increase the security for everybody. When logged in, you can manage them on your PyPI account page.
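
Since ~/.pypirc is a plain INI file, you can sanity-check it with the standard library before your first upload. The following sketch parses the example from above out of a string; that every name listed under index-servers should have a section of its own is my reading of the format, not something twine is guaranteed to enforce in exactly this way.

```python
import configparser

PYPIRC = """\
[distutils]
index-servers=
    pypi
    test

[test]
repository = https://test.pypi.org/legacy/
username = <your test user name goes here>

[pypi]
username = __token__
"""

# Switch off interpolation so that a stray % in a value can't break parsing.
parser = configparser.ConfigParser(interpolation=None)
parser.read_string(PYPIRC)

# The indented lines below index-servers= form one multi-line value.
servers = parser["distutils"]["index-servers"].split()

# Names that are listed as index servers but lack their own section.
missing_sections = [name for name in servers if not parser.has_section(name)]

# The [pypi] section sets no repository; it relies on twine's default.
pypi_repository = parser["pypi"].get("repository")
```

To check your real file, replace the read_string() call with parser.read() and the expanded path to your ~/.pypirc.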

One last deep breath and let’s rock:

$ twine upload -r pypi --sign dist/attrs-19.3.0*

If you receive errors about missing repository URLs, your twine is out of date.

And thus, your package is only a pip install away for everyone! Congratulations, do more of that!

Bonus tip: You can delete releases from PyPI but you cannot re-upload them under the same version number! So be careful before uploading and deleting: You can’t just replace a release with a different file.

Next Steps

The information herein will probably get you pretty far but if you get stuck, the current canonical truths for Python packaging are:


If you want to know more about how to make the life of a Python open source maintainer easier, check out my talk Maintaining a Python Project When It’s Not Your Job. Spoiler alert: it's totally relevant even if that project is your job.

Thanks

Over the years, this article has been kindly proof-read by Lynn Root, Donald Stufft, Alex Gaynor, Thomas Heinrichsdobler, Jannis Leidel, and Paul Ganssle. The final hint on how to get rid of setup.cfg came from Daniel Holth, to whom we will also be eternally indebted for giving us the gift of wheels.

All mistakes are mine.

Footnotes


  1. If you want to avoid writing a setup.py, there are currently two popular alternatives: flit and poetry. As of this writing, they don’t work for me for various reasons. But if they work for you, go wild.

  2. [Pronounced “pie pee eye”, or “cheese shop”, not “pie pie”!](https://pypi.org/help/#pronunciation)

  3. As of PEP 440, PyPI and the packaging ecosystem have opinions on the structure of the version string though.

  4. Both PyPI and Sphinx also support Markdown as of 2019.

  5. Aka PyPA. It's not an actual authority in the sense that they'd be a homogeneous body that is elected or officially endorsed by the PSF or the Python core team.

    They do maintain most of the relevant Python packaging tools and libraries – including PyPI itself – though.