Ever since I got involved with open-source Python projects, tox has been vital for testing packages across Python versions (and other factors). However, lately, I’ve been increasingly using Nox for my projects instead. Since I’ve been asked why repeatedly, I’ll sum up my thoughts.

I can’t stress enough that I don’t want to discourage anyone from using tox. tox is amazing. Without tox, the Python open-source ecosystem wouldn’t be where it is. Its authors and maintainers have my eternal gratitude!

I viscerally dislike saying negative things about tox, but it’s impossible to explain my preference without contrasting features and behaviors.

This post is only about why I prefer Nox, and my reasons may be entirely irrelevant to you.

Configuration Formats

The most visible difference between tox and Nox is that tox is a DSL on top of the venerable INI format (tox.ini)[1], while Nox uses a Python file (noxfile.py).

In case you aren’t familiar with either one, a simple tox.ini like this:

[tox]
envlist = py310,py311

[testenv]
extras = tests
commands = pytest {posargs}

…would look like this in a noxfile.py:

import nox

@nox.session(python=["3.10", "3.11"])
def tests(session):
    session.install(".[tests]")

    session.run("pytest", *session.posargs)

You may notice a difference in nomenclature: What tox calls environments are sessions to Nox.


Now, if you call tox or nox, they both:

  1. Create virtual environments for Python 3.10 and Python 3.11,
  2. install the current package (.) along with its extra dependencies tests in them,
  3. and run pytest from each of them.

The “{posargs}” and “*session.posargs” bits allow you to pass command line arguments to the test runners. Therefore, to make pytest abort after the first error, you could write “nox -- -x” or “tox -- -x”, respectively.
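To make the pass-through concrete, here is a small sketch that mimics how the arguments after “--” end up in the pytest command. The helper build_pytest_command is hypothetical and a plain list stands in for session.posargs, so this runs without Nox installed:

```python
def build_pytest_command(posargs):
    """Return the argument list that session.run() would receive."""
    # Same unpacking as `session.run("pytest", *session.posargs)`.
    return ["pytest", *posargs]


# `nox -- -x` makes Nox expose ["-x"] as session.posargs.
print(build_pytest_command(["-x"]))  # ['pytest', '-x']
# Without extra arguments, posargs is empty and plain pytest runs.
print(build_pytest_command([]))  # ['pytest']
```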


Using Python here might look like a regression to some. Aren’t we just migrating from setup.py to pyproject.toml to get rid of Python-for-configuration?

Yes and no. The problem with setup.py is not that it’s Python, but that it runs uncontrollably on installations.

Running commands – and thus code – on demand is the raison d’être for both tox and Nox; the only difference is how they are defined. And since tox uses its own language to define those commands, you need dedicated features in its DSL to achieve anything. Meanwhile, if you want to do something in Nox, you usually just have to write some Python.


Admittedly, one of the dominant reasons why I like Nox is that returning to a nontrivial tox.ini after a long time has become a challenge for me. Just recently, I’ve noticed that in environ-config the tox environment that was supposed to check that the test suite passes with the oldest supported version of attrs doesn’t work anymore. I’ve defined it like this:

[testenv]
extras = tests
deps = oldestAttrs: attrs==17.4.0
commands = python -m pytest {posargs}

But although tox does install attrs 17.4.0 first, it overwrites it with the latest version when installing the project. Why? I never figured it out, but I’m 99.9% sure it used to work[2]. None of the other dependencies need a newer version and it still looks correct to me.

INI Inheritance vs. Python Functions

If you squint enough, you realize that – syntax aside – the two configuration principles are a case of code sharing via subclassing vs. code sharing via functions. In tox, you define a base testenv from which all others inherit, but can override any field. This behavior alone is already something that occasionally leaves me scratching my head.

Re-use among sub-environments (e.g. between py37 and py38) in tox is done using factor-dependent statements (like oldestAttrs: above) or substitutions like {[testenv:py37]commands}, whose syntax I can never remember and which always sends me hunting for examples in my other projects.

In Nox, if you want to reuse, you write functions. There’s no other language to learn, just an API. For instance, to run the oldest and newest Python versions under Coverage.py, the rest without[3], and additionally run the oldest with a pinned attrs dependency, I’ve come up with the following:

import nox

OLDEST = "3.7"

def _cov(session):
    session.run("coverage", "run", "-m", "pytest", *session.posargs)

@nox.session(python=[OLDEST, "3.11"], tags=["tests"])
def tests_cov(session):
    session.install(".[tests]")

    _cov(session)

@nox.session(python=OLDEST, tags=["tests"])
def tests_oldestAttrs(session):
    session.install(".[tests]", "attrs==17.4.0")

    _cov(session)

@nox.session(python=["3.8", "3.9", "3.10"], tags=["tests"])
def tests(session):
    session.install(".[tests]")

    session.run("pytest", *session.posargs)

Now, if there were other environments (like Mypy or docs), I could run only tests using nox --tags tests.

In terms of the number of lines, this is longer than the tox equivalent. But that’s because it’s more explicit and anyone with a passing understanding of Python can deduce what’s happening here – including myself, looking at it in a year. Explicit can be good, actually.
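Because it’s all Python, the version lists themselves could also be derived instead of typed out. The following is only a sketch with hypothetical names (ALL_PYTHONS, COVERAGE_PYTHONS, PLAIN_PYTHONS); deriving the groups from a single list means a version can’t end up in two groups by accident:

```python
# Hypothetical names; the versions are the ones used in the sessions above.
ALL_PYTHONS = ["3.7", "3.8", "3.9", "3.10", "3.11"]

OLDEST, NEWEST = ALL_PYTHONS[0], ALL_PYTHONS[-1]
# Oldest and newest run under coverage ...
COVERAGE_PYTHONS = [OLDEST, NEWEST]
# ... and everything in between runs plain pytest.
PLAIN_PYTHONS = [v for v in ALL_PYTHONS if v not in COVERAGE_PYTHONS]

print(COVERAGE_PYTHONS)  # ['3.7', '3.11']
print(PLAIN_PYTHONS)  # ['3.8', '3.9', '3.10']
```

The two lists could then be passed to the python argument of the respective @nox.session decorators.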

The Power of the Snake

Of course, Nox is a lot more powerful than tox out-of-the-box, courtesy of Python. In the end, you’ve got the whole standard library at your disposal! You can read and write files, create temporary directories, format strings, make HTTP requests, … all without relying on platform features.

With tox, these are things for which you often need to write a shell script (that probably doesn’t work on Windows) and call it from your tox.ini. Because tox doesn’t wrap its calls in a shell (unlike, say, Hatch), you’re pretty limited in what you can do: no pipes, no subcommands, no output redirection.
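As an illustration of the kind of task that would otherwise become an “rm -rf” in a shell script, here is a sketch that uses nothing but the standard library. The helper clean_build_artifacts and the directory names are my assumptions for this example, not something Nox provides:

```python
import shutil
import tempfile
from pathlib import Path


def clean_build_artifacts(root, names=("build", "dist")):
    """Delete build directories below *root*; return the names removed."""
    removed = []
    for name in names:
        target = Path(root) / name
        if target.is_dir():
            shutil.rmtree(target)  # portable stand-in for `rm -rf`
            removed.append(name)
    return removed


# Demonstration in a throwaway directory, so nothing real is touched.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "dist").mkdir()
    print(clean_build_artifacts(tmp))  # ['dist']
```

Inside a session function, such a helper could simply be called before the build step, and it behaves the same on Windows, macOS, and Linux.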

The only (purely ergonomic) downside of Nox is that it forces you to use the non-shell version of subprocess.run(). This can sometimes lead to rather brutalist command lines:

@nox.session(python="3.10")
def docs(session: nox.Session) -> None:
    session.install(".[docs]")

    for cmd in ["html", "doctest"]:
        session.run(
            # fmt: off
            "python", "-m", "sphinx",
            "-T", "-E",
            "-W", "--keep-going",
            "-b", cmd,
            "-d", "docs/_build/doctrees",
            "-D", "language=en",
            "docs",
            "docs/_build/html",
            # fmt: on
        )

    session.run("python", "-m", "doctest", "README.md")

But given the problems of shell-wrapping (cf. Docker or variable name sanitization), this is probably a net positive nevertheless. Even if I had to switch Black off (# fmt: off) so it doesn’t get too gross.
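If the long argument lists bother you, one possible middle ground is to write the static part as a string and tokenize it with shlex.split() from the standard library. This is just a sketch: SPHINX and sphinx_args are hypothetical names, and the flags are copied from the docs session above. No shell is involved at any point.

```python
import shlex

# Hypothetical constant; the flags are copied from the docs session above.
SPHINX = (
    "python -m sphinx -T -E -W --keep-going "
    "-d docs/_build/doctrees -D language=en"
)


def sphinx_args(builder):
    """Build the argument list for one Sphinx builder."""
    # shlex.split() only tokenizes the string; no shell is ever started.
    # The builder is appended as its own list item, never interpolated.
    return [*shlex.split(SPHINX), "-b", builder, "docs", "docs/_build/html"]


print(sphinx_args("html")[:3])  # ['python', '-m', 'sphinx']
```

In the session, the call would then shrink to session.run(*sphinx_args(cmd)).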

Bonus Tip: Python Versions as First-Class Selectors

As James Bennett astutely observed, one cool feature of Nox is that Python versions are first-class selectors for sessions – while for tox it’s just a factor like any other.

That means that you can call “nox --python 3.10” and all sessions that are marked for Python 3.10 run. That’s super useful in CI where you don’t need to map from setup-python’s version numbers (“3.11”) to tox’s environments (py311 – either by hand or using one of tox-gh or tox-gh-actions). For instance with GitHub Actions, you could write:

jobs:
  tests:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.10", "3.11"]

    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: ${{ matrix.python-version }}
      - run: python -m pip install --upgrade nox

      - run: python -Im nox --python ${{ matrix.python-version }}

There’s only one tiny quibble that only affects a select few, but since I’m one of them as usual, I solved it anyway: It’s about Python pre-releases that can be installed using a version specification like “~3.12.0-0”.

Nox doesn’t understand that syntax, so let’s add a little Bash magic: if ${{ matrix.python-version }} starts with a “~”, skip it and keep the following four characters (for example “3.12”), and save the result into an environment variable that we then use when calling Nox:

       - name: Determine Python version for Nox
         run: |
           V=${{ matrix.python-version }}
           if [[ "$V" = ~* ]]; then
             V=${V:1:4}
           fi

           echo NOX_PYTHON=$V >>$GITHUB_ENV

       - run: python -Im nox --python ${{ env.NOX_PYTHON }}
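For comparison, the same normalization can be sketched in Python. The function name nox_python is hypothetical, and I’m assuming the specification always contains at least a major and a minor version. The Bash substring keeps a fixed four characters, which fits versions like 3.12; splitting on dots also copes with single-digit minor versions:

```python
def nox_python(spec):
    """Map a setup-python version spec to a major.minor string."""
    if not spec.startswith("~"):
        return spec  # e.g. "3.11" passes through unchanged
    # "~3.12.0-0" -> "3.12.0-0" -> ["3", "12", "0-0"] -> "3.12"
    major, minor = spec[1:].split(".")[:2]
    return f"{major}.{minor}"


print(nox_python("~3.12.0-0"))  # 3.12
print(nox_python("3.11"))  # 3.11
```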

tox 4 has the concept of selecting single factors using -f; therefore you can run all your 3.10 environments using “tox -f py310”. The version numbers don’t match that well, but they can be made to.

However, it’s not the same! “tox -f py310” will only run environments starting with py310 (e.g. py310 or py310-foo), not everything that is defined to use Python 3.10 (e.g. an environment called docs that sets “basepython = python3.10”).
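The difference is easiest to see in a toy model. The following sketch is not how either tool is implemented; ENVS and both functions are made up for illustration, with environment names mapped to the interpreter version they run on:

```python
# Hypothetical environments: name -> interpreter version they run on.
ENVS = {"py310": "3.10", "py310-foo": "3.10", "py311": "3.11", "docs": "3.10"}


def select_by_factor(envs, factor):
    """Toy version of name-based selection, like `tox -f py310`."""
    return [name for name in envs if factor in name.split("-")]


def select_by_python(envs, version):
    """Toy version of interpreter-based selection, like `nox --python 3.10`."""
    return [name for name, py in envs.items() if py == version]


print(select_by_factor(ENVS, "py310"))  # ['py310', 'py310-foo']
print(select_by_python(ENVS, "3.10"))  # ['py310', 'py310-foo', 'docs']
```

Only the second selection picks up docs, because it looks at the interpreter rather than the name.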

Conclusion

Again, this article is not a call to abandon tox and move all your projects to Nox – I haven’t done that myself and I don’t plan to. But if my issues resonate with you, there’s an option!


  1. At least it’s not YAML.

  2. I did try to pin tox<4 to no avail.

  3. Running code under Coverage slows it down – sometimes considerably.