Codecov’s unreliability breaking CI on my open source projects has been a constant source of frustration for me for years. I have found a way to enforce coverage over a whole GitHub Actions build matrix that doesn’t rely on third-party services.

Coverage.py has had an option to fail the call to coverage report for a while: either the setting fail_under=XXX or the command-line option --fail-under=XXX, where XXX can be whatever percentage you like.

The only reason why I’ve traditionally used Codecov (including in my Python in GitHub Actions guide) is that I need to measure coverage over multiple Python versions that run in different containers. Therefore I can’t just run coverage combine; I need to store the data files somewhere between the various build matrix items.
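For example, the threshold can live in your pyproject.toml, so a plain coverage report enforces it (a sketch; pick whatever percentage you actually want to enforce):

```toml
[tool.coverage.report]
# Make `coverage report` exit with a non-zero status
# if total coverage is below this percentage.
fail_under = 100
```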

Unfortunately, Codecov has grown very flaky. I have lost all confidence in it: when it fails a build, my first reaction is to restart the build, and only then to investigate. Sometimes the upload fails, sometimes Codecov fails to report its status back to GitHub, sometimes it can’t find the build, and sometimes it reports an outdated status. What a waste of computing power. What a waste of my time, clicking through their web application, seeing everything green, yet having the build fail due to missing coverage.

When I complained about this once again and even sketched out my idea of how it could work, I was told that the cookiecutter-hypermodern-python project has already been doing it¹ and that there’s a GitHub Action taking the same approach!

So I removed Codecov from structlog and it’s glorious! Not only did I get rid of a flaky dependency, it also simplified my workflow. The interesting parts are the following:

After running the tests under coverage in --parallel mode, upload the files as an artifact:

      - name: Upload coverage data
        uses: actions/upload-artifact@v2
        with:
          name: coverage-data
          path: ".coverage.*"
          if-no-files-found: ignore

You need this for every item in your build matrix whose coverage you want to take into account.

I use if-no-files-found: ignore, because I don’t run all Python versions under coverage. It’s much slower and I don’t need every Python version to ensure 100% coverage.
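For reference, the test step that produces those data files might look something like this (a sketch assuming pytest; the exact invocation depends on your project):

      - name: Run tests under coverage
        run: |
          python -m pip install --upgrade coverage[toml] pytest
          # --parallel-mode writes a uniquely named .coverage.* data file,
          # so files from different matrix items don't clobber each other.
          python -m coverage run --parallel-mode -m pytest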

After all tests passed, add a new job:

    name: Combine & check coverage.
    runs-on: ubuntu-latest
    needs: tests

    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-python@v2
        with:
          # Use latest, so it understands all syntax.
          python-version: "3.10"

      - run: python -m pip install --upgrade coverage[toml]

      - name: Download coverage data.
        uses: actions/download-artifact@v2
        with:
          name: coverage-data

      - name: Combine coverage & fail if it's <100%.
        run: |
          python -m coverage combine
          python -m coverage html --skip-covered --skip-empty
          python -m coverage report --fail-under=100

      - name: Upload HTML report if check failed.
        uses: actions/upload-artifact@v2
        with:
          name: html-report
          path: htmlcov
        if: ${{ failure() }}

It sets “needs: tests” to ensure all tests are done. If the job that runs your tests has a different name, you will have to adapt this. Check out the full workflow if you’re unsure where exactly to put the snippets.
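Putting it together, the overall shape of the workflow file might look like this (a sketch; the job names, Python versions, and matrix are placeholders):

    jobs:
      tests:
        runs-on: ubuntu-latest
        strategy:
          matrix:
            python-version: ["3.8", "3.9", "3.10"]
        steps:
          # ... run the tests under coverage in --parallel mode,
          # then the "Upload coverage data" step ...

      coverage:
        name: Combine & check coverage.
        runs-on: ubuntu-latest
        needs: tests  # wait until every matrix item has finished
        steps:
          # ... the download, combine, and report steps ...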

It downloads the coverage data that the tests uploaded as artifacts, combines it, creates an HTML report, and finally checks whether coverage is 100% – failing the job if it is not. If – and only if! – this step fails (presumably due to a lack of coverage), it also uploads the HTML report as an artifact.

Once the workflow is done², you can download the HTML report from the bottom of the workflow summary page:

The workflow summary with the HTML report at the bottom.

If you’d like a coverage badge, check out Ned’s guide: Making a coverage badge.

  1. They still upload the combined coverage to Codecov, which can be useful for the web interface. I personally don’t want this to be part of my CI. ↩︎

  2. And only then! I’ve spent a not insignificant amount of time hunting for artifacts that didn’t appear because some job was still running. ↩︎