Proper cleanup when terminating your application isn’t less important when it’s running inside a Docker container. Although it only comes down to making sure signals reach your application and handling them, there are plenty of things that can go wrong.
In principle it’s really simple: when you – or your cluster management tool – run docker stop, Docker sends a configurable signal to the entrypoint of your application, with SIGTERM being the default. While dockerizing my applications I’ve identified four traps (but stepped only into three of them!) that I’d like to share in order to save you some time when investigating.
1. You’re Using the Wrong ENTRYPOINT Form
Dockerfiles allow you to define your entrypoint using the seductively convenient shell form:
ENTRYPOINT /app/bin/your-app arg1 arg2
or the slightly more obnoxious exec form that makes you supply the command line as a JSON array:
ENTRYPOINT ["/app/bin/your-app", "arg1", "arg2"]
Long story short: always use the latter exec form. The shell form runs your entrypoint as a subcommand of
/bin/sh -c, which comes with a load of problems – notably that your application will never see a single signal.
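You can watch this happen without Docker; a quick sketch of what the shell form effectively executes, with sleep standing in for your app:

```shell
# The shell form wraps your command line in `sh -c`. Inside the
# container, the shell gets PID 1 and your app runs as its child --
# two different processes, and sh does not forward SIGTERM:
sh -c 'echo "shell PID: $$"; sleep 1 & echo "app PID:   $!"; wait'
```

The two PIDs printed differ: signals sent to the first process never reach the second.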
2. Your Entrypoint Is a Shell Script and You Didn’t exec
If you run your application from a shell script the regular way, your shell spawns your application in a new process and you won’t receive signals from Docker.
What you need to do is to tell your shell to replace itself with your application. For that exact purpose, shells have the exec command (the similarity to the exec form above is no accident: both are named after the exec family of system calls, which replace the current process image).
So instead of
/app/bin/your-app
write
exec /app/bin/your-app
in your entrypoint script.
Corollary: You Did exec But You Tricked Yourself By Starting a Subshell
I was always a big fan of runit’s logging behavior where you just log without timestamps to stdout, and runit will prepend all log entries with a tai64n timestamp1.
A naïve implementation in a shell entrypoint might look like this:
exec /app/bin/your-app | tai64n
However, this causes the exec to happen inside a subshell: every part of a pipeline runs in its own child process, so the entrypoint shell lives on as the parent – with the usual consequence: no signals for you.
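You can verify that the exec never replaced your entrypoint shell; in this sketch, the line after the pipeline still runs, which would be impossible if exec had worked as intended:

```shell
#!/bin/sh
# Each side of a pipeline runs in a subshell; `exec` replaces that
# subshell, not this script. Proof: the script happily continues.
exec echo "app output" | cat
echo "still alive -- exec only replaced a subshell"
```

Running this prints both lines: the shell survived its own exec.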
Bonus Best Practice: Let Someone Else Be PID 1
If your application is the entrypoint for the container, it becomes PID 1. While that’s certainly glamorous, it comes with a bunch of responsibilities and edge cases that you may not have anticipated and that are closely related to signal handling – notably the reaping of orphaned zombie processes.
Fortunately for the rest of us, people who do anticipate them have written software so we don’t have to.
The most notable one is tini, which has been knighted by being merged into Docker 1.13 and is packaged by Ubuntu as of 20.04 LTS (Focal Fossa). Technically, you can use it transparently by passing --init to your docker run command. In practice you often can’t, because your cluster manager doesn’t support it.
The other one is dumb-init by Yelp. A nice touch for Python developers is that they can install it from PyPI too.
tini and dumb-init are also able to proxy signals to process groups, which technically allows you to pipe your output. However, your pipe target receives the signal at the same time, so you can’t log anything on cleanup lest you crave race conditions and SIGPIPEs. Thus it’s better to leave this can of worms closed. Turns out, writing a properly working process manager is far from trivial.
All that to say that my entrypoints look like this:
ENTRYPOINT ["/tini", "-v", "--", "/app/bin/docker-entrypoint.sh"]
3. You’re Listening for the Wrong Signal
Despite the prevalence of SIGTERM in process managers, many frameworks expect the application to be stopped using SIGINT aka Control-C. Especially in the Python ecosystem it’s common to do a

try:
    do_work()
except KeyboardInterrupt:
    cleanup()
and call it a day. Without any further action,
cleanup() is never called if your application receives a
SIGTERM. To add insult to injury, if you’re PID 1, literally nothing happens until Docker loses its patience2 with your container and sends a
SIGKILL to the entrypoint.
So if you’ve ever wondered why your
docker stop takes so long – this might be the reason: you didn’t listen for
SIGTERM and the signal bounced off your process because it’s PID 1. No cleanup, slow shutdown.
The easiest fix is a single line in your Dockerfile:
STOPSIGNAL SIGINT
But you should really try to support both signals and avoid being PID 1 in the first place.
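If your entrypoint is a shell script, supporting both signals is a matter of one trap line; a minimal sketch with an illustrative cleanup function:

```shell
#!/bin/sh
# Sketch: react to both SIGTERM (docker stop) and SIGINT (Ctrl-C).
cleanup() {
    echo "cleaning up"
    kill "$child" 2>/dev/null  # stop the workload, too
    exit 0
}
trap cleanup TERM INT

echo "working"
# Run the workload in the background and `wait` for it: unlike a
# foreground command, `wait` returns promptly when a trapped signal
# arrives, which lets the cleanup actually run.
sleep 30 &
child=$!
wait "$child"
```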
- Use the exec/JSON array form of ENTRYPOINT.
- exec in shell entrypoints.
- Don’t pipe your application’s output.
- Avoid being PID 1.
- Listen for SIGTERM and SIGINT, or set STOPSIGNAL in your Dockerfile.
1. Sadly I’ve learned that tai64n is slightly problematic because timezones are even slightly worse than I thought.
2. The amount of patience is configurable using the --time option of docker stop.