Without orientation, deployments of Python applications can be tiresome and even painful. This talk attempts to replace anxiety and pain with informed annoyance.

This talk is from 2013, and while I stand by everything I said, the world of deployments has changed.

Please have a look at my 2018 talk How to Write Deployment-friendly Applications that was written after the invention of Docker and 12 Factor App.

So far, I’ve given it at PyCon US 2013 and again, in a revised and expanded form, at EuroPython 2013 in Florence.

Slides: PyCon US 2013, EuroPython 2013.

Solid Python Deployments for Everybody – EuroPython 2013

Intro

Is all fun and game until you are need of put it in production.

DevOps Borat, Tweet
  • Me on Twitter and GitHub.
  • My generous employer Variomedia AG. Get your domains there if you speak German.
  • The concept of “Simple != Easy” is excellently elaborated by Rich Hickey, the inventor of Clojure, in this InfoQ talk. He did a shorter version at RailsConf 2012.
  • The origin of Dijkstra’s quote is EWD 498.

Development

  • If you don’t know about Vagrant, you have to look into it now. PyCharm even has direct support for this style of development.

  • If you’re on a Mac, you need Dash.app. It’s all your APIs just a key press away.

    You press your global hotkey and start typing. Dash searches your docsets and presents the results:

    Dash.app in action.

    I seriously couldn’t live without it anymore. And for all your Python API docs needs there’s doc2dash by yours truly: convert any Sphinx or pydoctor (Twisted!) docs into a Dash docset.

    There are alternatives for Linux, Windows, and Emacs.

Ancient Python

  • The mentioned requests “bug” on GitHub.
  • If you don’t want to build it yourself, others already have.
  • Red Hat even officially unveiled a collection of up-to-date software for RHEL 6, including Python 2.7 and 3.3.
  • If you can’t install anything on your servers but don’t want to change your job, you may want to look into PyRun.
  • The solution to corporate bullshit.

Virtual Environments

Before you start, add

[install]
download-cache = ~/.pip/download_cache

to your ~/.pip/pip.conf so each version gets downloaded only once. Alternatively, use the --download/--find-links combo. Since PyPI moved to a CDN, do not use the pip option --use-mirrors/-M anymore: it’s slower.

  • Your tools of the trade: virtualenv, virtualenvwrapper and pip.
  • I used activate to keep my slides short, but don’t use it yourself. Just use direct paths to the Python tools inside of the virtualenv directory.
  • Do run a private PyPI mirror. For example devpi. Don’t make the success of your deployments dependent on a third party.
  • py.test is my favorite testing tool.
  • Vincent goes into further detail on why you should pin your packages hard.
  • He also wrote pip-tools to keep your virtualenv up-to-date. Recently, pip gained the list -o option that serves a similar purpose.
  • Semantic Versioning, or SemVer, is a set of promises about how version numbers affect compatibility. It’s well-intentioned but nothing to rely on.
  • Mozilla Services doesn’t like virtualenvs, but they still don’t use packages from their distribution, pin their dependencies and have only one Python app per server.
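
Hard pinning is easy to check mechanically. As a minimal sketch, assuming a plain requirements file with one requirement per line, the hypothetical helper `unpinned_requirements` below reports every entry that isn’t pinned with `==`. It deliberately ignores pip options, URLs, and environment markers, and the package versions in the sample are made up for illustration.

```python
import re

# A pinned line looks like "name==1.2.3", optionally with extras
# such as "name[extra]==1.2.3". Anything else counts as unpinned.
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[A-Za-z0-9._,\s-]+\])?==[^=\s]+$")


def unpinned_requirements(text):
    """Return the requirement lines in `text` that are not pinned with `==`."""
    loose = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not PINNED.match(line):
            loose.append(line)
    return loose


# Hypothetical requirements file; names and versions are illustrative only.
example = """\
Django==1.5.1
requests>=1.2
pytest
structlog==0.1.0  # pinned, fine
"""
print(unpinned_requirements(example))  # ['requests>=1.2', 'pytest']
```

Something like this could run in your build so that a loose requirement fails early instead of surprising you in production.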

Packaging

  • My blog post on Python deployments with native packages.
  • Fabric & git are awesome. But IMHO not for deployments.
  • FPM makes building native packages a non-issue.
  • It’s awesome to abuse your building pipeline and compile LESS and SASS to CSS, and CoffeeScript to JavaScript. How about collecting your Django static files and building your translation catalogs?
  • Afterward, use YUI Compressor to make the resulting files smaller and add cache busting tags so they get loaded only once.
  • parcel tries to come up with a generic solution based on my blog article.
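
The cache busting step mentioned above can be as simple as putting a content hash into the file name at build time. This is a minimal sketch, not what any particular tool does: `busted_name` is a hypothetical helper, and both the eight-character digest and the `name.<hash>.ext` naming scheme are assumptions.

```python
import hashlib
from pathlib import PurePosixPath


def busted_name(path, content):
    """Return `path` with a short hash of `content` embedded in the file name."""
    digest = hashlib.sha256(content).hexdigest()[:8]
    p = PurePosixPath(path)
    # "static/app.css" becomes "static/app.<digest>.css"
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))
```

Because the name changes only when the content changes, browsers can cache the file indefinitely and still pick up every new release.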

Alternative Approaches

  • Dan Bravender disagrees politely with me.
  • Some people swear by buildout. It seems like a valid alternative for building virtualenvs before packaging them. Using it in place on production servers suffers from the usual shortcomings of too many moving parts.
  • YADT is also an ambitious approach to deployments.

Configuration Management

Pick your poison, they’re all complex and painful. You’ll have to plan ahead. And then realize you planned wrong and refactor.

  • Ansible, Python-based, gaining a lot of traction lately.
  • CFEngine
  • BCFG2, Python-based but XML-infested.
  • Chef, config reeks of Ruby. Kate Heddleston did a talk about it in the PyCon US slot right after mine.
  • Puppet, config reeks of Ruby.
  • Salt, Python-based.

Be Paranoid

  • A 0day is something that can hit you any day. Be prepared.
  • The PSF MoinMoin attack.
  • authbind allows user processes to bind to privileged ports.
  • There’s a plethora of worker APIs, many of them lightweight. Put your privileged code into small single-purpose chunks of code: celery, RQ, zerorpc, AMP, Perspective Broker.
  • Make sure your SSL is configured adequately.
  • UFW makes iptables configuration more approachable. See CVE-2013-1899 for why you want to be very strict about the accessibility of your ports.
  • UNIX file sockets with restrictive permissions are your friends. And you can stop coming up with port numbers.
  • Be as strict as possible with database permissions.
  • Use fail2ban to block out brute forcers. You’ll be surprised how many there are.
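
To make the point about UNIX file sockets concrete, here is a minimal sketch of binding one with restrictive permissions from Python. The helper name `bind_unix_socket`, the default mode `0o660`, and the backlog of 16 are assumptions for illustration. In practice your WSGI container usually creates the socket for you and you only configure its path and permissions.

```python
import os
import socket
import stat


def bind_unix_socket(path, mode=0o660):
    """Bind a listening UNIX stream socket at `path` with file mode `mode`."""
    if os.path.exists(path):
        os.unlink(path)  # remove a stale socket left over from a previous run

    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    # With umask 0o177 the socket file is created with at most 0600,
    # so it is never world-accessible, not even for a moment.
    old_umask = os.umask(0o177)
    try:
        sock.bind(path)
    finally:
        os.umask(old_umask)

    os.chmod(path, mode)  # then open it up exactly as far as intended
    sock.listen(16)
    return sock
```

With mode `0o660`, only the owner and the group of the socket file (for example the group your web server runs as) can connect. `stat.S_IMODE(os.stat(path).st_mode)` lets you verify the result.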

screen Is Not a Process Manager

  • Obviously, use tmux. But not for your server processes.
  • We run Ubuntu and thus Upstart serves us well. RHEL 6 and its clones ship it too. If you’re on Fedora for some weird reason, you should look into systemd which will also be part of RHEL 7.
  • supervisord is a pure Python solution that works mostly fine. Useful in heterogeneous environments.
  • Mozilla’s/Tarek’s Circus is a new take on process management with additional features like socket sharing.
  • A comprehensive comparison of various process/daemon managers. They even mention screen/tmux!
  • Log to stderr, let your WSGI container/twistd take care of sending it to syslog with your program name, filter using syslog, rotate using logrotate. Consider centralized logging. I wrote more about it on my blog.
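
On the application side, “log to stderr” needs very little code. The following is a minimal sketch using only the standard library. The function name `configure_logging` and the log format are assumptions; the format leaves out timestamps on the assumption that syslog adds its own.

```python
import logging
import sys


def configure_logging(level=logging.INFO):
    """Send all log records to stderr and nowhere else."""
    handler = logging.StreamHandler(sys.stderr)
    # No timestamp here: syslog is expected to add one.
    handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))

    root = logging.getLogger()
    root.handlers[:] = [handler]  # replace whatever was configured before
    root.setLevel(level)
    return root
```

Call `configure_logging()` once at startup. The application then knows nothing about log files, rotation, or syslog, and all of that stays in the hands of the process manager and the operating system.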

You May Not Need an Attack Helicopter

Using Apache…eh…Apache httpd with mod_wsgi is fine. But:

  • nginx will probably solve all your web server needs. And:
  • gunicorn or uwsgi give you great lightweight WSGI serving.
  • If you go for gunicorn, you may want to look into Unicorn Herder too.
  • nginx has some ghetto load balancing built in. But if you’re serious about load balancing, you’ll want to look into HAProxy.
  • uwsgi is what we use and its Emperor Mode is awesome. It allows app reloading without losing requests.

If you want to use Apache for good reasons, Graham did a great talk on the topic.
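
Whichever container you choose, what it serves is a plain WSGI callable. As a minimal sketch, assuming nothing beyond the WSGI interface itself, this is about the smallest application that gunicorn or uwsgi can run. The response text is of course made up.

```python
def application(environ, start_response):
    """Smallest useful WSGI app: answer every request with plain text."""
    body = b"Hello from behind nginx\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

Saved as a hypothetical `hello.py`, this could be served with `gunicorn hello:application`. A real project would point the container at the WSGI callable of its framework instead.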

Monitoring

  • There is a whole meme about monitoring sucking. Suck it up.
  • Pingdom is your hosted gateway drug into monitoring. It’s also great later on to monitor the reachability of your network.
  • nagios is the open source go-to solution for monitoring. It is flexible and pretty terrible to configure. In the long run, though, you will need something that runs inside your network to monitor your private services.
  • Icinga is a shiny fork of nagios; the differences are summed up in this video.
  • Riemann is a monitoring solution that’s also intended to collect metrics.

Measure

Hosted

Easier to set up, but pricey in the long run if you collect a lot of metrics, which you should.

  • Librato Metrics is hosted and easy to use, and its graphs look delightful.
  • StatHat is similar but not as shiny.

Self-Hosted

  • We like to collect server stats with collectd and put them into Graphite.
  • scales makes it easy to add metrics to any Python app. Later you can introspect them using a web interface or forward them to Graphite.
  • etsy’s statsd can be used to aggregate your stats before sending them to Graphite.
  • tasseo helps you to create dashboards with Graphite.
  • I have written a simple shell script to package Graphite using fpm.
  • If you wrangle clusters and grids, you should look into Ganglia.
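
Feeding statsd from your own code takes very little. statsd accepts plain-text datagrams over UDP, and a counter increment has the form `name:value|c`. The sketch below hand-rolls that instead of using a client library. The function names are hypothetical, and the host and port defaults assume a statsd listening locally on its usual port 8125.

```python
import socket


def format_counter(name, value=1):
    """Encode a counter increment in statsd's plain-text wire format."""
    return f"{name}:{value}|c".encode("ascii")


def send_counter(name, value=1, host="127.0.0.1", port=8125):
    """Fire and forget a counter over UDP; metrics must never break the app."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(format_counter(name, value), (host, port))
    except OSError:
        pass  # losing a data point is better than failing a request
    finally:
        sock.close()
```

A call such as `send_counter("app.logins")` costs one UDP packet and never blocks on the metrics server. That is what makes it cheap enough to sprinkle all over an application.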

Additional Resources

  • A lot of the talk is based on two of my blog posts: the aforementioned one on native packages and a second one on Python Deployment Anti-Patterns, which even made it into Hacker Monthly. Yeah, I’m proud.
  • I really like the book Release It by Michael T. Nygard which talks about solid systems in general.
  • Jacob Kaplan-Moss, who brought us Django and is an accomplished gentleman of awesomeness in general, did a Django deployment workshop at PyCon 2010 which is full of valuable information.
  • He also did a comprehensive talk called “How I Learned To Stop Worrying and Love Deployment” at Django LA. It is similar to mine in some parts and can be found on YouTube in 6 parts: 1, 2, 3, 4, 5 and 6.
  • Prolific as he is, he also did an excellent introductory talk on web security at PyCon AU 2013 called “Building secure web apps: Python vs the OWASP Top 10”. I disagree on only one point: I don’t think it’s ಠ_ಠ-worthy that there’s nothing like flask-sslify for Pyramid. That kind of logic belongs in the frontend proxy, and Pyramid just tends to force one to follow best practices.
  • Another esteemed gentleman of great wisdom is Glyph Lefkowitz of Twisted fame and he did a DjangoCon US keynote in 2011 which disagrees with some of my points but offers a really interesting perspective.
  • A lot of things I say go along with The Twelve-Factor App.

Design Credits