Do your routine backend code releases require downtime?

Curious to know how many people do zero-downtime deployment of backend code and how many people regularly take their service down, even if very briefly, to roll out new code.

Zero-downtime deployment is valuable in some applications and a complete waste of effort in others, of course, but that doesn't mean people do it when they should and skip it when it's not useful.

32 comments
  • For our batch workflows, we do have downtime on deploys. That's by design, because zero downtime doesn't add any value there. Downtime is usually 5 to 10 minutes. For our services, we rely on lambdas or Kubernetes rolling deployments, so there is no downtime.

  • Whenever possible, I've run projects with zero-downtime deployments: multiple stateless instances behind a load balancer. Deploy one instance at a time, run a health check, and move traffic to the fresh instances. Most cloud providers offer this out of the box. Database migrations are run well in advance. New functionality is hidden behind feature flags.

    Zero downtime is nice, but the real benefit is that the policy forces teams to really think about deployments as migrations.

    Your instrumentation and alerting need to be top-shelf, and you need to automate deployments fully, which means you can also fully automate rollbacks.

    The downside is that you have to build everything twice, deployments are slower, and there is significant descaffolding afterwards.

    But that's a small price to pay not to be on call outside of business hours to deploy.

  • I write data pipeline code and there is zero downtime. We use Kafka to buffer messages from dozens of producers to dozens of consumers on Kubernetes.

  • Yeah, zero downtime. You ship the new features but gate them using some system you can control. Once all the new features are shipped, you turn them up until they reach 100%. This lets you observe their real-world behavior: if they don’t cache well, cause 500s, or what have you, you can turn them off without having to ship new code.

    Also, if you keep all these feature flags around, then when you run into capacity problems you can turn features down for the survival of the service as a whole.
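
For anyone who hasn't seen the percentage-based gating described in the last comment, here is a rough sketch of how it might look. The `FeatureFlags` class, its method names, and the in-memory storage are all made up for illustration; a real setup would presumably read the percentages from a config service or a flag product so they can be changed without a deploy.

```python
import hashlib


class FeatureFlags:
    """Hypothetical in-memory flag store, for illustration only."""

    def __init__(self):
        # flag name -> percentage of users who see the feature (0..100)
        self._rollout = {}

    def set_rollout(self, flag, percent):
        """Turn a feature up or down; 0 acts as a kill switch."""
        if not 0 <= percent <= 100:
            raise ValueError("percent must be between 0 and 100")
        self._rollout[flag] = percent

    def is_enabled(self, flag, user_id):
        """Return True if this user falls inside the flag's rollout."""
        percent = self._rollout.get(flag, 0)  # unknown flags stay off
        # Hash flag + user so each user lands in a stable bucket per flag.
        digest = hashlib.sha256(f"{flag}:{user_id}".encode()).digest()
        bucket = int.from_bytes(digest[:4], "big") % 100
        return bucket < percent


flags = FeatureFlags()
flags.set_rollout("new_checkout", 10)   # start with roughly 10% of users
if flags.is_enabled("new_checkout", user_id="user-42"):
    pass  # serve the new code path
else:
    pass  # serve the old code path
```

Because the bucket is derived from a hash rather than a random draw, a user who sees the feature at 10% keeps seeing it as the percentage goes up, and setting the percentage back to 0 turns it off for everyone without shipping new code.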
