Safe Deployments with Temporal Worker Versioning on Kubernetes

April 7, 2026 · 60 min
Temporal Kubernetes Worker Versioning

What We’re Building

I recently refactored some personal projects to use Temporal as the orchestrator. While updating the pipelines that handle my deployments and testing things (i.e., playing whack-a-mole with bugs I created as I learned a new SDK), I ran into a fairly common problem in Temporal when making changes to your code base: determinism errors for in-flight workflows. That led to a deep dive into resetting event history, terminating my workflows before deployments, code patching / versioning…

Crazy mental connections I now needed to maintain…

Eventually, though, I started digging to see whether there was a way to manage this without juggling multiple live versions of the workflow code. Enter the Temporal Worker Controller. This post walks through an example exercise to build an understanding, at least at a high level, of the problem the controller solves and the basic deployment pattern.

Picture this: you have an agentic workflow making expensive LLM calls. Each research step costs real money and takes real time. Three steps in, your workflow has burned through a few dollars of compute. Then someone deploys a code change. The in-flight workflow hits a non-determinism error and stops dead.

Your options? Reset the workflow to a point before the change. Those LLM calls you already paid for? Gone. That work? Wasted.

There has to be a better way. And there is.

In this lab we’ll build a research agent workflow, break it with a naive deployment, feel that pain firsthand, and then fix it with Temporal’s Worker Versioning on Kubernetes. By the end you’ll understand how to safely roll out workflow code changes without patching, without losing in-flight work, and without any of the usual deployment anxiety.

Here’s what we’ll cover:

| Section | What You’ll Learn |
| --- | --- |
| The Research Agent | Build and deploy a multi-step agentic workflow |
| The Breaking Change | See a non-determinism error destroy a running workflow |
| Worker Versioning | Pin workflows to their version so old and new coexist |
| Long-Running Workflows | Migrate a polling workflow to a new version with continue-as-new |
| Sleeping Workflows | Handle the caveat where idle workflows don’t notice version changes |
| Scaling & Ramping | Understand the full HA story with progressive rollouts |

What You’ll Need

  • Docker (make sure it’s running)

That’s it. Everything else (Python, uv, kubectl, Helm, k3d, Temporal CLI) runs inside a dev environment container. Nothing gets installed on your machine.

Setup

Clone the starter branch:

git clone -b starter https://github.com/mikeacjones/temporal-safe-deploys-lab
cd temporal-safe-deploys-lab

Start the Dev Environment

The repo includes a containerized dev environment with everything pre-installed: Python, uv, kubectl, Helm, k3d, and the Temporal CLI. One command gets you a fully working environment:

./scripts/devenv.sh

This builds the dev container, then automatically:

  1. Installs Python dependencies
  2. Creates a k3d cluster (Kubernetes running inside Docker) with a local container registry
  3. Starts the Temporal dev server in a background screen session

When it’s done, you’ll be dropped into a bash shell with everything ready. The Temporal UI is at localhost:8233.

To check on the Temporal server logs, run screen -r temporal (Ctrl+A then D to detach back to your shell).

When you’re done with the lab, just exit the container. The k3d cluster, registry, and Temporal server are cleaned up automatically. Your system goes back to how it was.

Quick sanity check that the cluster is up:

kubectl get nodes

You should see a single node in Ready state.

The Research Agent

Let’s walk through what the starter code gives you.

The Workflow

src/workflows.py defines a ResearchWorkflow that runs a multi-step research pipeline:

@workflow.defn
class ResearchWorkflow:
    @workflow.run
    async def run(self, topic: str) -> str:
        # Phase 1: Plan the research
        plan = await workflow.execute_activity(
            plan_research, topic,
            start_to_close_timeout=timedelta(seconds=30),
        )

        # Phase 2: Execute each research step (the expensive part)
        findings: list[StepFindings] = []
        for step in plan.steps:
            result = await workflow.execute_activity(
                execute_research_step, step,
                start_to_close_timeout=timedelta(seconds=30),
            )
            findings.append(result)

        # Wait for proceed signal before synthesis
        await workflow.wait_condition(lambda: self._proceed)

        # Phase 3: Synthesize findings
        summary = await workflow.execute_activity(
            synthesize_findings, args=[topic, findings],
            start_to_close_timeout=timedelta(seconds=30),
        )

        return summary.summary

The activities in src/activities.py wrap mocked LLM calls in src/mock_llm.py. Each execute_research_step call sleeps for 5 seconds to simulate an expensive API call. Three steps means about 15 seconds of “LLM compute” per workflow. After the research steps complete, the workflow waits for a proceed signal before running synthesis. This signal gate lets you pause the workflow at a controlled point, which is going to be important in the next section.

There’s also a fact_check activity already defined in the starter code. We’re not using it yet, but it’ll be important later.
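For reference, here’s a rough, standalone sketch of what the mocked execute_research_step activity might look like. The delay parameter and the exact return shape are assumptions for illustration; in the lab’s src/activities.py the function is decorated with @activity.defn from the Temporal SDK, which is omitted here so the sketch runs on its own.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class StepFindings:
    step: str
    findings: str


async def execute_research_step(step: str, delay: float = 5.0) -> StepFindings:
    """Simulate an expensive LLM call by sleeping, then return mock findings."""
    await asyncio.sleep(delay)  # stands in for real API latency
    return StepFindings(step=step, findings=f"Mock findings for: {step}")


if __name__ == "__main__":
    result = asyncio.run(execute_research_step("quantum computing", delay=0.1))
    print(result.findings)
```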

The Worker

src/worker.py is a straightforward Temporal worker. It registers both workflows and all four activities (including fact_check) on the research-agent task queue. Nothing fancy here.

Deploy V1

Build the Docker image and deploy to your k3d cluster:

./scripts/build-and-deploy.sh v1

This builds the image, pushes it to your local registry, and deploys it as a standard Kubernetes Deployment via Helm. Check it’s running:

kubectl get pods

You should see a pod for research-agent in a Running state.

Run a Workflow

Start a research workflow:

temporal workflow start \
  --task-queue research-agent \
  --type ResearchWorkflow \
  --input '"quantum computing"' \
  --workflow-id research-test-1

Head to the Temporal UI at localhost:8233. You’ll see the workflow has completed the plan activity and is working through the research steps (~5 seconds each). Once all three steps finish, the workflow waits for the proceed signal. Send it:

temporal workflow signal \
  --workflow-id research-test-1 \
  --name proceed

Now watch the workflow complete with the synthesis activity.

The Breaking Change

Product wants each research step fact-checked before moving on to the next one. Totally reasonable. Let’s add it.

Open src/workflows.py and add the fact_check activity call inside the research loop, right after each step. First, update the import at the top of the file to include fact_check:

with workflow.unsafe.imports_passed_through():
    from src.activities import (
        execute_research_step,
        fact_check,
        plan_research,
        synthesize_findings,
    )

Then modify the research loop to fact-check each step’s findings before moving on:

        findings: list[StepFindings] = []
        for step in plan.steps:
            result = await workflow.execute_activity(
                execute_research_step, step,
                start_to_close_timeout=timedelta(seconds=30),
            )
            await workflow.execute_activity(
                fact_check, result,
                start_to_close_timeout=timedelta(seconds=30),
            )
            findings.append(result)

Now here’s where things go wrong. Start a V1 workflow and let it run through the research steps:

temporal workflow start \
  --task-queue research-agent \
  --type ResearchWorkflow \
  --input '"artificial intelligence"' \
  --workflow-id research-test-2

Check the Temporal UI. The workflow runs the plan, then the three research steps (~15 seconds), and then waits for the proceed signal. All that expensive LLM work is done and recorded in the history.

Now, while it’s paused at the signal gate, deploy V2:

./scripts/build-and-deploy.sh v2

Wait for the new pod to be running (kubectl get pods), then send the proceed signal:

temporal workflow signal \
  --workflow-id research-test-2 \
  --name proceed

What happens: the V2 worker picks up the workflow task and replays the history. Plan matches. Step 1 matches. Then V2’s code says “schedule fact_check” but the V1 history says “schedule step 2” at that position. The commands don’t match the events. Non-determinism error.
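To make the mismatch concrete, here’s a toy model of replay in plain Python, not the SDK: the recorded V1 history is a list of scheduled activities, and replaying the V2 code regenerates commands and compares them position by position.

```python
# Recorded V1 history: plan, then three research steps (no fact checks).
V1_HISTORY = ["plan_research", "execute_research_step",
              "execute_research_step", "execute_research_step"]


def v2_commands(num_steps: int = 3):
    """The order of activity commands the V2 workflow code would issue."""
    yield "plan_research"
    for _ in range(num_steps):
        yield "execute_research_step"
        yield "fact_check"  # the new call V1's history knows nothing about


def replay(history, commands) -> str:
    """Compare recorded events against regenerated commands, like replay does."""
    for i, (event, command) in enumerate(zip(history, commands)):
        if event != command:
            return (f"non-determinism at position {i}: "
                    f"history recorded {event!r} but code scheduled {command!r}")
    return "replay OK"


print(replay(V1_HISTORY, v2_commands()))
```

Positions 0 (plan) and 1 (step 1) match; at position 2 the new code wants fact_check while the history says execute_research_step, and replay fails there.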


The workflow is stuck. Your options:

  1. Reset the workflow to a point before the code divergence. But those three execute_research_step activities already ran. In production, those were real LLM API calls. Real money. Gone.
  2. Use patching to branch the code so old and new workflows both replay correctly. Patching is a solid tool when you need it, but it requires you to maintain both code paths until all old workflows complete. For frequent deploys on long-running workflows, that’s a lot of branches to juggle.

Both options work. But there’s a third option that sidesteps the problem entirely: what if old workflows could just finish on old workers, while new workflows start on new workers? That’s Worker Versioning.

Worker Versioning

The core idea is simple: multiple versions of your worker run simultaneously on the same task queue. The Temporal server routes tasks to the right version. Workflows that started on V1 stay on V1 workers. New workflows go to V2. No conflicts, no non-determinism errors, no lost work.

One thing to call out: all versions share the same task queue. You don’t need separate queues. The Temporal server handles the routing based on the build ID each worker registers with.

On Kubernetes, the Temporal Worker Controller automates all of this. It’s a Kubernetes operator that watches TemporalWorkerDeployment resources and manages the full version lifecycle: spinning up new versions, draining old ones, and cleaning up when they’re done. The source is at temporalio/temporal-worker-controller and the Helm charts are published to Docker Hub as OCI artifacts.

When you update the image tag on a TemporalWorkerDeployment, the controller spins up new pods alongside old pods. Old pods drain their work and eventually get cleaned up. This is sometimes called “rainbow deployments” because you can have many versions running at once.
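A toy model of that routing, just to make the behavior concrete (plain Python, not the Temporal API; in reality the server does this for you):

```python
class TaskQueue:
    """Sketch of server-side routing with versioning on: one queue, many builds."""

    def __init__(self, current_build: str):
        self.current_build = current_build
        self.pinned: dict[str, str] = {}  # workflow_id -> build_id

    def start_workflow(self, workflow_id: str) -> str:
        # New workflows are pinned to whatever build is current right now.
        self.pinned[workflow_id] = self.current_build
        return self.current_build

    def route_task(self, workflow_id: str) -> str:
        # Subsequent tasks stay on the pinned build, even after a deploy.
        return self.pinned[workflow_id]


q = TaskQueue(current_build="v1")
q.start_workflow("research-test-a")   # pinned to v1
q.current_build = "v2"                # deploy V2: the current version changes
q.start_workflow("research-test-b")   # a new workflow is pinned to v2

assert q.route_task("research-test-a") == "v1"  # old work stays on v1 workers
assert q.route_task("research-test-b") == "v2"
```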

Install the Worker Controller

Let’s get it set up. Install the CRDs and controller:

helm upgrade --install temporal-worker-controller-crds \
  oci://docker.io/temporalio/temporal-worker-controller-crds \
  --namespace temporal-system \
  --create-namespace \
  --wait

helm upgrade --install temporal-worker-controller \
  oci://docker.io/temporalio/temporal-worker-controller \
  --namespace temporal-system \
  --wait

Next, deploy a TemporalConnection resource so the controller knows how to reach our dev server. There’s a Helm chart for this in the repo:

helm upgrade --install lab-infra ./charts/infra --namespace default

Take a look at charts/infra/templates/temporal-connection.yaml if you’re curious. It points at temporal-lab-devenv:7233 which is the dev environment container’s hostname on the shared Docker network. The k3d cluster and devenv container are on the same network, so the pods can reach Temporal by container name.

Verify everything is in place:

kubectl get crds | grep temporal
kubectl get temporalconnections

You should see temporalworkerdeployments.temporal.io and temporalconnections.temporal.io in the CRDs, and a lab-temporal connection resource.

Configure the Worker

Let’s add versioning to the worker. Open src/worker.py and add the deployment config. First, the new imports:

import os  # used by the env-var lookups below

from temporalio.common import VersioningBehavior, WorkerDeploymentVersion
from temporalio.worker import Worker, WorkerDeploymentConfig

Then add these constants and the deployment_config parameter to the Worker constructor:

TASK_QUEUE = "research-agent"
DEPLOYMENT_NAME = os.environ.get("TEMPORAL_DEPLOYMENT_NAME", "research-agent")
BUILD_ID = os.environ.get("TEMPORAL_WORKER_BUILD_ID", "local-dev")

# ... inside main():

worker = Worker(
    client,
    task_queue=TASK_QUEUE,
    workflows=[ResearchWorkflow, ResearchPollingWorkflow],
    activities=[
        plan_research,
        execute_research_step,
        synthesize_findings,
        fact_check,
    ],
    deployment_config=WorkerDeploymentConfig(
        version=WorkerDeploymentVersion(
            deployment_name=DEPLOYMENT_NAME,
            build_id=BUILD_ID,
        ),
        use_worker_versioning=True,
        default_versioning_behavior=VersioningBehavior.PINNED,
    ),
)

The Worker Controller injects TEMPORAL_DEPLOYMENT_NAME and TEMPORAL_WORKER_BUILD_ID as environment variables when running inside Kubernetes. The "local-dev" fallbacks are for running outside the cluster.

Notice the default_versioning_behavior=VersioningBehavior.PINNED on the config. This sets the default for all workflows registered on this worker. PINNED means: once a workflow starts on a particular build ID, all of its tasks get routed exclusively to workers running that same build. The code never changes underneath a running workflow.

You can also set the behavior per-workflow with @workflow.defn(versioning_behavior=VersioningBehavior.PINNED) if you need different behaviors for different workflow types. See the Python SDK versioning reference for the full set of options.

The TemporalWorkerDeployment Chart

Take a look at charts/worker/templates/temporal-worker-deployment.yaml to see what the Worker Controller manages:

apiVersion: temporal.io/v1alpha1
kind: TemporalWorkerDeployment
metadata:
  name: research-agent
spec:
  replicas: 1
  workerOptions:
    connectionRef:
      name: lab-temporal
    temporalNamespace: default
  rollout:
    strategy: AllAtOnce
  sunset:
    scaledownDelay: 5s
    deleteDelay: 10s
  template:
    spec:
      containers:
        - name: worker
          image: "k3d-registry.localhost:5050/research-agent:v1-versioned"

The key fields: rollout.strategy controls how new versions are introduced (we’ll cover Progressive later in the ramping section). sunset controls how long old version pods stick around after they stop receiving new work. The controller handles creating versioned Kubernetes Deployments, registering build IDs with Temporal, and cleaning up drained versions automatically.

Test It

First, uninstall the old simple deployment since we’re switching to the Worker Controller:

helm uninstall research-agent -n default

Now rebuild and deploy V1 with versioning enabled:

./scripts/build-and-deploy.sh v1-versioned --versioned

The --versioned flag tells the deploy script to use the TemporalWorkerDeployment chart instead of a plain Kubernetes Deployment.

Head to the Temporal UI at localhost:8233 and check the Deployments tab. You should see your research-agent deployment with build ID v1-versioned listed. This is where you’ll be able to see multiple versions coexisting once we deploy V2.

Now start a workflow, let the research steps complete, then deploy V2 mid-flight:

temporal workflow start \
  --task-queue research-agent \
  --type ResearchWorkflow \
  --input '"machine learning"' \
  --workflow-id research-test-3

Wait for the workflow to reach the signal gate (research steps done), then deploy V2:

./scripts/build-and-deploy.sh v2-versioned --versioned

Check kubectl get pods. You’ll see pods for both versions running simultaneously. Head over to the Deployments tab again, then click on the deployment to see its versions: one current, one draining. Clicking into the draining version, you’ll see the in-flight workflow is still pinned to it. Until that workflow completes, the version stays in the Draining state. If you started a new workflow now, it would route to the current version.

Now signal the V1 workflow:

temporal workflow signal \
  --workflow-id research-test-3 \
  --name proceed

This time the V1 workflow finishes on the V1 pod. Even though V2 contains a determinism-breaking change, there are no errors: the V1 worker keeps handling the workflow it started.

| | Without Versioning | With Versioning |
| --- | --- | --- |
| In-flight V1 workflows | Non-determinism error | Complete normally on V1 |
| New workflows | Start on V2 | Start on V2 |
| LLM spend | Lost on event history reset | Preserved |
| Old workers | Replaced immediately | Drain, then removed |

Long-Running Workflows

So far we’ve been working with the ResearchWorkflow, which starts, does its thing, and completes. Worker Versioning handles that beautifully: V1 workflows finish on V1 workers, V2 workflows start on V2 workers, everyone’s happy.

But what about the ResearchPollingWorkflow? It runs indefinitely, processing research requests as they come in via signals. It never completes. Since our worker is configured with PINNED behavior, this workflow stays on V1 forever. That means the V1 worker pod sticks around forever too, which defeats the purpose of draining old versions.

We need a way for long-running workflows to migrate to new versions. There are two pieces to this, and we’ll tackle them one at a time.

Part 1: Continue-as-New with AUTO_UPGRADE

The starter code already has a basic continue_as_new for history management:

if workflow.info().is_continue_as_new_suggested():
    workflow.continue_as_new(
        PollingState(
            pending_requests=list(self._pending),
            completed_count=state.completed_count,
        )
    )

This is standard Temporal practice: when the event history gets large, the workflow restarts itself with fresh history. But with PINNED behavior, the continued execution stays on the same version. It doesn’t migrate.

The fix is one line. Add initial_versioning_behavior=AUTO_UPGRADE to the continue-as-new call:

if workflow.info().is_continue_as_new_suggested():
    workflow.continue_as_new(
        PollingState(
            pending_requests=list(self._pending),
            completed_count=state.completed_count,
        ),
        initial_versioning_behavior=(
            workflow.ContinueAsNewVersioningBehavior.AUTO_UPGRADE
        ),
    )

Now when the history gets large enough to trigger a continue-as-new, the new execution starts on the current (latest) version instead of staying pinned to the old one. The state carries over. Pending requests and the completed count survive the migration. From the outside it looks like the workflow just kept running, but under the hood it jumped from V1 to V2.

This is the simplest version migration path. It works, but it’s passive. The workflow only migrates when Temporal suggests a continue-as-new, which might not happen for a while if the workflow isn’t doing much. Let’s see that in action.

Try It Out

Start the polling workflow on the current versioned deployment:

temporal workflow start \
  --task-queue research-agent \
  --type ResearchPollingWorkflow \
  --input '{"pending_requests": [], "completed_count": 0}' \
  --workflow-id research-polling

Check the Deployments tab in the Temporal UI. The polling workflow is running on the current version. Now deploy a new version:

./scripts/build-and-deploy.sh v3-versioned --versioned

Check the Deployments tab again. The previous version is now Draining because the polling workflow is still pinned to it. Now send it some work:

temporal workflow signal \
  --workflow-id research-polling \
  --name submit_request \
  --input '{"topic": "quantum computing", "request_id": "poll-test-1"}'

The polling workflow wakes up, processes the research request on the V2 worker, and goes back to sleep. Check the Deployments tab again. It’s still on v2-versioned. It processed the work just fine, but it didn’t migrate because the history isn’t large enough to trigger is_continue_as_new_suggested().

The V2 pod is going to stick around until the history grows enough. For a workflow that processes a few requests a day, that could be a long time.

Part 2: Detecting Version Changes

What if you want the workflow to migrate as soon as it has a chance, not just whenever the history happens to get large?

Temporal provides workflow.info().is_target_worker_deployment_version_changed(). This returns True when the server has set a newer “current” version for this deployment. We can add it to the wait condition so that when the workflow receives new work and wakes up, it checks for a version change and migrates immediately.

First, terminate the running polling workflow so we can redeploy with the updated code:

temporal workflow terminate --workflow-id research-polling

Open src/workflows.py and update the ResearchPollingWorkflow. Add a field to track draining state:

class ResearchPollingWorkflow:
    def __init__(self) -> None:
        self._pending: list[ResearchRequest] = []
        self._draining: bool = False

Now update the main loop to check for version changes:

    @workflow.run
    async def run(self, state: PollingState) -> None:
        self._pending = list(state.pending_requests)

        while True:
            await workflow.wait_condition(
                lambda: (
                    len(self._pending) > 0
                    or workflow.info().is_continue_as_new_suggested()
                    or workflow.info().is_target_worker_deployment_version_changed()
                )
            )

            if (
                workflow.info().is_continue_as_new_suggested()
                or workflow.info().is_target_worker_deployment_version_changed()
            ):
                self._draining = True
                await workflow.wait_condition(workflow.all_handlers_finished)
                workflow.continue_as_new(
                    PollingState(
                        pending_requests=list(self._pending),
                        completed_count=state.completed_count,
                    ),
                    initial_versioning_behavior=(
                        workflow.ContinueAsNewVersioningBehavior.AUTO_UPGRADE
                    ),
                )

            if self._pending:
                request = self._pending.pop(0)
                await workflow.execute_child_workflow(
                    ResearchWorkflow.run,
                    args=[request.topic, False],
                    id=f"research-{request.request_id}",
                )
                state.completed_count += 1

See It Work

Deploy the updated code and start the polling workflow:

Note the 30-second sleep in the command below; the new pod needs to get promoted to the current version before the workflow starts:

./scripts/build-and-deploy.sh v4-versioned --versioned && sleep 30 && \
temporal workflow start \
  --task-queue research-agent \
  --type ResearchPollingWorkflow \
  --input '{"pending_requests": [], "completed_count": 0}' \
  --workflow-id research-polling

Now deploy another new version:

./scripts/build-and-deploy.sh v5-versioned --versioned

Check the Deployments tab. The polling workflow is still on v4-versioned. Now send it a research request:

temporal workflow signal \
  --workflow-id research-polling \
  --name submit_request \
  --input '{"topic": "quantum computing", "request_id": "poll-test-2"}'

This time the workflow wakes up, sees the version change, drains, and continues-as-new onto v5-versioned. Check the Deployments tab again. The workflow has migrated. The old deployment should transition to Drained.

The Draining Pattern

Let’s unpack what happens when a version change is detected:

  1. We set self._draining = True. This is a flag you can check in signal handlers to stop accepting new work during migration.
  2. We wait for workflow.all_handlers_finished(). This ensures any in-flight signal handlers or update handlers complete before we migrate.
  3. We call continue_as_new with initial_versioning_behavior=AUTO_UPGRADE. The new execution starts on the current version.

The combination of these two pieces gives you full control: the AUTO_UPGRADE on continue-as-new ensures the workflow actually moves to the new version (instead of staying pinned), and the is_target_worker_deployment_version_changed() check ensures the workflow migrates promptly when it has work to do, rather than waiting for the history to grow.
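The ordering of that handshake is easy to get wrong, so here’s a small asyncio simulation of it. This is plain Python standing in for the workflow primitives: all_handlers_finished is modeled as a counter, continue_as_new as a flag.

```python
import asyncio


class PollingWorkflowSim:
    """Toy model of the drain handshake: flag, wait for handlers, migrate."""

    def __init__(self):
        self.draining = False
        self.active_handlers = 0
        self.migrated = False

    async def handle_signal(self, work_seconds: float):
        if self.draining:
            return  # refuse new work during migration
        self.active_handlers += 1
        try:
            await asyncio.sleep(work_seconds)  # in-flight handler work
        finally:
            self.active_handlers -= 1

    async def migrate(self):
        self.draining = True
        while self.active_handlers:        # models all_handlers_finished
            await asyncio.sleep(0.01)
        self.migrated = True               # models continue_as_new(AUTO_UPGRADE)


async def main():
    wf = PollingWorkflowSim()
    handler = asyncio.create_task(wf.handle_signal(0.05))
    await asyncio.sleep(0)                 # let the handler start
    await wf.migrate()                     # waits for the handler to finish first
    assert handler.done() and wf.migrated


asyncio.run(main())
```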

Sleeping Workflows

Here’s a gotcha that’s easy to miss.

If the polling workflow is idle, sitting in wait_condition with no pending requests, it’s completely evicted from the worker’s memory. The is_target_worker_deployment_version_changed() flag only gets checked when the workflow executes a Workflow Task. An idle workflow isn’t executing anything. It doesn’t know a new version exists.

You could just wait. The next time a signal comes in (a new research request), the workflow will wake up, check the condition, see the version change, and migrate. Any workflow task will trigger this. But if requests are infrequent, the old worker pod could be sitting around for hours or days waiting for that to happen.

The Wake-Up Signal

The solution: add a dedicated signal that wakes up idle workflows so they can check for version changes.

class ResearchPollingWorkflow:
    def __init__(self) -> None:
        self._pending: list[ResearchRequest] = []
        self._draining: bool = False

    @workflow.signal
    async def wake_up(self) -> None:
        pass  # we just need the workflow to wake up

After deploying a new version, send the signal to nudge the workflow:

temporal workflow signal \
  --workflow-id research-polling \
  --name wake_up

The signal causes a Workflow Task. The workflow wakes up, checks the wait condition, sees the version change, and migrates.

In production, you could automate this with a post-deploy script that signals all long-running workflows.
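As a sketch, such a script might just shell out to the CLI. The workflow ID list here is hardcoded from this lab; a real version would discover running workflows via temporal workflow list or the SDK client instead.

```python
import subprocess

# Assumption for illustration: a known list of long-running workflow IDs.
LONG_RUNNING_WORKFLOWS = ["research-polling"]


def wake_up_command(workflow_id: str) -> list[str]:
    """Build the `temporal workflow signal` invocation for one workflow."""
    return [
        "temporal", "workflow", "signal",
        "--workflow-id", workflow_id,
        "--name", "wake_up",
    ]


if __name__ == "__main__":
    for wf_id in LONG_RUNNING_WORKFLOWS:
        subprocess.run(wake_up_command(wf_id), check=True)
```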

Another Angle: Replay-Gated AUTO_UPGRADE

There’s a third option I’ve been thinking about. Fair warning: I’m about three weeks into learning Temporal, so take this with a grain of salt. I’m not sure if this is an anti-pattern or a legitimate strategy. But it’s interesting enough to discuss.

The idea: what if your CI/CD pipeline decides the versioning strategy based on whether the code change is replay-compatible?

Temporal provides a Replayer that can download workflow histories from the server and replay them against your new code. If replay succeeds, there are no determinism issues. The new code is safe to run against existing workflow histories. If replay fails, you have a breaking change.

Here’s where it gets interesting. With AUTO_UPGRADE behavior, the server routes Workflow Tasks to the current version’s workers. When a sleeping workflow eventually wakes up (timer fires, signal arrives), its next task gets dispatched to the new version. If the code is replay-compatible, the new worker replays the history successfully and the workflow continues on V2. No continue-as-new needed. No wake-up signal needed. It just works.

So the CI/CD pattern could look like this: keep PINNED as the default on the workflow definition (safest default), but as part of the deploy pipeline, if the replayer passes, batch-update all running workflow executions to AUTO_UPGRADE:

  1. Run the Replayer against live workflow histories with the new code
  2. Replayer passes (no determinism errors) → deploy the new version, then flip running workflows to AUTO_UPGRADE:
temporal workflow update-options \
  --query 'WorkflowType="ResearchPollingWorkflow" AND ExecutionStatus="Running"' \
  --versioning-override-behavior auto_upgrade

Sleeping workflows naturally drain off the old version even without waking up; a sleeping workflow isn’t actually running on a worker, and with the AUTO_UPGRADE override set, Temporal knows it can route that workflow’s next task, and any new workflow tasks, to the new worker immediately. No signals, no continue-as-new.

  3. Replayer fails (breaking change) → deploy with PINNED. Use the continue-as-new migration pattern we covered in the previous section.

One thing to know: this override is sticky. Once you set it on a running execution, it persists for the lifetime of that workflow and takes precedence over whatever behavior the SDK declared. You can clear it later with --versioning-override-behavior unspecified to revert to the original SDK behavior, but you don’t have to. So in practice, your deploy pipeline would need to handle both cases: set the override when the replayer passes, and leave it alone (letting PINNED + continue-as-new do its thing) when it doesn’t.

This gives you the safety of PINNED by default with the convenience of AUTO_UPGRADE when the code is compatible. The old version drains naturally without extra tooling.

Like I said, I’m not sure if this is a blessed pattern or if there are edge cases I’m not seeing. But the pieces are all there in the platform. If you try this, I’d love to hear how it goes.

Scaling and Ramping

Everything so far has been hands-on. This section is conceptual, covering the broader HA story that Worker Versioning enables.

Version Lifecycle

Every version goes through a lifecycle:

| State | What’s Happening |
| --- | --- |
| Active | Currently receiving new workflows and/or ramping traffic |
| Draining | No new work, but existing pinned workflows are still running |
| Drained | All work complete, safe to remove |
The Worker Controller manages this automatically. When a version enters the Drained state, the controller deletes its pods and Kubernetes resources based on the sunset configuration:

sunset:
  scaledownDelay: 5m   # scale down 5 min after draining starts
  deleteDelay: 30m     # fully remove 30 min later

Progressive Rollouts

Instead of deploying all-at-once, you can ramp traffic to a new version gradually. The Worker Controller supports this via the Progressive rollout strategy:

rollout:
  strategy: Progressive
  steps:
    - rampPercentage: 10
      pauseDuration: 5m
    - rampPercentage: 50
      pauseDuration: 10m

This sends 10% of new workflow starts to the new version, waits 5 minutes, bumps to 50%, waits 10 more, then promotes to full current. If something goes wrong at 10%, you can roll back before most of your traffic is affected.
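To make the math concrete, here’s a tiny simulation of what a 10% ramp means for new workflow starts (plain Python; the real routing decision happens server-side):

```python
import random


def route_new_starts(n: int, ramp_percentage: float, seed: int = 42) -> dict:
    """Route n new workflow starts; each lands on the ramping version
    with the configured probability, otherwise on the current version."""
    rng = random.Random(seed)
    counts = {"current": 0, "ramping": 0}
    for _ in range(n):
        target = "ramping" if rng.random() * 100 < ramp_percentage else "current"
        counts[target] += 1
    return counts


counts = route_new_starts(10_000, ramp_percentage=10)
# roughly 10% of new starts land on the ramping version
assert 0.08 < counts["ramping"] / 10_000 < 0.12
print(counts)
```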

You can also manage this manually with the CLI:

# Set a ramping version at 5%
temporal worker-deployment set-ramping-version \
  --deployment-name research-agent \
  --build-id v3 \
  --percentage 5

# Once verified, promote to current
temporal worker-deployment set-current-version \
  --deployment-name research-agent \
  --build-id v3

Per-Version Autoscaling

The Worker Controller can attach separate HPAs to each version’s pods using the WorkerResourceTemplate CRD. This means V1 can scale down as its workload drains while V2 scales up to handle new traffic. The controller can even use Temporal-specific metrics like approximate_backlog_count for smarter scaling decisions.

This is the full HA picture: versioned deployments with independent scaling, progressive rollouts, and automatic cleanup. No more all-or-nothing deploys.

Wrapping Up

Here’s what we covered:

| Concept | Key Takeaway |
| --- | --- |
| Non-determinism errors | Changing workflow code breaks in-flight workflows during replay |
| PINNED versioning | Workflows complete on the version they started on |
| Rainbow deployments | Multiple worker versions run simultaneously on the same task queue |
| Continue-as-new migration | Long-running workflows detect version changes and migrate with AUTO_UPGRADE |
| Sleeping workflow caveat | Idle workflows need a signal or timer to notice version changes |
| Progressive rollouts | Ramp traffic gradually for safer deployments |

The finished code is on the main branch. The starter branch has everything you need to follow along from scratch.

For a real-world example of all these patterns in production, check out Building a Reddit Bot on Temporal where I use worker versioning, continue-as-new migration, and the wake-up signal pattern in a 24/7 bot.

Further Reading