ZhangZhihui's Blog  

 

"""
Example DAG demonstrating the usage of setup and teardown tasks.
"""

from __future__ import annotations

import pendulum

from airflow.providers.standard.operators.bash import BashOperator
from airflow.sdk import DAG, TaskGroup


with DAG(
    dag_id='example_setup_teardown',
    schedule=None,
    start_date=pendulum.datetime(2021, 1, 1, tz='UTC'),
    catchup=False,
    tags=['example']
) as dag:
    root_setup = BashOperator(task_id='root_setup', bash_command="echo 'Hello from root_setup'").as_setup()
    root_normal = BashOperator(task_id='normal', bash_command="echo 'I am just a normal task'")
    root_teardown = BashOperator(
        task_id='root_teardown',
        bash_command="echo 'Goodbye from root_teardown'"
    ).as_teardown(setups=root_setup)

    root_setup >> root_normal >> root_teardown

    with TaskGroup('section_1') as section_1:
        inner_setup = BashOperator(
            task_id='taskgroup_setup',
            bash_command="echo 'Hello from taskgroup_setup'"
        ).as_setup()
        inner_normal = BashOperator(task_id='normal', bash_command="echo 'I am just a normal task'")
        inner_teardown = BashOperator(
            task_id='taskgroup_teardown',
            bash_command="echo 'Hello from taskgroup_teardown'"
        ).as_teardown(setups=inner_setup)
        inner_setup >> inner_normal >> inner_teardown
    
    root_normal >> section_1

 

Your code uses Airflow’s setup and teardown tasks, a feature introduced in Airflow 2.7 that makes resource-management patterns a first-class part of a DAG.


✅ 1. What are Setup and Teardown tasks?

In Airflow, setup tasks run before dependent tasks to prepare resources.
Teardown tasks run after dependent tasks to clean up resources.

Examples:

  • Setup: create temp directory, open connection, prepare environment

  • Normal tasks: the main DAG work

  • Teardown: delete temp files, close connection, notify shutdown

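Importantly:

✔ Teardown tasks run even if the work tasks fail

A teardown runs once its upstream tasks have finished, whether they succeeded or failed, provided its setup task succeeded. If the setup failed or was skipped, there is nothing to clean up and the teardown does not run.

✔ Setup and teardown create an explicit lifecycle

Airflow enforces the ordering and the failure handling even as the DAG grows more complex.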
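As a loose analogy in plain Python (this is not Airflow code), the setup, work, teardown lifecycle resembles a try/finally block: cleanup is attempted whether or not the work succeeds, but only once the setup itself has succeeded. The names run_with_lifecycle, failing_work and seen are hypothetical and exist only for this sketch.

```python
import os
import shutil
import tempfile


def run_with_lifecycle(work):
    """Plain-Python analogy for setup -> work -> teardown (not Airflow code)."""
    # "Setup": if this raises, there is nothing to clean up and the
    # "teardown" in the finally block is never reached.
    workdir = tempfile.mkdtemp(prefix="lifecycle_demo_")
    try:
        # "Normal task": may succeed or raise.
        return work(workdir)
    finally:
        # "Teardown": runs whether the work succeeded or failed.
        shutil.rmtree(workdir, ignore_errors=True)


seen = []


def failing_work(workdir):
    # Remember the path so the caller can check that it was cleaned up.
    seen.append(workdir)
    raise RuntimeError("work failed")


try:
    run_with_lifecycle(failing_work)
except RuntimeError:
    pass

print(os.path.exists(seen[0]))  # False: the directory was removed
```

The analogy only goes so far: in Airflow the three steps are separate tasks that the scheduler coordinates, possibly on different workers, which is why the relationship has to be declared on the tasks rather than expressed as a code block.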


🔧 2. What .as_setup() and .as_teardown() do

.as_setup()

Marks a task as a setup task.
Airflow can then pair it with a teardown task and treat the tasks between the two as the setup's scope.

.as_teardown(setups=...)

Marks the task as a teardown task and links it to its setup task(s). Each listed setup also becomes a direct upstream of the teardown.

It means:

  • Teardown runs after the tasks placed between the setup and the teardown have finished

  • Teardown runs even if those tasks fail, as long as the setup succeeded

  • Teardown does not run if the setup failed or was skipped
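The rule behind a teardown can be sketched as a small decision function. This is a simplified model, not Airflow's implementation: it assumes the documented behavior that a teardown needs all upstream tasks to be done and at least one directly connected setup to have succeeded. The function name teardown_decision and the string states are hypothetical, chosen for this sketch.

```python
DONE_STATES = {"success", "failed", "skipped", "upstream_failed"}


def teardown_decision(setup_states, other_upstream_states):
    """Simplified model of when a teardown task is allowed to run.

    Assumption: "all upstream tasks are done and at least one directly
    connected setup succeeded". States are plain strings chosen for this
    sketch; None means "not finished yet".
    """
    all_states = list(setup_states) + list(other_upstream_states)
    if any(state not in DONE_STATES for state in all_states):
        return "wait"  # something upstream is still pending or running
    if any(state == "success" for state in setup_states):
        return "run"  # the resource exists, so clean it up
    return "do_not_run"  # no setup succeeded, so there is nothing to clean up


# Setup succeeded, work task failed: teardown still runs.
print(teardown_decision(["success"], ["failed"]))  # run
# Setup failed, so the work task never ran: teardown does not run.
print(teardown_decision(["failed"], ["upstream_failed"]))  # do_not_run
# Work task still running: teardown waits.
print(teardown_decision(["success"], [None]))  # wait
```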


🗂️ 3. Walkthrough of Your DAG

Let’s go piece by piece.


🌳 4. Root-level setup, normal, teardown

root_setup = BashOperator(...).as_setup()
root_normal = BashOperator(...)
root_teardown = BashOperator(...).as_teardown(setups=root_setup)

root_setup >> root_normal >> root_teardown

✔ Execution flow

  1. root_setup runs first
    Output: “Hello from root_setup”

  2. root_normal runs
    Output: “I am just a normal task”

  3. root_teardown runs
    Output: “Goodbye from root_teardown”

✔ Behavior guarantees:

  • root_teardown runs even if root_normal fails, as long as root_setup succeeded

  • It runs only after:

    • root_setup has succeeded

    • All tasks between root_setup and root_teardown (here: root_normal) have finished

So Airflow guarantees the order:

root_setup → root_normal → root_teardown

If root_setup itself fails, neither root_normal nor root_teardown runs.
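The three possible outcomes of the root-level chain can be walked through with a toy simulation. It is only a sketch under stated assumptions: tasks run one after another, a normal task needs its upstream task to succeed, and the teardown needs its setup to have succeeded. The function simulate_root_chain and the state label "not_run" are hypothetical and are not Airflow's own state names.

```python
def simulate_root_chain(setup_ok=True, normal_ok=True):
    """Toy simulation of root_setup >> root_normal >> root_teardown."""
    states = {}
    states["root_setup"] = "success" if setup_ok else "failed"

    # A normal task only runs when its upstream task succeeded.
    if states["root_setup"] != "success":
        states["root_normal"] = "upstream_failed"
    else:
        states["root_normal"] = "success" if normal_ok else "failed"

    # Both upstream tasks are finished at this point, so only the
    # result of the setup decides whether the teardown runs.
    if states["root_setup"] == "success":
        states["root_teardown"] = "success"
    else:
        states["root_teardown"] = "not_run"
    return states


for scenario in ({}, {"normal_ok": False}, {"setup_ok": False}):
    print(scenario, simulate_root_chain(**scenario))
```

The middle scenario is the interesting one: root_normal fails, yet root_teardown still runs because root_setup succeeded.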


📦 5. TaskGroup setup, normal, teardown

Inside TaskGroup:

inner_setup = BashOperator(...).as_setup()
inner_normal = BashOperator(...)
inner_teardown = BashOperator(...).as_teardown(setups=inner_setup)

inner_setup >> inner_normal >> inner_teardown

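This is the same lifecycle but scoped inside the group. Task IDs inside a TaskGroup are prefixed with the group ID by default, so the inner task is section_1.normal and does not clash with the root-level task that is also named normal.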

✔ Life cycle inside TaskGroup

taskgroup_setup → normal → taskgroup_teardown

✔ The group's setup/teardown pair is independent of the root-level pair

taskgroup_teardown is linked only to taskgroup_setup, and root_teardown only to root_setup.


🔗 6. Cross-level dependency

root_normal >> section_1

This means:

  • The first task in section_1 (taskgroup_setup) starts only after root_normal succeeds, so the whole group runs after root_normal.

So overall order becomes:

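root_setup → root_normal → root_teardown
root_setup → root_normal → section_1 (taskgroup_setup → normal → taskgroup_teardown)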

Important subtlety:

root_teardown does NOT wait for taskgroup_teardown

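root_teardown is downstream only of root_setup and root_normal; nothing in section_1 is upstream of it. Once root_normal finishes, root_teardown and section_1 can run in parallel.

This is intentional:
Setup/teardown relationships are scoped and isolated. If section_1 needed the resource created by root_setup, you would also declare section_1 >> root_teardown.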
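To check which tasks root_teardown actually waits for, the dependency edges of this DAG can be written out as a plain dictionary and walked. This is a sketch in standard-library Python, not Airflow code; the upstream mapping is transcribed by hand from the DAG above (including the setup-to-teardown edge added by as_teardown(setups=...)), and the helper ancestors is hypothetical.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks directly upstream of it.
upstream = {
    "root_setup": set(),
    "normal": {"root_setup"},
    "root_teardown": {"normal", "root_setup"},
    "section_1.taskgroup_setup": {"normal"},
    "section_1.normal": {"section_1.taskgroup_setup"},
    "section_1.taskgroup_teardown": {
        "section_1.normal",
        "section_1.taskgroup_setup",
    },
}


def ancestors(task):
    """Return every task that must finish before `task` can start."""
    found = set()
    stack = list(upstream[task])
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(upstream[current])
    return found


print(sorted(ancestors("root_teardown")))
# ['normal', 'root_setup']
print(sorted(ancestors("section_1.taskgroup_teardown")))
# ['normal', 'root_setup', 'section_1.normal', 'section_1.taskgroup_setup']

# One valid execution order. root_teardown and section_1 are unordered
# relative to each other, so other valid orders exist.
order = list(TopologicalSorter(upstream).static_order())
print(order)
```

No section_1 task appears among the ancestors of root_teardown, which is why it does not wait for the group.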


📌 7. Summary of Setup/Teardown Behavior in Your DAG

Root setup/teardown pair:

  • root_setup prepares something

  • root_teardown cleans up that same thing

  • It runs after root_normal, the only task between root_setup and root_teardown

TaskGroup setup/teardown pair:

  • taskgroup_setup prepares something for the group

  • taskgroup_teardown cleans up

  • It runs even if inner_normal fails

Independent life cycles:

  • Root setup/teardown does not control the TaskGroup

  • TaskGroup setup/teardown does not affect root teardown


🤓 8. Why use setup & teardown?

They make DAGs safer and more reliable:

  • Cleanup runs whenever the setup succeeded

  • Cleanup runs even if a work task in between fails

  • DAGs become easier to reason about

  • Teardown tasks never accidentally run too early

  • Lifecycle boundaries are explicit and enforced

This is especially useful in:

  • Managing temporary clusters / VMs

  • Allocating / releasing cloud resources

  • Creating / deleting temp directories or batch environments

  • Opening / closing pools or exclusive resources

 

posted on 2025-12-12 16:31  ZhangZhihuiAAA