ZhangZhihui's Blog  

 

"""
Example DAG demonstrating a workflow with nested branching. The join tasks are created with
``none_failed_min_one_success`` trigger rule such that they are skipped whenever their corresponding
branching tasks are skipped.
"""

from __future__ import annotations

import pendulum

from airflow.providers.standard.operators.empty import EmptyOperator
from airflow.sdk import DAG, TriggerRule, task

with DAG(
    dag_id="example_nested_branch_dag",
    start_date=pendulum.datetime(2021, 1, 1, tz="UTC"),
    catchup=False,
    schedule="@daily",
    tags=["example"],
) as dag:

    @task.branch()
    def branch(task_id_to_return: str) -> str:
        return task_id_to_return

    branch_1 = branch.override(task_id="branch_1")(task_id_to_return="true_1")
    join_1 = EmptyOperator(task_id="join_1", trigger_rule=TriggerRule.NONE_FAILED_MIN_ONE_SUCCESS)
    true_1 = EmptyOperator(task_id="true_1")
    false_1 = EmptyOperator(task_id="false_1")

    branch_2 = branch.override(task_id="branch_2")(task_id_to_return="true_2")
    join_2 = EmptyOperator(task_id="join_2", trigger_rule=TriggerRule.NONE_FAILED_MIN_ONE_SUCCESS)
    true_2 = EmptyOperator(task_id="true_2")
    false_2 = EmptyOperator(task_id="false_2")
    false_3 = EmptyOperator(task_id="false_3")

    branch_1 >> true_1 >> join_1
    branch_1 >> false_1 >> branch_2 >> [true_2, false_2] >> join_2 >> false_3 >> join_1
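To see why the docstring says the join tasks are skipped along with their branches, here is a toy pure-Python walk over the same task graph. It is a sketch, not Airflow's scheduler. UPSTREAM, BRANCH_CHOICE, JOINS and simulate() are hypothetical names made up for illustration, and only the "success" and "skipped" states are modeled (no failures).

```python
from __future__ import annotations

# Toy model only: a simplification of the DAG above, not Airflow's scheduler.
# Keys are listed in topological order; values are the direct upstream tasks.
UPSTREAM = {
    "branch_1": [],
    "true_1": ["branch_1"],
    "false_1": ["branch_1"],
    "branch_2": ["false_1"],
    "true_2": ["branch_2"],
    "false_2": ["branch_2"],
    "join_2": ["true_2", "false_2"],
    "false_3": ["join_2"],
    "join_1": ["true_1", "false_3"],
}
# The task_id each branch task returns in the example DAG.
BRANCH_CHOICE = {"branch_1": "true_1", "branch_2": "true_2"}
# Tasks that use the none_failed_min_one_success trigger rule.
JOINS = {"join_1", "join_2"}


def simulate() -> dict[str, str]:
    state: dict[str, str] = {}
    for task_id, parents in UPSTREAM.items():
        parent_states = [state[p] for p in parents]
        # A branch task that actually ran skips every direct child it did not choose.
        not_chosen = any(
            p in BRANCH_CHOICE and state[p] == "success" and BRANCH_CHOICE[p] != task_id
            for p in parents
        )
        if not_chosen:
            state[task_id] = "skipped"
        elif task_id in JOINS:
            # No failures exist in this toy, so only "at least one success" matters.
            state[task_id] = "success" if "success" in parent_states else "skipped"
        else:
            # Default rule (all_success): any skipped parent skips this task too.
            all_ok = all(s == "success" for s in parent_states)
            state[task_id] = "success" if all_ok else "skipped"
    return state


result = simulate()
for task_id, task_state in result.items():
    print(f"{task_id}: {task_state}")
```

In this model branch_1 picks true_1, so false_1 and everything behind it (branch_2, true_2, false_2, join_2, false_3) end up skipped, while join_1 still runs because true_1 succeeded and nothing failed.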

 


 

1️⃣ What override() does

  • branch is a TaskFlow task (Python function decorated with @task.branch()).

  • .override() is a method on TaskFlow tasks that lets you override some parameters of the task without changing the original function.

  • Common things you can override:

    • task_id (unique identifier of the task in the DAG)

    • retries

    • retry_delay

    • execution_timeout

    • doc_md, etc.

Essentially, it returns a new copy of the decorated task with the same Python callable but updated configuration. The original branch is left unchanged, so it can be reused with different settings.
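The "same callable, updated configuration" idea can be sketched without Airflow. The class below is a hypothetical stand-in written for this post, not Airflow's actual implementation. ToyTaskDecorator and its attributes are assumed names, and the dict it returns only mimics a task definition.

```python
from __future__ import annotations

from typing import Any, Callable


class ToyTaskDecorator:
    """Hypothetical stand-in for a TaskFlow-decorated function (not Airflow code)."""

    def __init__(self, func: Callable[..., Any], **kwargs: Any) -> None:
        self.func = func
        # The task_id defaults to the function name unless one is given.
        self.kwargs: dict[str, Any] = {"task_id": func.__name__, **kwargs}

    def override(self, **kwargs: Any) -> ToyTaskDecorator:
        # Return a NEW object: same callable, merged configuration.
        # The object override() was called on is not modified.
        return ToyTaskDecorator(self.func, **{**self.kwargs, **kwargs})

    def __call__(self, **op_kwargs: Any) -> dict[str, Any]:
        # Calling the decorator describes one task: its settings plus its arguments.
        return {**self.kwargs, "op_kwargs": op_kwargs}


@ToyTaskDecorator
def branch(task_id_to_return: str) -> str:
    return task_id_to_return


# Two steps, as in the DAG: configure with override(), then call with arguments.
branch_1 = branch.override(task_id="branch_1", retries=2)(task_id_to_return="true_1")

print(branch.kwargs)  # the original keeps its default configuration
print(branch_1)       # the overridden copy carries the new task_id and retries
```

Keeping override() non-mutating is what allows one function to back several differently named tasks, such as branch_1 and branch_2 in the DAG above.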


2️⃣ Why it's used here

branch.override(task_id='branch_1')
  • The original function is branch(task_id_to_return: str).

  • We want to instantiate it as a DAG task with a specific task_id called 'branch_1'.

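  • Without override(), the task_id defaults to the function name (here, 'branch'). When the same function is instantiated more than once in a DAG, Airflow keeps the IDs unique by appending a numeric suffix (for example 'branch__1'), which makes the tasks harder to tell apart than explicit names such as 'branch_1' and 'branch_2'.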


3️⃣ Putting it together

branch_1 = branch.override(task_id='branch_1')(task_id_to_return='true_1')

Step by step:

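  1. branch.override(task_id='branch_1') → returns a copy of the decorated task configured with task_id='branch_1'. No task has been added to the DAG yet.

  2. (task_id_to_return='true_1') → calls that copy with the argument 'true_1'. Inside the with DAG(...) block, this creates the actual task, registers it in the DAG, and returns an XComArg reference to it. It does not return a TaskInstance; task instances only exist once a DAG run executes the task.

So branch_1 refers to a task in the DAG with a custom task_id, not to the result of a plain Python function call.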


Analogy

Think of override() as:

“I like this function, but for this DAG task, I want it to have a different name or different execution settings.”

  • Original function: branch

  • DAG task: branch_1 (with overridden task_id)

 

posted on 2025-12-11 20:28  ZhangZhihuiAAA