Airflow max_active_runs

Introduction to Airflow max_active_runs parameter for efficient task scheduling

One of the critical aspects of Airflow is efficient task scheduling, and this is where the max_active_runs parameter comes into play. It limits the number of runs of a given DAG that can be active at the same time, which keeps a single DAG from flooding the scheduler and workers with concurrent runs.

In this blog post, we will explore the max_active_runs parameter and how to use it to optimize the scheduling of your workflows. We will discuss how to set max_active_runs on an individual DAG and how to set its default for all DAGs in the Airflow configuration file.

What is airflow max_active_runs?

In Airflow, max_active_runs is the maximum number of DAG runs that a single DAG can have active at any one time. It limits DAG runs, not individual tasks; task-level concurrency is governed by separate settings. It can be set as an argument of the DAG constructor, and its default for all DAGs can be set in the Airflow configuration file through the max_active_runs_per_dag option.

The max_active_runs parameter determines the maximum number of runs a DAG can have active at the same time. This means that Airflow won't start a new run of that DAG if the maximum number of active runs has already been reached. In such a scenario, Airflow waits until one of the active runs completes before starting the next one.
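The waiting behaviour described above can be pictured with a small toy model. This is only a sketch in plain Python and not Airflow's actual scheduler code; the function name runs_to_start and its parameters are made up for illustration.

```python
def runs_to_start(active_runs: int, queued_runs: int, max_active_runs: int) -> int:
    """Toy model: how many waiting runs of one DAG may start right now."""
    # Free slots never go below zero, even if the limit was lowered
    # while more runs were already active.
    free_slots = max(max_active_runs - active_runs, 0)
    return min(free_slots, queued_runs)


# Limit of 3 and 3 runs already active: nothing new starts.
print(runs_to_start(active_runs=3, queued_runs=2, max_active_runs=3))  # 0

# One run has finished, so exactly one waiting run can start.
print(runs_to_start(active_runs=2, queued_runs=2, max_active_runs=3))  # 1
```

The point of the model is that the limit is checked per DAG: a new run starts only when the count of active runs for that DAG drops below max_active_runs.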

It is important to set the max_active_runs parameter appropriately to avoid overloading the system. Setting a high value can lead to resource contention, while setting a low value can result in longer wait times for DAG runs to start. The appropriate value for max_active_runs depends on the available resources and the complexity of the tasks in the DAGs.
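To get a rough feel for the "low value means longer waits" side of this trade-off, here is a back-of-the-envelope estimate. It is a simplified sketch that assumes every run takes the same time and that the workers can actually handle max_active_runs runs in parallel; the function name and the numbers are hypothetical.

```python
import math


def estimated_backlog_minutes(pending_runs: int, max_active_runs: int,
                              minutes_per_run: int) -> int:
    """Rough time to clear a backlog of equal-length runs of one DAG."""
    if max_active_runs < 1:
        raise ValueError("max_active_runs must be at least 1")
    # Runs are processed in "waves" of at most max_active_runs at a time.
    waves = math.ceil(pending_runs / max_active_runs)
    return waves * minutes_per_run


# 30 pending runs, 10 minutes each, under different limits.
for limit in (1, 3, 10):
    print(limit, estimated_backlog_minutes(30, limit, 10))
# 1 300
# 3 100
# 10 30
```

The estimate ignores the other side of the trade-off: with a higher limit, each run may slow down if the runs compete for the same workers, database, or external systems.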

Syntax of airflow max_active_runs parameter

In Apache Airflow, max_active_runs is an argument of the DAG constructor. It is not a task argument, so placing it in the default_args dictionary (which is passed on to the operators) has no effect on the number of DAG runs. Below is the syntax for setting max_active_runs:

from datetime import datetime
from airflow import DAG

# default_args holds task-level defaults only
default_args = {
    'owner': 'Naiveskill',
    'depends_on_past': False,
}

my_dag = DAG(
    'my_dag',
    default_args=default_args,
    start_date=datetime(2023, 2, 1),
    max_active_runs=3,  # allow at most 3 active runs of this DAG
)

In the above example, max_active_runs is set to 3, which means that at most 3 runs of this DAG can be active at the same time.

A default for all DAGs can also be set in the Airflow configuration file (airflow.cfg) through the max_active_runs_per_dag option in the [core] section:

[core]
max_active_runs_per_dag = 3

This sets the maximum number of active runs per DAG to 3 for every DAG in the Airflow instance that does not define its own value. If max_active_runs is also set on a DAG, the DAG-specific setting takes precedence over the global default.
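The precedence rule can be sketched in plain Python. In Airflow's configuration the global default is the max_active_runs_per_dag option under [core]; the snippet below only mimics the lookup with the standard-library configparser and is not how Airflow itself reads its configuration. The helper name effective_max_active_runs is hypothetical.

```python
import configparser
from typing import Optional

# A fragment in the same format as airflow.cfg (illustrative value).
AIRFLOW_CFG_FRAGMENT = """
[core]
max_active_runs_per_dag = 3
"""


def effective_max_active_runs(dag_level_value: Optional[int], cfg_text: str) -> int:
    """Return the DAG-level value if given, else the global default."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    global_default = parser.getint("core", "max_active_runs_per_dag")
    return dag_level_value if dag_level_value is not None else global_default


# DAG does not set max_active_runs: the global default applies.
print(effective_max_active_runs(None, AIRFLOW_CFG_FRAGMENT))  # 3

# DAG sets max_active_runs=1: the DAG-level value wins.
print(effective_max_active_runs(1, AIRFLOW_CFG_FRAGMENT))  # 1
```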

Airflow max_active_runs example

Below is an example of how to set the max_active_runs parameter in an Airflow DAG:

from datetime import datetime
from airflow import DAG
# Airflow 2.x import path for BashOperator
from airflow.operators.bash import BashOperator

# task-level defaults applied to every operator in the DAG
default_args = {
    'owner': 'Naiveskill',
    'depends_on_past': False,
}

my_dag = DAG(
    'example_dag',
    default_args=default_args,
    start_date=datetime(2023, 2, 15),
    max_active_runs=3,  # allow at most 3 active runs of this DAG
)

task_1 = BashOperator(
    task_id='task_1',
    bash_command='echo "Task 1"',
    dag=my_dag,
)

task_2 = BashOperator(
    task_id='task_2',
    bash_command='echo "Task 2"',
    dag=my_dag,
)

task_3 = BashOperator(
    task_id='task_3',
    bash_command='echo "Task 3"',
    dag=my_dag,
)

task_4 = BashOperator(
    task_id='task_4',
    bash_command='echo "Task 4"',
    dag=my_dag,
)

# define task dependencies
task_1 >> task_2 >> task_3 >> task_4

This DAG consists of four BashOperator tasks that run in sequence, each printing a message with the echo command. The default_args dictionary sets the owner and depends_on_past for the tasks, while start_date and max_active_runs are passed to the DAG itself. Setting max_active_runs to 3 limits the number of concurrent runs of this DAG to 3.
