In this blog, we will learn about the Airflow BashOperator. We will understand the Airflow BashOperator with several examples. So let's get started:
What is the BashOperator in Airflow?
The Airflow BashOperator is a basic operator in Apache Airflow that allows you to execute a Bash command or shell script within an Airflow DAG. The BashOperator is very simple and can run various shell commands, scripts, and other commands.
Import the BashOperator in Airflow
The BashOperator in Airflow can be imported with the statement below:
from airflow.operators.bash import BashOperator
The BashOperator class lives in the airflow.operators.bash module, which is part of the Airflow core package. Once you have imported the operator, you can create an instance of BashOperator and use it in your DAG.
Airflow BashOperator arguments
The BashOperator in Airflow supports a number of arguments that you can use for customization. Some of the most commonly used arguments are:
- task_id: (Required) The unique ID of the task.
- bash_command: (Required) The shell command or script to execute. The value is templated with Jinja, and a string ending in .sh is treated as the path to a script file to be templated.
- dag: The DAG to which the task belongs. This can also be set implicitly by creating the task inside a `with DAG(...)` block.
- env: A dictionary of environment variables to set when running the shell command. The default is None, in which case the command inherits the environment of the Airflow worker process.
- output_encoding: The character encoding to use when decoding the output of the shell command. The default is 'utf-8'.
- cwd: The working directory in which to run the command. By default the command runs in a temporary directory.
- do_xcom_push: Inherited from BaseOperator. When True (the default), the last line written to stdout is pushed to the Airflow metadata database as an XCom value for use by other tasks.
Below is a simple example of the Airflow BashOperator:

```python
from airflow.operators.bash import BashOperator

test_bash_task = BashOperator(
    task_id='test_task',
    bash_command='echo "Hello World"',
    dag=my_dag
)
```
Airflow BashOperator example
In this section, we will understand the Airflow BashOperator with several examples.
Airflow BashOperator to run a shell command
Below is an example of a simple BashOperator in an Airflow DAG to execute a bash command:

```python
from airflow import DAG
from airflow.operators.bash import BashOperator
from datetime import datetime

test_dag = DAG('test_bash_dag', start_date=datetime(2023, 2, 15))

bash_task = BashOperator(
    task_id='bash_task',
    bash_command='echo "Hello world"',
    dag=test_dag
)
```
The above code is a simple DAG definition using Airflow’s BashOperator to execute a bash command. The DAG is named “test_bash_dag” and is scheduled to start on February 15th, 2023. The DAG has only one task, which is the “bash_task”.
The task executes a bash command using the BashOperator. The bash command simply echoes the string “Hello world” to the console.
Go to the Graph view in the Airflow UI and check the task logs to see the output.
Airflow BashOperator to run a shell script
A user can also run a shell script using the Airflow BashOperator. Let's take the below example:
```python
from airflow import DAG
from airflow.operators.bash import BashOperator
from datetime import datetime, timedelta

default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    'start_date': datetime(2023, 2, 15),
    'retries': 0,
}

test_dag = DAG(
    'test_bash_script_dag',
    default_args=default_args,
    schedule_interval=timedelta(days=1)
)

# Define the BashOperator task.
# Note the trailing space after the script path: without it, Airflow treats a
# bash_command ending in .sh as a Jinja template file and raises
# jinja2.exceptions.TemplateNotFound.
bash_task = BashOperator(
    task_id='bash_task_execute_script',
    bash_command='./test_script.sh ',
    dag=test_dag
)
```
Create a file called test_script.sh and paste the lines below:

```shell
#!/bin/bash
# print hello world message
echo "Hello World"
```
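The script also needs execute permission, or the task will fail with a "Permission denied" error. A quick way to check it locally (recreating the file here so the snippet is self-contained):

```shell
# Create the script, grant execute permission, and run it once by hand
cat > test_script.sh <<'EOF'
#!/bin/bash
# print hello world message
echo "Hello World"
EOF
chmod +x test_script.sh
./test_script.sh   # prints: Hello World
```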
Now run the DAG and observe the output
The task within the DAG is a BashOperator that executes a shell script named “test_script.sh” using the bash_command parameter. The task is named “bash_task_execute_script”.
Airflow BashOperator with multiple shell commands
You can use the Airflow BashOperator to execute multiple shell commands by simply passing a multiline string as the value of the bash_command parameter. Let’s take the below example:
```python
from airflow import DAG
from airflow.operators.bash import BashOperator
from datetime import datetime, timedelta

default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    'start_date': datetime(2023, 2, 15),
    'retries': 0,
}

test_dag = DAG(
    'test_Multiple_bash_dag',
    default_args=default_args,
    schedule_interval=timedelta(days=1)
)

# Define the BashOperator task with multiple shell commands
bash_task = BashOperator(
    task_id='bash_task_multiple_execute',
    bash_command="""
    echo "Start";
    ./test_script.sh;
    echo "completed"
    """,
    dag=test_dag
)
```
The task within the DAG is a BashOperator that executes multiple shell commands using the bash_command parameter. The task is named “bash_task_multiple_execute”. The bash_command parameter includes three commands:
- The first command echoes the string “Start”
- The second command executes a shell script named “test_script.sh”
- The third command echoes the string “completed”
This allows you to chain multiple shell commands within a single task, which can be useful when you have a series of commands that need to be executed in a specific order.
Airflow BashOperator exit code
The BashOperator in Airflow determines the task outcome from the exit code of the shell command it executes. If the command succeeds, the exit code is 0 and the task is marked successful. If the command fails, the exit code is a non-zero integer (typically 1), and the task is marked as failed.
You can use the exit code to determine whether the command executed successfully or not. The task logs include the exact exit code and the command's output, which helps with debugging failures.
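You can see this behaviour outside Airflow with plain bash, where `$?` holds the exit code of the last command:

```shell
bash -c 'exit 0'
echo "exit code of last command: $?"      # prints: exit code of last command: 0

rc=0
bash -c 'exit 1' || rc=$?
echo "exit code of failing command: $rc"  # prints: exit code of failing command: 1
```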
Conclusion
In conclusion, the BashOperator is a powerful operator in Airflow that allows you to execute bash commands or shell scripts within your DAGs. The BashOperator is very easy to use, and you can pass any bash command or script using the bash_command parameter. You can also set the task dependencies in a DAG to determine the order of execution of the tasks.
Overall, the BashOperator is a valuable tool in Airflow for automating a wide range of tasks in a DAG, and it’s a great way to integrate shell scripts into your workflow.
More to Explore
How to send email from Airflow
How to integrate Airflow with Slack