Apache Airflow Flashcards

🛫 What is Apache Airflow?
Apache Airflow is an open-source platform for programmatically authoring, scheduling, and monitoring workflows, which it models as DAGs.

📅 What is a DAG?
A DAG (Directed Acyclic Graph) represents a workflow as a set of tasks linked by dependencies. Because the graph has no cycles, each task runs only after its upstream tasks finish, and independent tasks can run in parallel.
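
To make the "sequence or parallel" idea concrete, here is a minimal plain-Python sketch that does not use Airflow itself. It relies on the standard library's `graphlib` (Python 3.9+), and the task names and the `execution_waves` helper are hypothetical. In a real DAG file the same dependencies would typically be declared between tasks with the `>>` operator.

```python
from graphlib import TopologicalSorter

# Hypothetical task names; each key maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "transform_a": {"extract"},
    "transform_b": {"extract"},
    "load": {"transform_a", "transform_b"},
}

def execution_waves(deps):
    """Group tasks into waves; tasks in the same wave have no mutual dependencies."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph contains a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # every task whose upstreams are done
        waves.append(ready)
        sorter.done(*ready)
    return waves

waves = execution_waves(dependencies)
print(waves)
```

The two transform tasks land in the same wave because neither depends on the other, which is the situation in which a scheduler may run them in parallel.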

⚙️ What are Operators?
Operators are templates that define the type of work a task performs. Examples include `BashOperator`, `PythonOperator`, and `EmailOperator`.

📦 What is a Task?
A Task is a single unit of execution in a DAG. It is created by instantiating an Operator with parameters such as a `task_id`.

🔍 What are Sensors?
Sensors are special operators that wait for a condition to become true (for example, a file arriving) before downstream tasks run.
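
The waiting behaviour can be sketched in plain Python as a polling loop. This is an illustration only, not Airflow's implementation: `wait_for` and `file_has_arrived` are hypothetical names, and the `poke_interval` and `timeout` arguments are named after the sensor parameters of the same name. Unlike this sketch, which returns `False`, a real sensor fails the task by default when it times out.

```python
import time

def wait_for(condition, poke_interval=0.01, timeout=1.0):
    """Poll `condition` until it returns True or `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True   # condition met: downstream work may proceed
        if time.monotonic() >= deadline:
            return False  # gave up waiting
        time.sleep(poke_interval)

# Hypothetical condition that becomes true on the third check.
checks = {"count": 0}

def file_has_arrived():
    checks["count"] += 1
    return checks["count"] >= 3

arrived = wait_for(file_has_arrived)
never = wait_for(lambda: False, poke_interval=0.01, timeout=0.05)
print(arrived, checks["count"], never)
```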

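🗓️ What is a Schedule Interval?
The schedule interval defines how often a DAG runs. It can be a cron expression, a `timedelta`, or a preset such as `@daily` or `@hourly`. In Airflow 2.4 and later it is set with the `schedule` argument, which replaces the deprecated `schedule_interval`.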
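
As a rough illustration of what the presets mean, the sketch below maps some of them to their equivalent cron expressions and computes upcoming run boundaries for `@hourly` and `@daily`. It is plain Python under simplifying assumptions (naive datetimes, only two presets handled), and `next_runs` is a hypothetical helper, not an Airflow function.

```python
from datetime import datetime, timedelta

# Cron equivalents of some common presets.
PRESET_TO_CRON = {
    "@hourly": "0 * * * *",
    "@daily": "0 0 * * *",
    "@weekly": "0 0 * * 0",
    "@monthly": "0 0 1 * *",
}

STEP = {"@hourly": timedelta(hours=1), "@daily": timedelta(days=1)}

def next_runs(preset, after, count=3):
    """Return the next `count` schedule boundaries strictly after `after`."""
    step = STEP[preset]
    if preset == "@hourly":
        start = after.replace(minute=0, second=0, microsecond=0)
    else:  # "@daily"
        start = after.replace(hour=0, minute=0, second=0, microsecond=0)
    first = start + step  # `start` is at or before `after`, so step forward once
    return [first + i * step for i in range(count)]

now = datetime(2024, 1, 1, 15, 30)
daily = next_runs("@daily", now, count=2)
hourly = next_runs("@hourly", now, count=2)
print(daily, hourly)
```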

🖥️ What is the Airflow Web UI?
A web interface for monitoring DAGs and task states, viewing logs, triggering DAG runs, and managing settings such as connections and variables.

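📂 What is a Task Instance?
A Task Instance is a specific run of a task within a particular DAG run. It is tied to that run's logical date (called the execution date before Airflow 2.2) and carries a state such as running, success, or failed.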

🔄 What is Task Retry?
Airflow can retry a failed task a specified number of times, waiting a configurable delay between attempts. Both are set with the `retries` and `retry_delay` parameters.
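
The retry behaviour can be sketched without Airflow as a loop around a callable. `run_with_retries` and `flaky` are hypothetical names; the `retries` and `retry_delay` arguments follow the meaning of the task parameters, so `retries=2` allows up to three attempts in total.

```python
import time
from datetime import timedelta

def run_with_retries(task, retries=2, retry_delay=timedelta(seconds=0)):
    """Run `task`; on failure, retry up to `retries` more times with a pause between."""
    attempts = 0
    while True:
        attempts += 1
        try:
            return task(), attempts
        except Exception:
            if attempts > retries:
                raise  # retries exhausted: surface the last failure
            time.sleep(retry_delay.total_seconds())

# Hypothetical task that fails twice, then succeeds.
calls = {"n": 0}

def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "ok"

result, attempts = run_with_retries(
    flaky, retries=2, retry_delay=timedelta(milliseconds=10)
)
print(result, attempts)
```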

📌 What is XCom?
XCom (short for cross-communication) lets tasks in a DAG share small pieces of data. Values are stored in the metadata database by default, so XCom is not meant for large datasets.
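
Conceptually, XCom behaves like a key-value store scoped to a DAG run and a task. The sketch below is an in-memory stand-in, not Airflow's API: the `InMemoryXCom` class and the run and task identifiers are hypothetical, while the default key `return_value` matches the key Airflow uses for a task's returned value.

```python
class InMemoryXCom:
    """Toy key-value store keyed by (run_id, task_id, key)."""

    def __init__(self):
        self._store = {}

    def push(self, run_id, task_id, value, key="return_value"):
        # Later pushes with the same (run_id, task_id, key) overwrite earlier ones.
        self._store[(run_id, task_id, key)] = value

    def pull(self, run_id, task_id, key="return_value", default=None):
        return self._store.get((run_id, task_id, key), default)

xcom = InMemoryXCom()

# An upstream task publishes a small value...
xcom.push("run_2024_01_01", "extract", {"row_count": 42})

# ...and a downstream task in the same run reads it.
payload = xcom.pull("run_2024_01_01", "extract")
missing = xcom.pull("run_2024_01_02", "extract")
print(payload, missing)
```

Keeping the payload to a small dictionary mirrors the intended use: pass references or summaries between tasks, and keep bulk data in external storage.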

🔐 How does Airflow handle authentication?
Airflow's web UI supports multiple authentication backends, such as LDAP, OAuth, and password-based auth, through Flask AppBuilder.

🧪 How do you test DAGs?
Use the `airflow dags test` command to execute a DAG run locally without the scheduler, or write unit tests that exercise task logic directly.