Dagster Flashcards | Uplatz Blog

🧩 Dagster Flashcards

Data orchestrator for building, scheduling, and observing data assets

An open-source data orchestrator focused on assets, reliability, and developer productivity for data platforms.

Declare data assets as first-class objects with code-defined dependencies, metadata, and lineage.

Ops are reusable computation units. Jobs assemble ops/assets into executable graphs.

Inject external systems (DBs, APIs, warehouses) as typed resources for clean, testable I/O.

Partition assets by time or keys, run selective backfills, and track progress with granular observability.

I/O managers provide pluggable storage for passing data between ops/assets: they persist outputs such as dataframes to files, object stores, or tables and load them back as inputs.

Schedules trigger jobs on cron expressions; sensors launch runs in response to external events (new files, table updates, custom signals).

Typed inputs/outputs, schema-validated run config, and rich metadata for runtime safety and clarity.

Asset materializations, checks, run logs, and lineage views in the UI for debugging and governance.

Visualize graphs, kick off runs, watch logs, inspect partitions, and manage schedules/sensors from the web UI.

Works with dbt, Spark, Pandas/Polars, Snowflake, BigQuery, Redshift, Airbyte/Fivetran, Kafka, and more.

Run locally, on Kubernetes, or on Dagster+ (formerly Dagster Cloud) as a fully managed or hybrid deployment. Code locations keep deployments CI/CD-friendly.