{"id":4749,"date":"2025-08-23T15:41:41","date_gmt":"2025-08-23T15:41:41","guid":{"rendered":"https:\/\/uplatz.com\/blog\/?p=4749"},"modified":"2025-08-27T02:54:10","modified_gmt":"2025-08-27T02:54:10","slug":"databricks-pocket-book","status":"publish","type":"post","link":"https:\/\/uplatz.com\/blog\/databricks-pocket-book\/","title":{"rendered":"Databricks Pocket Book"},"content":{"rendered":"<p><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/uplatz.com\/blog\/wp-content\/uploads\/2025\/08\/15-1024x576.png\" alt=\"Databricks Pocket Book\" width=\"840\" height=\"473\" class=\"alignnone size-large wp-image-4847\" srcset=\"https:\/\/uplatz.com\/blog\/wp-content\/uploads\/2025\/08\/15-1024x576.png 1024w, https:\/\/uplatz.com\/blog\/wp-content\/uploads\/2025\/08\/15-300x169.png 300w, https:\/\/uplatz.com\/blog\/wp-content\/uploads\/2025\/08\/15-768x432.png 768w, https:\/\/uplatz.com\/blog\/wp-content\/uploads\/2025\/08\/15.png 1280w\" sizes=\"auto, (max-width: 840px) 100vw, 840px\" \/><br \/>\n<!-- Databricks Pocket Book \u2014 Uplatz (50 Cards, Single-Column Layout, Readable Code) --><\/p>\n<div style=\"margin: 16px 0;\">\n<style>\n    .wp-nodejs-pb { font-family: Arial, sans-serif; max-width: 1320px; margin:0 auto; }\n    .wp-nodejs-pb .heading{\n      background: linear-gradient(135deg, #e0f2fe, #ccfbf1);\n      color:#0f172a; padding:22px 24px; border-radius:14px;\n      text-align:center; margin-bottom:18px; box-shadow:0 8px 20px rgba(0,0,0,.08);\n      border:1px solid #cbd5e1;\n    }\n    .wp-nodejs-pb .heading h2{ margin:0; font-size:2.1rem; letter-spacing:.2px; }\n    .wp-nodejs-pb .heading p{ margin:6px 0 0; font-size:1.02rem; opacity:.9; }<\/p>\n<p>    \/* Single-column grid *\/\n    .wp-nodejs-pb .grid{\n      display:grid; gap:14px;\n      grid-template-columns: 1fr !important;\n    }<\/p>\n<p>    .wp-nodejs-pb .section-title{\n      grid-column:1\/-1; background:#f8fafc; border-left:8px solid #0ea5e9;\n      padding:12px 16px; border-radius:10px; 
font-weight:700; color:#0f172a; font-size:1.08rem;\n      box-shadow:0 2px 8px rgba(0,0,0,.05); border:1px solid #e2e8f0;\n    }\n    .wp-nodejs-pb .card{\n      background:#ffffff; border-left:6px solid #0ea5e9;\n      padding:18px; border-radius:12px;\n      box-shadow:0 6px 14px rgba(0,0,0,.06);\n      transition:transform .12s ease, box-shadow .12s ease;\n      border:1px solid #e5e7eb;\n    }\n    .wp-nodejs-pb .card:hover{ transform: translateY(-3px); box-shadow:0 10px 22px rgba(0,0,0,.08); }\n    .wp-nodejs-pb .card h3{ margin:0 0 10px; font-size:1.12rem; color:#0f172a; }\n    .wp-nodejs-pb .card p{ margin:0; font-size:.96rem; color:#334155; line-height:1.62; }<\/p>\n<p>    \/* Color helpers *\/\n    .bg-blue { border-left-color:#0ea5e9 !important; background:#f0f9ff !important; }\n    .bg-green{ border-left-color:#10b981 !important; background:#f0fdf4 !important; }\n    .bg-amber{ border-left-color:#f59e0b !important; background:#fffbeb !important; }\n    .bg-violet{ border-left-color:#8b5cf6 !important; background:#f5f3ff !important; }\n    .bg-rose{ border-left-color:#ef4444 !important; background:#fff1f2 !important; }\n    .bg-cyan{ border-left-color:#06b6d4 !important; background:#ecfeff !important; }\n    .bg-lime{ border-left-color:#16a34a !important; background:#f0fdf4 !important; }\n    .bg-orange{ border-left-color:#f97316 !important; background:#fff7ed !important; }\n    .bg-indigo{ border-left-color:#6366f1 !important; background:#eef2ff !important; }\n    .bg-emerald{ border-left-color:#22c55e !important; background:#ecfdf5 !important; }\n    .bg-slate{ border-left-color:#334155 !important; background:#f8fafc !important; }<\/p>\n<p>    \/* Utilities *\/\n    .tight ul{ margin:0; padding-left:18px; }\n    .tight li{ margin:4px 0; }\n    .mono{ font-family: ui-monospace, SFMono-Regular, Menlo, Monaco, Consolas, monospace; }\n    .kbd{ background:#e5e7eb; border:1px solid #cbd5e1; padding:1px 6px; border-radius:6px; 
font-family:ui-monospace,monospace; font-size:.88em; }\n    .muted{ color:#64748b; }\n    .wp-nodejs-pb code{ background:#f1f5f9; padding:0 4px; border-radius:4px; border:1px solid #e2e8f0; }\n    .wp-nodejs-pb pre{\n      background:#f5f5f5; color:#111827; border:1px solid #e5e7eb;\n      padding:12px; border-radius:8px; overflow:auto; font-size:.92rem; line-height:1.55;\n    }\n    .q{font-weight:700;}\n    .qa p{ margin:8px 0; }\n    .qa b{ color:#0f172a; }\n  <\/style>\n<div class=\"wp-nodejs-pb\">\n<div class=\"heading\">\n<h2>Databricks Pocket Book \u2014 Uplatz<\/h2>\n<p>50 deep-dive flashcards \u2022 Single column \u2022 Fewer scrolls \u2022 20+ Interview Q&amp;A \u2022 Readable code examples<\/p>\n<\/p><\/div>\n<div class=\"grid\">\n      <!-- ===================== SECTION 1 ===================== --><\/p>\n<div class=\"section-title\">Section 1 \u2014 Fundamentals<\/div>\n<div class=\"card bg-blue\">\n<h3>1) What is Databricks?<\/h3>\n<p>Databricks is a unified analytics platform (Lakehouse) built around Apache Spark that combines data engineering, data science, ML, and BI on one managed cloud service. It offers collaborative notebooks, scalable compute, Delta Lake ACID storage, SQL endpoints, MLflow, and governance via Unity Catalog.<\/p>\n<pre><code class=\"mono\"># Install & configure Databricks CLI (v0.x style)\r\npip install databricks-cli\r\ndatabricks configure --token<\/code><\/pre>\n<\/p><\/div>\n<div class=\"card bg-green\">\n<h3>2) Why Databricks? Strengths &amp; Tradeoffs<\/h3>\n<p>Strengths: scalable Spark, Delta Lake reliability, Lakehouse unification, strong ML lifecycle, collaborative UX, and multi-cloud. Tradeoffs: cost management, Spark expertise needed, governance setup complexity. 
Mitigate with governance (Unity Catalog), cluster policies, and workload-aware optimization.<\/p>\n<pre><code class=\"mono\"># Minimal PySpark in a notebook\r\nfrom pyspark.sql import SparkSession\r\nspark = SparkSession.builder.appName(\"demo\").getOrCreate()<\/code><\/pre>\n<\/p><\/div>\n<div class=\"card bg-amber\">\n<h3>3) Workspace &amp; Notebooks<\/h3>\n<p>Workspaces house notebooks (Python, SQL, Scala, R), repos, dashboards, and jobs. Notebooks support Delta visualization, widgets, dbutils, Git integration, and collaborative commenting\/versioning.<\/p>\n<pre><code class=\"mono\"># Read a sample dataset (in a Python notebook)\r\ndf = spark.read.csv(\"\/databricks-datasets\/flights\/departuredelays.csv\", header=True, inferSchema=True)\r\ndisplay(df.limit(5))<\/code><\/pre>\n<\/p><\/div>\n<div class=\"card bg-violet\">\n<h3>4) Clusters &amp; Compute<\/h3>\n<p>Compute options: Interactive (dev), Jobs (automation), SQL warehouses (BI), and Serverless (where available). Configure node types, autoscaling, spot\/preemptible, photon acceleration, and termination.<\/p>\n<pre><code class=\"mono\"># Create a cluster via API (classic)\r\ndatabricks clusters create --json-file cluster.json<\/code><\/pre>\n<\/p><\/div>\n<div class=\"card bg-rose\">\n<h3>5) Lakehouse Architecture<\/h3>\n<p>Lakehouse sits on open data formats with Delta Lake on object storage, enabling ACID transactions, schema enforcement, and time travel. It unifies streaming + batch with SQL and ML on the same tables.<\/p>\n<pre><code class=\"mono\"># Read a Delta table\r\ndf = spark.read.format(\"delta\").load(\"\/mnt\/delta\/events\")<\/code><\/pre>\n<\/p><\/div>\n<div class=\"card bg-cyan\">\n<h3>6) Databricks Runtime<\/h3>\n<p>Pre-built runtimes (DBR) ship optimized Spark, Delta Lake, ML libs, GPU support, and Photon engine. 
Use ML runtimes for MLflow\/scikit-learn; pick photon-enabled runtimes for SQL speedups.<\/p>\n<pre><code class=\"mono\"># Check versions (Python notebook)\r\nspark.version, spark.conf.get(\"spark.databricks.clusterUsageTags.sparkVersion\")<\/code><\/pre>\n<\/p><\/div>\n<div class=\"card bg-lime\">\n<h3>7) Data Access &amp; Mounts<\/h3>\n<p>Use DBFS for paths, or mount cloud storage (S3\/ADLS\/GCS) via <code>dbutils.fs.mount<\/code>. Prefer direct paths with secure credentials &amp; IAM passthrough when possible.<\/p>\n<pre><code class=\"mono\">dbutils.fs.mount(\r\n  source=\"s3a:\/\/my-bucket\/data\",\r\n  mount_point=\"\/mnt\/data\",\r\n  extra_configs={\"fs.s3a.aws.credentials.provider\":\"com.amazonaws.auth.InstanceProfileCredentialsProvider\"}\r\n)<\/code><\/pre>\n<\/p><\/div>\n<div class=\"card bg-orange\">\n<h3>8) DBFS Essentials<\/h3>\n<p>DBFS is a virtual filesystem exposing object storage. Use it for libraries, checkpoints, temporary outputs. Prefer external locations for production tables (managed by Unity Catalog).<\/p>\n<pre><code class=\"mono\">dbutils.fs.mkdirs(\"\/mnt\/checkpoints\")\r\ndbutils.fs.ls(\"\/databricks-datasets\")<\/code><\/pre>\n<\/p><\/div>\n<div class=\"card bg-indigo\">\n<h3>9) Jobs &amp; Workflows<\/h3>\n<p>Jobs chain notebooks, SQL, and Python tasks with retries, parameters, and dependencies. Use task libraries, cluster policies, and alerts. 
Workflows adds orchestration UI with conditionals and task values.<\/p>\n<pre><code class=\"mono\"># Create job (JSON spec)\r\ndatabricks jobs create --json-file job.json<\/code><\/pre>\n<\/p><\/div>\n<div class=\"card bg-emerald\">\n<h3>10) Q&amp;A \u2014 \u201cWhen should I choose Databricks?\u201d<\/h3>\n<p><span class=\"q\">Answer:<\/span> Choose Databricks when you need reliable large-scale data engineering &amp; ML on open formats, with unified governance (Unity Catalog) and strong collaboration, rather than separate lake + warehouse + ML stacks.<\/p>\n<\/p><\/div>\n<p>      <!-- ===================== SECTION 2 ===================== --><\/p>\n<div class=\"section-title\">Section 2 \u2014 Core APIs &amp; Modules<\/div>\n<div class=\"card bg-blue\">\n<h3>11) Delta Lake: ACID Tables<\/h3>\n<p>Delta Lake provides ACID transactions, schema enforcement\/evolution, time travel, and efficient upserts (MERGE). It stores table versions via transaction logs (JSON) alongside parquet data.<\/p>\n<pre><code class=\"mono\"># Create Delta table (SQL)\r\nCREATE TABLE sales_delta (id BIGINT, amount DOUBLE) USING delta LOCATION '\/mnt\/delta\/sales';<\/code><\/pre>\n<\/p><\/div>\n<div class=\"card bg-green\">\n<h3>12) Delta: Writes &amp; Updates<\/h3>\n<p>Use <code>MERGE INTO<\/code> for upserts; <code>DELETE<\/code>\/<code>UPDATE<\/code> for mutations. Optimize files with <code>OPTIMIZE<\/code> and cleanup with <code>VACUUM<\/code>.<\/p>\n<pre><code class=\"mono\">MERGE INTO tgt USING src ON tgt.id = src.id\r\nWHEN MATCHED THEN UPDATE SET amount = src.amount\r\nWHEN NOT MATCHED THEN INSERT (id, amount) VALUES (src.id, src.amount);<\/code><\/pre>\n<\/p><\/div>\n<div class=\"card bg-amber\">\n<h3>13) Delta: Time Travel<\/h3>\n<p>Query older versions for audits or rollbacks using <code>VERSION AS OF<\/code> or <code>TIMESTAMP AS OF<\/code>. 
Use <code>RESTORE<\/code> to revert a table to a prior version.<\/p>\n<pre><code class=\"mono\">SELECT * FROM sales_delta VERSION AS OF 5;\r\nRESTORE TABLE sales_delta TO VERSION AS OF 5;<\/code><\/pre>\n<\/p><\/div>\n<div class=\"card bg-violet\">\n<h3>14) Auto Loader (cloudFiles)<\/h3>\n<p>Auto Loader incrementally ingests new files from object storage. It tracks discovered files with scalable file notification modes and supports schema inference &amp; evolution; set <code>cloudFiles.schemaLocation<\/code> when inferring the schema.<\/p>\n<pre><code class=\"mono\">df = (spark.readStream.format(\"cloudFiles\")\r\n  .option(\"cloudFiles.format\",\"json\")\r\n  .option(\"cloudFiles.schemaLocation\",\"\/mnt\/ckpt\/topic\")\r\n  .load(\"s3:\/\/raw\/topic\/\"))\r\ndf.writeStream.format(\"delta\").option(\"checkpointLocation\",\"\/mnt\/ckpt\/topic\").start(\"\/mnt\/delta\/topic\")<\/code><\/pre>\n<\/p><\/div>\n<div class=\"card bg-rose\">\n<h3>15) Structured Streaming<\/h3>\n<p>Process streams with Spark SQL\/DataFrames. Sources: Kafka, files, Delta. Sinks: Delta, console, memory, Kafka. Exactly-once semantics with Delta sink and checkpointing.<\/p>\n<pre><code class=\"mono\">query = (df.writeStream\r\n  .format(\"delta\")\r\n  .option(\"checkpointLocation\",\"\/mnt\/ckpt\/stream\")\r\n  .outputMode(\"append\").start(\"\/mnt\/delta\/stream_out\"))<\/code><\/pre>\n<\/p><\/div>\n<div class=\"card bg-cyan\">\n<h3>16) Delta Live Tables (DLT)<\/h3>\n<p>DLT defines reliable pipelines with declarative semantics, auto-testing, expectations, and lineage. Supports streaming + batch with automatic recovery.<\/p>\n<pre><code class=\"mono\">-- DLT SQL example\r\nCREATE STREAMING LIVE TABLE bronze AS\r\nSELECT * FROM cloud_files(\"\/raw\/data\",\"json\");<\/code><\/pre>\n<\/p><\/div>\n<div class=\"card bg-lime\">\n<h3>17) Unity Catalog (UC)<\/h3>\n<p>UC centralizes governance: catalogs &rarr; schemas &rarr; tables\/views\/functions\/storage. 
Offers fine-grained permissions, data lineage, audit logs, row\/column-level security, and external locations.<\/p>\n<pre><code class=\"mono\">-- Grant on a UC table\r\nGRANT SELECT ON TABLE main.sales.orders TO `analyst-group`;<\/code><\/pre>\n<\/p><\/div>\n<div class=\"card bg-orange\">\n<h3>18) Databricks SQL (DBSQL)<\/h3>\n<p>DBSQL provides SQL Warehouses for BI\/analytics with Photon acceleration, dashboards, alerts, and query scheduling. Connect via JDBC\/ODBC to BI tools (Power BI, Tableau, Looker).<\/p>\n<pre><code class=\"mono\">-- Example SQL\r\nSELECT customer_id, SUM(amount) AS total\r\nFROM main.sales.orders\r\nGROUP BY customer_id\r\nORDER BY total DESC;<\/code><\/pre>\n<\/p><\/div>\n<div class=\"card bg-indigo\">\n<h3>19) MLflow Integration<\/h3>\n<p>MLflow tracks experiments (params, metrics, artifacts), packages models, and deploys to serving. Databricks tightly integrates MLflow with runs, model registry, and permissions.<\/p>\n<pre><code class=\"mono\">import mlflow\r\nwith mlflow.start_run():\r\n    mlflow.log_metric(\"rmse\", 2.1)\r\n    mlflow.sklearn.log_model(model, \"model\")<\/code><\/pre>\n<\/p><\/div>\n<div class=\"card bg-emerald\">\n<h3>20) Q&amp;A \u2014 \u201cDelta vs Parquet?\u201d<\/h3>\n<p><span class=\"q\">Answer:<\/span> Delta adds ACID transactions, schema enforcement\/evolution, time travel, efficient merges, and Z-Ordering on top of parquet files. It\u2019s parquet-plus-transaction-log, not a new proprietary format.<\/p>\n<\/p><\/div>\n<p>      <!-- ===================== SECTION 3 ===================== --><\/p>\n<div class=\"section-title\">Section 3 \u2014 Async, Patterns &amp; Concurrency<\/div>\n<div class=\"card bg-blue\">\n<h3>21) Job Orchestration Patterns<\/h3>\n<p>Use Jobs\/Workflows to chain tasks (notebooks, JARs, Python, SQL). Pass task values, add retries\/backoff, and schedule with CRON. 
Separate dev vs prod clusters via policies.<\/p>\n<pre><code class=\"mono\"># Submit a run (CLI)\r\ndatabricks runs submit --json-file run.json<\/code><\/pre>\n<\/p><\/div>\n<div class=\"card bg-green\">\n<h3>22) Streaming Triggers &amp; Checkpointing<\/h3>\n<p>Control micro-batch cadence with processing-time triggers. Checkpoints track offsets &amp; state for exactly-once sinks. Use <code>availableNow<\/code> for catch-up batch on directories.<\/p>\n<pre><code class=\"mono\">df.writeStream.trigger(processingTime=\"30 seconds\").option(\"checkpointLocation\",\"\/mnt\/ckpt\").start(...)<\/code><\/pre>\n<\/p><\/div>\n<div class=\"card bg-amber\">\n<h3>23) Concurrency &amp; Isolation<\/h3>\n<p>Use High Concurrency clusters for multi-tenant SQL\/BI. Ensure query isolation, pool configs, and SQL warehouse scaling. For notebooks, prefer per-user or small shared dev clusters.<\/p>\n<pre><code class=\"mono\">-- SQL warehouse scaling set in UI\/API; use small min\/max for bursty BI<\/code><\/pre>\n<\/p><\/div>\n<div class=\"card bg-violet\">\n<h3>24) Caching &amp; Photon<\/h3>\n<p>Cache frequently read tables; Photon accelerates vectorized execution for SQL. Use Delta <code>OPTIMIZE<\/code> + Z-ORDER to improve pruning and read performance.<\/p>\n<pre><code class=\"mono\">CACHE SELECT * FROM main.sales.orders;\r\nOPTIMIZE main.sales.orders ZORDER BY (customer_id);<\/code><\/pre>\n<\/p><\/div>\n<div class=\"card bg-rose\">\n<h3>25) Cost Control<\/h3>\n<p>Right-size clusters, enable autoscaling &amp; auto-terminate, use spot\/preemptible nodes where safe, optimize file sizes, and cache hot data. Monitor with tags, budgets, and system tables.<\/p>\n<pre><code class=\"mono\">-- Use cluster policies to cap nodes, runtime versions, and autotermination<\/code><\/pre>\n<\/p><\/div>\n<div class=\"card bg-cyan\">\n<h3>26) Performance Tuning<\/h3>\n<p>Use <code>EXPLAIN<\/code>, AQE, broadcast hints, bucketing on stable keys, and correct shuffle partitions. 
Keep files 128\u2013512 MB, avoid tiny files, and reduce skew via salting.<\/p>\n<pre><code class=\"mono\">SET spark.sql.shuffle.partitions=auto;\r\nEXPLAIN SELECT ...;<\/code><\/pre>\n<\/p><\/div>\n<div class=\"card bg-lime\">\n<h3>27) Reliability &amp; Idempotency<\/h3>\n<p>Design idempotent pipelines (upserts, overwrite partitions), use checkpoints and exactly-once sinks. Store run parameters &amp; versions with MLflow or table audit columns.<\/p>\n<pre><code class=\"mono\">-- Partition overwrite\r\nINSERT OVERWRITE TABLE t PARTITION (dt='2025-08-01') SELECT ...;<\/code><\/pre>\n<\/p><\/div>\n<div class=\"card bg-orange\">\n<h3>28) Z-Order &amp; File Compaction<\/h3>\n<p>Z-Order co-locates related data to reduce reads for common filters. Compact small files with <code>OPTIMIZE<\/code>; clean old data with <code>VACUUM<\/code> respecting retention policies.<\/p>\n<pre><code class=\"mono\">OPTIMIZE main.sales.events ZORDER BY (user_id, event_date);\r\nVACUUM main.sales.events RETAIN 168 HOURS;<\/code><\/pre>\n<\/p><\/div>\n<div class=\"card bg-indigo\">\n<h3>29) Multi-Hop Medallion<\/h3>\n<p>Bronze (raw) \u2192 Silver (cleaned) \u2192 Gold (aggregated\/serving). Enforce contracts between layers, track lineage, and promote via DLT\/Workflows. Keep transformations idempotent.<\/p>\n<pre><code class=\"mono\">-- Example: silver from bronze\r\nCREATE OR REPLACE TABLE main.sales.silver AS\r\nSELECT * FROM main.sales.bronze WHERE _is_valid;<\/code><\/pre>\n<\/p><\/div>\n<div class=\"card bg-emerald\">\n<h3>30) Q&amp;A \u2014 \u201cBatch vs Streaming in Delta?\u201d<\/h3>\n<p><span class=\"q\">Answer:<\/span> With Delta, both use the same table. Streaming writes append with checkpoints; batch jobs can read consistent snapshots. 
Unifying tables simplifies serving and governance.<\/p>\n<\/p><\/div>\n<p>      <!-- ===================== SECTION 4 ===================== --><\/p>\n<div class=\"section-title\">Section 4 \u2014 Frameworks, Data &amp; APIs<\/div>\n<div class=\"card bg-blue\">\n<h3>31) Databricks SQL + BI<\/h3>\n<p>Expose tables\/views to BI via SQL Warehouses. Use service principals and catalog grants. Materialize views for dashboards; schedule refresh queries and alerts.<\/p>\n<pre><code class=\"mono\">CREATE VIEW main.sales.top_customers AS\r\nSELECT customer_id, SUM(amount) AS total\r\nFROM main.sales.orders GROUP BY customer_id;<\/code><\/pre>\n<\/p><\/div>\n<div class=\"card bg-green\">\n<h3>32) Power BI, Tableau, Looker<\/h3>\n<p>Connect through ODBC\/JDBC to SQL Warehouses. For Power BI, use the Databricks connector; for Tableau\/Looker, set OAuth\/SCIM where available and align roles with UC.<\/p>\n<pre><code class=\"mono\">-- Keep semantic layers in BI, governance in UC; cache extracts strategically<\/code><\/pre>\n<\/p><\/div>\n<div class=\"card bg-amber\">\n<h3>33) Feature Store<\/h3>\n<p>Centralize and reuse ML features with lineage and point-in-time correctness. Serve features online with low-latency stores; ensure training\/serving parity.<\/p>\n<pre><code class=\"mono\">from databricks.feature_store import FeatureStoreClient\r\nfs = FeatureStoreClient()\r\nfs.create_table(name=\"main.fs.customer_features\", primary_keys=[\"customer_id\"], schema=...) <\/code><\/pre>\n<\/p><\/div>\n<div class=\"card bg-violet\">\n<h3>34) AutoML<\/h3>\n<p>AutoML generates baseline models, code, and MLflow tracking for tabular\/time-series problems. 
Use as a starting point; productionize best candidates with guardrails.<\/p>\n<pre><code class=\"mono\"># In UI: AutoML &gt; New experiment &gt; Select dataset\/target<\/code><\/pre>\n<\/p><\/div>\n<div class=\"card bg-rose\">\n<h3>35) Model Registry &amp; Serving<\/h3>\n<p>Promote MLflow models through Staging \u2192 Production with approvals. Serve models via Databricks Model Serving or external endpoints. Track versions &amp; rollbacks.<\/p>\n<pre><code class=\"mono\">import mlflow\r\nclient = mlflow.tracking.MlflowClient()\r\nclient.transition_model_version_stage(\"churn-model\",\"3\",\"Production\")<\/code><\/pre>\n<\/p><\/div>\n<div class=\"card bg-cyan\">\n<h3>36) Vector Search \/ Retrieval<\/h3>\n<p>Embed data and store vectors in a managed index or Delta table. Use for semantic search\/RAG, with pipelines to refresh embeddings and maintain consistency.<\/p>\n<pre><code class=\"mono\">-- Pseudocode: store embeddings in Delta; serve via SQL\/UDTF or API<\/code><\/pre>\n<\/p><\/div>\n<div class=\"card bg-lime\">\n<h3>37) UDFs &amp; UDAFs (Py\/Scala\/SQL)<\/h3>\n<p>Define UDFs for custom logic; prefer SQL functions or built-in expressions for performance. Use Pandas UDFs (vectorized) for better throughput on Python code.<\/p>\n<pre><code class=\"mono\">import pandas as pd\r\nimport pyspark.sql.functions as F\r\n\r\n@F.pandas_udf(\"double\")\r\ndef zscore(col: pd.Series) -&gt; pd.Series:\r\n    return (col - col.mean()) \/ col.std()<\/code><\/pre>\n<\/p><\/div>\n<div class=\"card bg-orange\">\n<h3>38) Libraries &amp; Repos<\/h3>\n<p>Use Repos to sync with Git, manage notebooks + packages. Install wheel\/egg libraries on clusters or use init scripts. Keep dependency graphs slim and pinned.<\/p>\n<pre><code class=\"mono\"># Install a wheel at cluster start (UI) or %pip in notebook:\r\n%pip install mypkg==1.2.3<\/code><\/pre>\n<\/p><\/div>\n<div class=\"card bg-indigo\">\n<h3>39) Data Quality &amp; Expectations<\/h3>\n<p>Use DLT expectations, Deequ, Great Expectations, or SQL constraints. 
Fail or quarantine bad records; log metrics to tables and dashboards.<\/p>\n<pre><code class=\"mono\">-- DLT expectation: drop rows that violate the constraint\r\nCREATE STREAMING LIVE TABLE silver (\r\n  CONSTRAINT valid_amount EXPECT (amount &gt;= 0) ON VIOLATION DROP ROW\r\n)\r\nTBLPROPERTIES (\"quality\"=\"silver\")\r\nAS SELECT * FROM LIVE.bronze;<\/code><\/pre>\n<\/p><\/div>\n<div class=\"card bg-emerald\">\n<h3>40) Q&amp;A \u2014 \u201cHow to avoid tiny files?\u201d<\/h3>\n<p><span class=\"q\">Answer:<\/span> Use Auto Loader with <code>cloudFiles.maxFilesPerTrigger<\/code>, set proper batch sizes, write using <code>foreachBatch<\/code> with coalesce\/repartition, and run periodic <code>OPTIMIZE<\/code> compaction.<\/p>\n<\/p><\/div>\n<p>      <!-- ===================== SECTION 5 ===================== --><\/p>\n<div class=\"section-title\">Section 5 \u2014 Security, Testing, Deployment, Observability &amp; Interview Q&amp;A<\/div>\n<div class=\"card bg-blue\">\n<h3>41) Unity Catalog Security<\/h3>\n<p>Define catalogs\/schemas\/tables with grants to groups, service principals, and users. Use data masking, row\/column-level filters, and external locations with storage credentials.<\/p>\n<pre><code class=\"mono\">GRANT USE CATALOG ON CATALOG main TO `analytics-team`;\r\nGRANT SELECT ON TABLE main.sales.orders TO `bi-readers`;<\/code><\/pre>\n<\/p><\/div>\n<div class=\"card bg-green\">\n<h3>42) Secrets &amp; Credentials<\/h3>\n<p>Manage secrets with Databricks Secrets (backed by cloud KMS). Access via <code>dbutils.secrets.get<\/code>. Use SCIM for identity, and service principals for automation.<\/p>\n<pre><code class=\"mono\">token = dbutils.secrets.get(scope=\"prod\", key=\"api-key\")<\/code><\/pre>\n<\/p><\/div>\n<div class=\"card bg-amber\">\n<h3>43) Table Lineage &amp; Audit<\/h3>\n<p>Unity Catalog captures lineage across notebooks, jobs, and SQL. 
Combine with audit logs\/system tables for governance, incident response, and cost analysis.<\/p>\n<pre><code class=\"mono\">-- View lineage in UC UI; query system tables for usage &amp; query_history<\/code><\/pre>\n<\/p><\/div>\n<div class=\"card bg-violet\">\n<h3>44) Testing &amp; CI\/CD<\/h3>\n<p>Unit test PySpark (pytest), use Databricks Repos for GitOps, and promote via Branch &rarr; Job in environments. Validate SQL with lightweight checks and data contracts per layer.<\/p>\n<pre><code class=\"mono\"># Run tests in job with small dev cluster; block PR on pass<\/code><\/pre>\n<\/p><\/div>\n<div class=\"card bg-rose\">\n<h3>45) Deployment Strategies<\/h3>\n<p>Parameterize jobs per env, pin runtimes, and use cluster policies. Package Python libs in wheels; manage notebooks via Repos. Export\/import via Terraform\/Workspace APIs where applicable.<\/p>\n<pre><code class=\"mono\"># Example: %pip install .  (from repo root) to install your package<\/code><\/pre>\n<\/p><\/div>\n<div class=\"card bg-cyan\">\n<h3>46) Observability &amp; Monitoring<\/h3>\n<p>Track cluster metrics, Spark UI, query profiles, MLflow runs, and system tables. Emit custom logs\/metrics to cloud monitoring. Set SLOs for latency, freshness, and cost.<\/p>\n<pre><code class=\"mono\">-- Query system.information_schema for usage; schedule cost dashboards<\/code><\/pre>\n<\/p><\/div>\n<div class=\"card bg-lime\">\n<h3>47) Reliability Runbooks<\/h3>\n<p>Create playbooks for streaming backfills, checkpoint resets, schema evolution, and hotspot mitigation. 
Automate with Workflows and safe toggles in parameters.<\/p>\n<pre><code class=\"mono\">-- Keep \"safe backfill\" notebooks with parameterized date ranges<\/code><\/pre>\n<\/p><\/div>\n<div class=\"card bg-orange tight\">\n<h3>48) Production Checklist<\/h3>\n<ul>\n<li>Unity Catalog governance &amp; least privilege<\/li>\n<li>Autoscaling &amp; auto-termination enabled<\/li>\n<li>Delta OPTIMIZE\/Z-ORDER; VACUUM retention set<\/li>\n<li>Streaming checkpoints &amp; idempotent writes<\/li>\n<li>Cost tags, budgets, and monitoring dashboards<\/li>\n<li>Runbooks &amp; on-call escalation paths<\/li>\n<\/ul><\/div>\n<div class=\"card bg-indigo\">\n<h3>49) Common Pitfalls<\/h3>\n<p>Tiny files explosion, missing Z-ORDER, under\/over-partitioning, ungoverned mounts, no autotermination, lack of schema contracts, and ad-hoc notebooks bypassing CI\/CD.<\/p>\n<\/p><\/div>\n<div class=\"card bg-emerald qa\">\n<h3>50) Interview Q&amp;A \u2014 20 Practical Questions (Expanded)<\/h3>\n<p><b>1) Why Lakehouse?<\/b> Unifies lake flexibility with warehouse performance via Delta, cutting duplicate stacks.<\/p>\n<p><b>2) Delta advantages?<\/b> ACID, schema handling, time travel, MERGE, OPTIMIZE\/Z-ORDER.<\/p>\n<p><b>3) Unity Catalog value?<\/b> Central governance, lineage, fine-grained permissions, external locations.<\/p>\n<p><b>4) Avoid tiny files?<\/b> Batch writes, coalesce\/repartition, Auto Loader tuning, periodic OPTIMIZE.<\/p>\n<p><b>5) Batch vs streaming?<\/b> Same Delta tables; streaming uses checkpoints &amp; exactly-once sinks.<\/p>\n<p><b>6) Photon when?<\/b> SQL-heavy analytics and BI workloads; significant speedups.<\/p>\n<p><b>7) Optimize joins?<\/b> Broadcast small tables, partition\/bucket large ones, Z-ORDER on filters.<\/p>\n<p><b>8) Cost controls?<\/b> Autoscaling, autoterminate, spot nodes, policies, monitoring with tags.<\/p>\n<p><b>9) DLT benefits?<\/b> Declarative pipelines, expectations, lineage, auto-recovery.<\/p>\n<p><b>10) Schema evolution?<\/b> Use Delta schema 
evolution flags; validate in silver before gold.<\/p>\n<p><b>11) Data quality?<\/b> DLT expectations, Great Expectations, constraints &amp; quarantine.<\/p>\n<p><b>12) ML lifecycle?<\/b> MLflow for tracking, registry, serving, and A\/B rollouts.<\/p>\n<p><b>13) BI connectivity?<\/b> Use SQL Warehouses + JDBC\/ODBC connectors; manage grants in UC.<\/p>\n<p><b>14) Medallion design?<\/b> Raw bronze, cleaned silver, aggregated gold; contracts between layers.<\/p>\n<p><b>15) Streaming recovery?<\/b> Durable checkpoints, replayable sources, <code>availableNow<\/code> for catch-up.<\/p>\n<p><b>16) Multi-env promotion?<\/b> Repos + branches, jobs per env, pinned runtimes, Terraform\/API.<\/p>\n<p><b>17) Row\/column security?<\/b> UC row filters &amp; column masks; test with least-privilege roles.<\/p>\n<p><b>18) File layout?<\/b> Partition by selective columns, target 128\u2013512MB file sizes, avoid skew.<\/p>\n<p><b>19) Debugging performance?<\/b> Use Spark UI, query profile, AQE, and <code>EXPLAIN<\/code>.<\/p>\n<p><b>20) When not Databricks?<\/b> Tiny datasets, single-node ELT, or when a simpler DB\/warehouse suffices.<\/p>\n<\/p><\/div>\n<\/p><\/div>\n<\/p><\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Databricks Pocket Book \u2014 Uplatz 50 deep-dive flashcards \u2022 Single column \u2022 Fewer scrolls \u2022 20+ Interview Q&amp;A \u2022 Readable code examples Section 1 \u2014 Fundamentals 1) What is Databricks? 
<span class=\"readmore\"><a href=\"https:\/\/uplatz.com\/blog\/databricks-pocket-book\/\">Read More &#8230;<\/a><\/span><\/p>\n","protected":false},"author":2,"featured_media":4847,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2415,2462],"tags":[],"class_list":["post-4749","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-databricks","category-pocket-book"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.3 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Databricks Pocket Book | Uplatz Blog<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/uplatz.com\/blog\/databricks-pocket-book\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Databricks Pocket Book | Uplatz Blog\" \/>\n<meta property=\"og:description\" content=\"Databricks Pocket Book \u2014 Uplatz 50 deep-dive flashcards \u2022 Single column \u2022 Fewer scrolls \u2022 20+ Interview Q&amp;A \u2022 Readable code examples Section 1 \u2014 Fundamentals 1) What is Databricks? 
Read More ...\" \/>\n<meta property=\"og:url\" content=\"https:\/\/uplatz.com\/blog\/databricks-pocket-book\/\" \/>\n<meta property=\"og:site_name\" content=\"Uplatz Blog\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/Uplatz-1077816825610769\/\" \/>\n<meta property=\"article:published_time\" content=\"2025-08-23T15:41:41+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-08-27T02:54:10+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/uplatz.com\/blog\/wp-content\/uploads\/2025\/08\/15.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1280\" \/>\n\t<meta property=\"og:image:height\" content=\"720\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"uplatzblog\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@uplatz_global\" \/>\n<meta name=\"twitter:site\" content=\"@uplatz_global\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"uplatzblog\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"7 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/databricks-pocket-book\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/databricks-pocket-book\\\/\"},\"author\":{\"name\":\"uplatzblog\",\"@id\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/#\\\/schema\\\/person\\\/8ecae69a21d0757bdb2f776e67d2645e\"},\"headline\":\"Databricks Pocket Book\",\"datePublished\":\"2025-08-23T15:41:41+00:00\",\"dateModified\":\"2025-08-27T02:54:10+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/databricks-pocket-book\\\/\"},\"wordCount\":1612,\"publisher\":{\"@id\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/#organization\"},\"image\":{\"@id\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/databricks-pocket-book\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/wp-content\\\/uploads\\\/2025\\\/08\\\/15.png\",\"articleSection\":[\"Databricks\",\"Pocket Book\"],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/databricks-pocket-book\\\/\",\"url\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/databricks-pocket-book\\\/\",\"name\":\"Databricks Pocket Book | Uplatz 
Blog\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/databricks-pocket-book\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/databricks-pocket-book\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/wp-content\\\/uploads\\\/2025\\\/08\\\/15.png\",\"datePublished\":\"2025-08-23T15:41:41+00:00\",\"dateModified\":\"2025-08-27T02:54:10+00:00\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/databricks-pocket-book\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/uplatz.com\\\/blog\\\/databricks-pocket-book\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/databricks-pocket-book\\\/#primaryimage\",\"url\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/wp-content\\\/uploads\\\/2025\\\/08\\\/15.png\",\"contentUrl\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/wp-content\\\/uploads\\\/2025\\\/08\\\/15.png\",\"width\":1280,\"height\":720,\"caption\":\"Databricks Pocket Book\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/databricks-pocket-book\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Databricks Pocket Book\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/#website\",\"url\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/\",\"name\":\"Uplatz Blog\",\"description\":\"Uplatz is a global IT Training &amp; Consulting 
company\",\"publisher\":{\"@id\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/#organization\",\"name\":\"uplatz.com\",\"url\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/#\\\/schema\\\/logo\\\/image\\\/\",\"url\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/wp-content\\\/uploads\\\/2016\\\/11\\\/Uplatz-Logo-Copy-2.png\",\"contentUrl\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/wp-content\\\/uploads\\\/2016\\\/11\\\/Uplatz-Logo-Copy-2.png\",\"width\":1280,\"height\":800,\"caption\":\"uplatz.com\"},\"image\":{\"@id\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/#\\\/schema\\\/logo\\\/image\\\/\"},\"sameAs\":[\"https:\\\/\\\/www.facebook.com\\\/Uplatz-1077816825610769\\\/\",\"https:\\\/\\\/x.com\\\/uplatz_global\",\"https:\\\/\\\/www.instagram.com\\\/\",\"https:\\\/\\\/www.linkedin.com\\\/company\\\/7956715?trk=tyah&amp;trkInfo=clickedVertical:company,clickedEntityId:7956715,idx:1-1-1,tarId:1464353969447,tas:uplatz\"]},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/#\\\/schema\\\/person\\\/8ecae69a21d0757bdb2f776e67d2645e\",\"name\":\"uplatzblog\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/7f814c72279199f59ded4418a8653ad15f5f8904ac75e025a4e2abe24d58fa5d?s=96&d=mm&r=g\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/7f814c72279199f59ded4418a8653ad15f5f8904ac75e025a4e2abe24d58fa5d?s=96&d=mm&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/7f814c72279199f59ded4
418a8653ad15f5f8904ac75e025a4e2abe24d58fa5d?s=96&d=mm&r=g\",\"caption\":\"uplatzblog\"}}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->"}