{"id":4490,"date":"2025-08-09T16:57:54","date_gmt":"2025-08-09T16:57:54","guid":{"rendered":"https:\/\/uplatz.com\/blog\/?p=4490"},"modified":"2025-08-28T13:10:23","modified_gmt":"2025-08-28T13:10:23","slug":"apache-hudi-pocket-book","status":"publish","type":"post","link":"https:\/\/uplatz.com\/blog\/apache-hudi-pocket-book\/","title":{"rendered":"Apache Hudi Pocket Book"},"content":{"rendered":"<p><!-- Apache Hudi Pocket Book (Wide Layout, Readable Code, Scoped Styles) --><\/p>\n<div style=\"margin: 16px 0;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-large wp-image-4950\" src=\"https:\/\/uplatz.com\/blog\/wp-content\/uploads\/2025\/08\/Apache-Hudi-1024x576.jpg\" alt=\"\" width=\"840\" height=\"473\" srcset=\"https:\/\/uplatz.com\/blog\/wp-content\/uploads\/2025\/08\/Apache-Hudi-1024x576.jpg 1024w, https:\/\/uplatz.com\/blog\/wp-content\/uploads\/2025\/08\/Apache-Hudi-300x169.jpg 300w, https:\/\/uplatz.com\/blog\/wp-content\/uploads\/2025\/08\/Apache-Hudi-768x432.jpg 768w, https:\/\/uplatz.com\/blog\/wp-content\/uploads\/2025\/08\/Apache-Hudi.jpg 1280w\" sizes=\"auto, (max-width: 840px) 100vw, 840px\" \/><\/p>\n<style>\n    .wp-hudi-pb { font-family: Arial, sans-serif; max-width: 1320px; margin:0 auto; }<br \/>\n    .wp-hudi-pb .heading{<br \/>\n      background: linear-gradient(135deg, #e0f2fe, #ccfbf1);<br \/>\n      color:#0f172a; padding:22px 24px; border-radius:14px;<br \/>\n      text-align:center; margin-bottom:18px; box-shadow:0 8px 20px rgba(0,0,0,.08);<br \/>\n      border:1px solid #cbd5e1;<br \/>\n    }<br \/>\n    .wp-hudi-pb .heading h2{ margin:0; font-size:2.1rem; letter-spacing:.2px; }<br \/>\n    .wp-hudi-pb .heading p{ margin:6px 0 0; font-size:1.02rem; opacity:.9; }<\/p>\n<p>    .wp-hudi-pb .grid{<br \/>\n      display:grid; gap:14px;<br \/>\n      grid-template-columns: repeat(auto-fill, minmax(400px, 1fr));<br \/>\n    }<br \/>\n    @media (min-width:1200px){<br \/>\n      .wp-hudi-pb .grid{ grid-template-columns: 
repeat(3, 1fr); }<br \/>\n    }<\/p>\n<p>    .wp-hudi-pb .section-title{<br \/>\n      grid-column:1\/-1; background:#f8fafc; border-left:8px solid #0ea5e9;<br \/>\n      padding:12px 16px; border-radius:10px; font-weight:700; color:#0f172a; font-size:1.08rem;<br \/>\n      box-shadow:0 2px 8px rgba(0,0,0,.05); border:1px solid #e2e8f0;<br \/>\n    }<br \/>\n    .wp-hudi-pb .card{<br \/>\n      background:#ffffff; border-left:6px solid #0ea5e9;<br \/>\n      padding:18px; border-radius:12px;<br \/>\n      box-shadow:0 6px 14px rgba(0,0,0,.06);<br \/>\n      transition:transform .12s ease, box-shadow .12s ease;<br \/>\n      border:1px solid #e5e7eb;<br \/>\n    }<br \/>\n    .wp-hudi-pb .card:hover{ transform: translateY(-3px); box-shadow:0 10px 22px rgba(0,0,0,.08); }<br \/>\n    .wp-hudi-pb .card h3{ margin:0 0 10px; font-size:1.12rem; color:#0f172a; }<br \/>\n    .wp-hudi-pb .card p{ margin:0; font-size:.96rem; color:#334155; line-height:1.62; }<\/p>\n<p>    \/* Color helpers *\/<br \/>\n    .bg-blue { border-left-color:#0ea5e9 !important; background:#f0f9ff !important; }<br \/>\n    .bg-green{ border-left-color:#10b981 !important; background:#f0fdf4 !important; }<br \/>\n    .bg-amber{ border-left-color:#f59e0b !important; background:#fffbeb !important; }<br \/>\n    .bg-violet{ border-left-color:#8b5cf6 !important; background:#f5f3ff !important; }<br \/>\n    .bg-rose{ border-left-color:#ef4444 !important; background:#fff1f2 !important; }<br \/>\n    .bg-cyan{ border-left-color:#06b6d4 !important; background:#ecfeff !important; }<br \/>\n    .bg-indigo{ border-left-color:#6366f1 !important; background:#eef2ff !important; }<br \/>\n    .bg-slate{ border-left-color:#334155 !important; background:#f8fafc !important; }<\/p>\n<p>    \/* Utilities *\/<br \/>\n    .tight ul{ margin:0; padding-left:18px; }<br \/>\n    .tight li{ margin:4px 0; }<br \/>\n    .mono{ font-family: ui-monospace, SFMono-Regular, Menlo, Monaco, Consolas, monospace; }<br \/>\n    .muted{ 
color:#64748b; }<br \/>\n    .wp-hudi-pb code{ background:#f1f5f9; padding:0 4px; border-radius:4px; border:1px solid #e2e8f0; }<br \/>\n    .wp-hudi-pb pre{<br \/>\n      background:#f5f5f5; color:#111827; border:1px solid #e5e7eb;<br \/>\n      padding:12px; border-radius:8px; overflow:auto; font-size:.92rem; line-height:1.55;<br \/>\n    }<br \/>\n    .q{font-weight:700;}<br \/>\n    .qa p{ margin:8px 0; }<br \/>\n    .qa b{ color:#0f172a; }<br \/>\n  <\/style>\n<div class=\"wp-hudi-pb\">\n<div class=\"heading\">\n<h2>Apache Hudi Pocket Book<\/h2>\n<p>Upserts on data lakes \u2022 COW vs MOR \u2022 Timeline \u2022 Indexing \u2022 Incremental pulls \u2022 Compaction &amp; clustering \u2022 Spark\/Flink\/Presto\/Trino<\/p>\n<\/div>\n<div class=\"grid\"><!-- ===================== SECTION 1 ===================== --><\/p>\n<div class=\"section-title\">Section 1 \u2014 Fundamentals<\/div>\n<div class=\"card bg-blue\">\n<h3>1) What is Apache Hudi?<\/h3>\n<p>Apache Hudi brings database-like primitives (upsert, delete, change capture) to data lakes on object stores (S3\/GCS\/ADLS\/HDFS). 
It maintains a <b>timeline<\/b> of commits and supports fast incremental processing and near-real-time ingestion.<\/p>\n<pre><code class=\"mono\"># Spark quick start (PySpark)\r\npip install pyspark\r\n# For Spark, Hudi is added as a bundle JAR at launch, e.g.\r\n# pyspark --packages org.apache.hudi:hudi-spark&lt;spark-version&gt;-bundle_&lt;scala-version&gt;:&lt;hudi-version&gt;<\/code><\/pre>\n<\/div>\n<div class=\"card bg-green\">\n<h3>2) Table Types<\/h3>\n<ul class=\"tight\">\n<li><b>Copy-on-Write (COW)<\/b>: updates rewrite the affected Parquet files as new versions; simpler, best for read-heavy analytics.<\/li>\n<li><b>Merge-on-Read (MOR)<\/b>: appends changes to log files and compacts them into Parquet later; lower write latency, good for near-real-time ingestion.<\/li>\n<\/ul>\n<\/div>\n<div class=\"card bg-amber\">\n<h3>3) Core Keys<\/h3>\n<ul class=\"tight\">\n<li><b>recordKey<\/b>: unique id per row (e.g., order_id).<\/li>\n<li><b>partitionPath<\/b>: directory layout (e.g., dt=YYYY-MM-DD, region=US).<\/li>\n<li><b>preCombineField<\/b>: picks the latest record when duplicates of a key land within a batch (e.g., ts).<\/li>\n<\/ul>\n<\/div>\n<div class=\"card bg-violet\">\n<h3>4) Operations<\/h3>\n<p><b>insert<\/b>, <b>upsert<\/b>, <b>bulk_insert<\/b> (initial load), <b>delete<\/b>, <b>insert_overwrite<\/b> (replace partitions). On MOR tables these writes appear on the timeline as <code>deltacommit<\/code> instants, which is an action type rather than a write operation.<\/p>\n<\/div>\n<div class=\"card bg-rose\">\n<h3>5) Query Types<\/h3>\n<ul class=\"tight\">\n<li><b>Snapshot<\/b>: latest view (COW\/MOR).<\/li>\n<li><b>Read Optimized<\/b>: Parquet only (MOR without merging logs).<\/li>\n<li><b>Incremental<\/b>: rows changed since a commit time.<\/li>\n<\/ul>\n<\/div>\n<p><!-- ===================== SECTION 2 ===================== --><\/p>\n<div class=\"section-title\">Section 2 \u2014 Write &amp; Query (Spark\/Flink)<\/div>\n<div class=\"card bg-blue\">\n<h3>6) Spark Write \u2014 Upsert (Python)<\/h3>\n<pre><code class=\"mono\">from pyspark.sql import SparkSession\r\nspark = (SparkSession.builder\r\n  .config(\"spark.serializer\",\"org.apache.spark.serializer.KryoSerializer\")\r\n  .getOrCreate())\r\n\r\ndf = spark.read.json(\"s3:\/\/raw\/orders\/*.json\")\r\n\r\n(base_path, table) = 
(\"s3:\/\/lake\/bronze\/orders_hudi\", \"orders_hudi\")\r\n\r\n(df.write.format(\"hudi\")\r\n  .option(\"hoodie.table.name\", table)\r\n  .option(\"hoodie.datasource.write.recordkey.field\", \"order_id\")\r\n  .option(\"hoodie.datasource.write.partitionpath.field\", \"dt\")\r\n  .option(\"hoodie.datasource.write.precombine.field\", \"updated_at\")\r\n  .option(\"hoodie.datasource.write.operation\", \"upsert\")\r\n  .mode(\"append\")\r\n  .save(base_path))<\/code><\/pre>\n<\/div>\n<div class=\"card bg-green\">\n<h3>7) Spark Read \u2014 Snapshot \/ Incremental<\/h3>\n<pre><code class=\"mono\"># Snapshot: latest merged view of the table\r\nsnap = (spark.read.format(\"hudi\").load(\"s3:\/\/lake\/bronze\/orders_hudi\"))\r\n# Incremental: rows changed after the given commit instant (yyyyMMddHHmmss)\r\ninc = (spark.read.format(\"hudi\")\r\n  .option(\"hoodie.datasource.query.type\",\"incremental\")\r\n  .option(\"hoodie.datasource.read.begin.instanttime\",\"20250701000000\")\r\n  .load(\"s3:\/\/lake\/bronze\/orders_hudi\"))<\/code><\/pre>\n<\/div>\n<div class=\"card bg-amber\">\n<h3>8) Flink Streaming Ingest (SQL)<\/h3>\n<pre><code class=\"mono\">-- Flink SQL example (simplified)\r\nCREATE TABLE orders_hudi (\r\n  order_id STRING,\r\n  amount DOUBLE,\r\n  updated_at TIMESTAMP(3),\r\n  dt STRING,\r\n  PRIMARY KEY (order_id) NOT ENFORCED\r\n) PARTITIONED BY (dt)\r\nWITH (\r\n  'connector'='hudi',\r\n  'path'='s3:\/\/lake\/bronze\/orders_hudi',  -- table base path (required)\r\n  'table.type'='MERGE_ON_READ'\r\n);<\/code><\/pre>\n<\/div>\n<div class=\"card bg-violet\">\n<h3>9) Query Engines<\/h3>\n<p>Use Presto\/Trino\/Athena\/Hive\/Spark SQL for interactive queries. For MOR, <b>snapshot<\/b> gives the merged view; <b>read_optimized<\/b> reads only the compacted Parquet base files, so it is faster but misses updates still sitting in log files.<\/p>\n<\/div>\n<div class=\"card bg-rose\">\n<h3>10) Indexing<\/h3>\n<p>Hudi maintains indexes (Bloom, HBase, Simple, Bucket) to map record keys to file groups during upserts. 
Choose based on scale and key distribution; bucket index improves large-scale upserts.<\/p>\n<\/div>\n<p><!-- ===================== SECTION 3 ===================== --><\/p>\n<div class=\"section-title\">Section 3 \u2014 Timeline, Maintenance &amp; Performance<\/div>\n<div class=\"card bg-blue\">\n<h3>11) Timeline &amp; Commits<\/h3>\n<p>Hudi records actions as instants (<code>commit<\/code>, <code>deltacommit<\/code>, <code>clean<\/code>, <code>compaction<\/code>, <code>restore<\/code>). Use the timeline to run incremental ETL and CDC.<\/p>\n<\/div>\n<div class=\"card bg-green\">\n<h3>12) Compaction (MOR)<\/h3>\n<p>Merges log files into Parquet base files to keep snapshot reads fast. Schedule and run compaction during low-traffic windows or continuously with Flink.<\/p>\n<pre><code class=\"mono\"># Spark writer options: compact inline after every 10 delta commits\r\n.option(\"hoodie.compact.inline\",\"true\")\r\n.option(\"hoodie.compact.inline.max.delta.commits\",\"10\")<\/code><\/pre>\n<\/div>\n<div class=\"card bg-amber\">\n<h3>13) Clustering (File Layout)<\/h3>\n<p>Rewrites file layout for better query performance (e.g., sort columns, larger files). Can be async and incremental.<\/p>\n<pre><code class=\"mono\"># Spark writer options: cluster inline, sorting data by dt and region\r\n.option(\"hoodie.clustering.inline\",\"true\")\r\n.option(\"hoodie.clustering.plan.strategy.sort.columns\",\"dt,region\")<\/code><\/pre>\n<\/div>\n<div class=\"card bg-violet\">\n<h3>14) Cleaning &amp; Retention<\/h3>\n<p>Automatically removes obsolete file versions to control storage.<\/p>\n<pre><code class=\"mono\"># Spark writer options: keep the file versions needed by the latest 20 commits\r\n.option(\"hoodie.cleaner.policy\",\"KEEP_LATEST_COMMITS\")\r\n.option(\"hoodie.cleaner.commits.retained\",\"20\")<\/code><\/pre>\n<\/div>\n<div class=\"card bg-rose\">\n<h3>15) Schema Evolution<\/h3>\n<p>Supports add\/drop nullable columns and compatible type changes. 
Keep evolution consistent across producers; validate with a schema registry when possible.<\/p>\n<\/div>\n<p><!-- ===================== SECTION 4 ===================== --><\/p>\n<div class=\"section-title\">Section 4 \u2014 Design, Integrations &amp; Patterns<\/div>\n<div class=\"card bg-blue\">\n<h3>16) Partitioning &amp; File Sizes<\/h3>\n<ul class=\"tight\">\n<li>Prefer hierarchical partitions with manageable cardinality (e.g., dt=YYYY-MM-DD, region).<\/li>\n<li>Tune target file size (e.g., 128\u2013512 MB) for scan efficiency.<\/li>\n<\/ul>\n<\/div>\n<div class=\"card bg-green\">\n<h3>17) Lakehouse Interop<\/h3>\n<p>Hudi competes and interoperates with Iceberg &amp; Delta Lake. Choose based on: <b>upsert latency (Hudi\/MOR)<\/b>, <b>catalog\/ACID features<\/b>, <b>engine support<\/b>, and org standardization.<\/p>\n<\/div>\n<div class=\"card bg-amber\">\n<h3>18) CDC &amp; Incremental ETL<\/h3>\n<p>Ingest source CDC (Debezium\/Kafka) into Hudi; downstream jobs read <b>incrementally<\/b> from the last processed commit instant to avoid full scans.<\/p>\n<\/div>\n<div class=\"card bg-violet\">\n<h3>19) Common Pitfalls<\/h3>\n<ul class=\"tight\">\n<li>Skewed keys \u2192 slow upserts; consider bucket index or repartitioning.<\/li>\n<li>Unmanaged compaction \u2192 growing log files \u2192 slow reads.<\/li>\n<li>Over-partitioning \u2192 too many small files; tune clustering\/file size.<\/li>\n<li>Reading MOR in read-optimized mode when freshness is required; use snapshot queries instead.<\/li>\n<\/ul>\n<\/div>\n<div class=\"card bg-indigo qa\">\n<h3>20) Interview Q&amp;A \u2014 8 Quick Ones<\/h3>\n<p><b>1)<\/b> <i>Why Hudi over plain Parquet on S3?<\/i> Upserts, deletes, incremental queries, and consistent snapshots on a lake.<\/p>\n<p><b>2)<\/b> <i>COW vs MOR?<\/i> COW = simpler, great read perf; MOR = lower write latency with log + compaction.<\/p>\n<p><b>3)<\/b> <i>How does incremental pull work?<\/i> Use commit times on the timeline to read only changed rows.<\/p>\n<p><b>4)<\/b> <i>What is preCombineField?<\/i> Break ties 
among duplicate keys in a batch, keeping the latest version.<\/p>\n<p><b>5)<\/b> <i>How to handle late data?<\/i> Use preCombine timestamp and partition design; MOR helps absorb late arrivals.<\/p>\n<p><b>6)<\/b> <i>Speeding up upserts?<\/i> Proper indexing (bucket\/Bloom), co-locate data, tune parallelism &amp; file sizes.<\/p>\n<p><b>7)<\/b> <i>Compaction vs Clustering?<\/i> Compaction merges logs; clustering rewrites file layout for query efficiency.<\/p>\n<p><b>8)<\/b> <i>Query engines?<\/i> Spark\/Hive\/Presto\/Trino\/Athena support snapshot\/read-optimized modes.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Apache Hudi Pocket Book Upserts on data lakes \u2022 COW vs MOR \u2022 Timeline \u2022 Indexing \u2022 Incremental pulls \u2022 Compaction &amp; clustering \u2022 Spark\/Flink\/Presto\/Trino Section 1 \u2014 Fundamentals 1) <span class=\"readmore\"><a href=\"https:\/\/uplatz.com\/blog\/apache-hudi-pocket-book\/\">Read More &#8230;<\/a><\/span><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2482,2462],"tags":[],"class_list":["post-4490","post","type-post","status-publish","format-standard","hentry","category-apache-hudi","category-pocket-book"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.3 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Apache Hudi Pocket Book | Uplatz Blog<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/uplatz.com\/blog\/apache-hudi-pocket-book\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Apache Hudi Pocket Book | Uplatz Blog\" \/>\n<meta property=\"og:description\" content=\"Apache Hudi 
Pocket Book Upserts on data lakes \u2022 COW vs MOR \u2022 Timeline \u2022 Indexing \u2022 Incremental pulls \u2022 Compaction &amp; clustering \u2022 Spark\/Flink\/Presto\/Trino Section 1 \u2014 Fundamentals 1) Read More ...\" \/>\n<meta property=\"og:url\" content=\"https:\/\/uplatz.com\/blog\/apache-hudi-pocket-book\/\" \/>\n<meta property=\"og:site_name\" content=\"Uplatz Blog\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/Uplatz-1077816825610769\/\" \/>\n<meta property=\"article:published_time\" content=\"2025-08-09T16:57:54+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-08-28T13:10:23+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/uplatz.com\/blog\/wp-content\/uploads\/2025\/08\/Apache-Hudi.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1280\" \/>\n\t<meta property=\"og:image:height\" content=\"720\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"uplatzblog\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@uplatz_global\" \/>\n<meta name=\"twitter:site\" content=\"@uplatz_global\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"uplatzblog\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"3 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/apache-hudi-pocket-book\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/apache-hudi-pocket-book\\\/\"},\"author\":{\"name\":\"uplatzblog\",\"@id\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/#\\\/schema\\\/person\\\/8ecae69a21d0757bdb2f776e67d2645e\"},\"headline\":\"Apache Hudi Pocket Book\",\"datePublished\":\"2025-08-09T16:57:54+00:00\",\"dateModified\":\"2025-08-28T13:10:23+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/apache-hudi-pocket-book\\\/\"},\"wordCount\":571,\"publisher\":{\"@id\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/#organization\"},\"image\":{\"@id\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/apache-hudi-pocket-book\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/wp-content\\\/uploads\\\/2025\\\/08\\\/Apache-Hudi-1024x576.jpg\",\"articleSection\":[\"Apache Hudi\",\"Pocket Book\"],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/apache-hudi-pocket-book\\\/\",\"url\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/apache-hudi-pocket-book\\\/\",\"name\":\"Apache Hudi Pocket Book | Uplatz 
Blog\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/apache-hudi-pocket-book\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/apache-hudi-pocket-book\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/wp-content\\\/uploads\\\/2025\\\/08\\\/Apache-Hudi-1024x576.jpg\",\"datePublished\":\"2025-08-09T16:57:54+00:00\",\"dateModified\":\"2025-08-28T13:10:23+00:00\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/apache-hudi-pocket-book\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/uplatz.com\\\/blog\\\/apache-hudi-pocket-book\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/apache-hudi-pocket-book\\\/#primaryimage\",\"url\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/wp-content\\\/uploads\\\/2025\\\/08\\\/Apache-Hudi.jpg\",\"contentUrl\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/wp-content\\\/uploads\\\/2025\\\/08\\\/Apache-Hudi.jpg\",\"width\":1280,\"height\":720},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/apache-hudi-pocket-book\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Apache Hudi Pocket Book\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/#website\",\"url\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/\",\"name\":\"Uplatz Blog\",\"description\":\"Uplatz is a global IT Training &amp; Consulting 
company\",\"publisher\":{\"@id\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/#organization\",\"name\":\"uplatz.com\",\"url\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/#\\\/schema\\\/logo\\\/image\\\/\",\"url\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/wp-content\\\/uploads\\\/2016\\\/11\\\/Uplatz-Logo-Copy-2.png\",\"contentUrl\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/wp-content\\\/uploads\\\/2016\\\/11\\\/Uplatz-Logo-Copy-2.png\",\"width\":1280,\"height\":800,\"caption\":\"uplatz.com\"},\"image\":{\"@id\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/#\\\/schema\\\/logo\\\/image\\\/\"},\"sameAs\":[\"https:\\\/\\\/www.facebook.com\\\/Uplatz-1077816825610769\\\/\",\"https:\\\/\\\/x.com\\\/uplatz_global\",\"https:\\\/\\\/www.instagram.com\\\/\",\"https:\\\/\\\/www.linkedin.com\\\/company\\\/7956715?trk=tyah&amp;amp;amp;amp;trkInfo=clickedVertical:company,clickedEntityId:7956715,idx:1-1-1,tarId:1464353969447,tas:uplatz\"]},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/uplatz.com\\\/blog\\\/#\\\/schema\\\/person\\\/8ecae69a21d0757bdb2f776e67d2645e\",\"name\":\"uplatzblog\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/7f814c72279199f59ded4418a8653ad15f5f8904ac75e025a4e2abe24d58fa5d?s=96&d=mm&r=g\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/7f814c72279199f59ded4418a8653ad15f5f8904ac75e025a4e2abe24d58fa5d?s=96&d=mm&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/7f814c72279199f59ded4
418a8653ad15f5f8904ac75e025a4e2abe24d58fa5d?s=96&d=mm&r=g\",\"caption\":\"uplatzblog\"}}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Apache Hudi Pocket Book | Uplatz Blog","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/uplatz.com\/blog\/apache-hudi-pocket-book\/","og_locale":"en_US","og_type":"article","og_title":"Apache Hudi Pocket Book | Uplatz Blog","og_description":"Apache Hudi Pocket Book Upserts on data lakes \u2022 COW vs MOR \u2022 Timeline \u2022 Indexing \u2022 Incremental pulls \u2022 Compaction &amp; clustering \u2022 Spark\/Flink\/Presto\/Trino Section 1 \u2014 Fundamentals 1) Read More ...","og_url":"https:\/\/uplatz.com\/blog\/apache-hudi-pocket-book\/","og_site_name":"Uplatz Blog","article_publisher":"https:\/\/www.facebook.com\/Uplatz-1077816825610769\/","article_published_time":"2025-08-09T16:57:54+00:00","article_modified_time":"2025-08-28T13:10:23+00:00","og_image":[{"width":1280,"height":720,"url":"https:\/\/uplatz.com\/blog\/wp-content\/uploads\/2025\/08\/Apache-Hudi.jpg","type":"image\/jpeg"}],"author":"uplatzblog","twitter_card":"summary_large_image","twitter_creator":"@uplatz_global","twitter_site":"@uplatz_global","twitter_misc":{"Written by":"uplatzblog","Est. 
reading time":"3 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/uplatz.com\/blog\/apache-hudi-pocket-book\/#article","isPartOf":{"@id":"https:\/\/uplatz.com\/blog\/apache-hudi-pocket-book\/"},"author":{"name":"uplatzblog","@id":"https:\/\/uplatz.com\/blog\/#\/schema\/person\/8ecae69a21d0757bdb2f776e67d2645e"},"headline":"Apache Hudi Pocket Book","datePublished":"2025-08-09T16:57:54+00:00","dateModified":"2025-08-28T13:10:23+00:00","mainEntityOfPage":{"@id":"https:\/\/uplatz.com\/blog\/apache-hudi-pocket-book\/"},"wordCount":571,"publisher":{"@id":"https:\/\/uplatz.com\/blog\/#organization"},"image":{"@id":"https:\/\/uplatz.com\/blog\/apache-hudi-pocket-book\/#primaryimage"},"thumbnailUrl":"https:\/\/uplatz.com\/blog\/wp-content\/uploads\/2025\/08\/Apache-Hudi-1024x576.jpg","articleSection":["Apache Hudi","Pocket Book"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/uplatz.com\/blog\/apache-hudi-pocket-book\/","url":"https:\/\/uplatz.com\/blog\/apache-hudi-pocket-book\/","name":"Apache Hudi Pocket Book | Uplatz 
Blog","isPartOf":{"@id":"https:\/\/uplatz.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/uplatz.com\/blog\/apache-hudi-pocket-book\/#primaryimage"},"image":{"@id":"https:\/\/uplatz.com\/blog\/apache-hudi-pocket-book\/#primaryimage"},"thumbnailUrl":"https:\/\/uplatz.com\/blog\/wp-content\/uploads\/2025\/08\/Apache-Hudi-1024x576.jpg","datePublished":"2025-08-09T16:57:54+00:00","dateModified":"2025-08-28T13:10:23+00:00","breadcrumb":{"@id":"https:\/\/uplatz.com\/blog\/apache-hudi-pocket-book\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/uplatz.com\/blog\/apache-hudi-pocket-book\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/uplatz.com\/blog\/apache-hudi-pocket-book\/#primaryimage","url":"https:\/\/uplatz.com\/blog\/wp-content\/uploads\/2025\/08\/Apache-Hudi.jpg","contentUrl":"https:\/\/uplatz.com\/blog\/wp-content\/uploads\/2025\/08\/Apache-Hudi.jpg","width":1280,"height":720},{"@type":"BreadcrumbList","@id":"https:\/\/uplatz.com\/blog\/apache-hudi-pocket-book\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/uplatz.com\/blog\/"},{"@type":"ListItem","position":2,"name":"Apache Hudi Pocket Book"}]},{"@type":"WebSite","@id":"https:\/\/uplatz.com\/blog\/#website","url":"https:\/\/uplatz.com\/blog\/","name":"Uplatz Blog","description":"Uplatz is a global IT Training &amp; Consulting 
company","publisher":{"@id":"https:\/\/uplatz.com\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/uplatz.com\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/uplatz.com\/blog\/#organization","name":"uplatz.com","url":"https:\/\/uplatz.com\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/uplatz.com\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/uplatz.com\/blog\/wp-content\/uploads\/2016\/11\/Uplatz-Logo-Copy-2.png","contentUrl":"https:\/\/uplatz.com\/blog\/wp-content\/uploads\/2016\/11\/Uplatz-Logo-Copy-2.png","width":1280,"height":800,"caption":"uplatz.com"},"image":{"@id":"https:\/\/uplatz.com\/blog\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/Uplatz-1077816825610769\/","https:\/\/x.com\/uplatz_global","https:\/\/www.instagram.com\/","https:\/\/www.linkedin.com\/company\/7956715?trk=tyah&amp;amp;amp;amp;trkInfo=clickedVertical:company,clickedEntityId:7956715,idx:1-1-1,tarId:1464353969447,tas:uplatz"]},{"@type":"Person","@id":"https:\/\/uplatz.com\/blog\/#\/schema\/person\/8ecae69a21d0757bdb2f776e67d2645e","name":"uplatzblog","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/7f814c72279199f59ded4418a8653ad15f5f8904ac75e025a4e2abe24d58fa5d?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/7f814c72279199f59ded4418a8653ad15f5f8904ac75e025a4e2abe24d58fa5d?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/7f814c72279199f59ded4418a8653ad15f5f8904ac75e025a4e2abe24d58fa5d?s=96&d=mm&r=g","caption":"uplatzblog"}}]}},"_links":{"self":[{"href":"https:\/\/uplatz.com\/blog\/wp-json\/wp\/v2\/posts\/4490","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/uplatz.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"hr
ef":"https:\/\/uplatz.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/uplatz.com\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/uplatz.com\/blog\/wp-json\/wp\/v2\/comments?post=4490"}],"version-history":[{"count":2,"href":"https:\/\/uplatz.com\/blog\/wp-json\/wp\/v2\/posts\/4490\/revisions"}],"predecessor-version":[{"id":4951,"href":"https:\/\/uplatz.com\/blog\/wp-json\/wp\/v2\/posts\/4490\/revisions\/4951"}],"wp:attachment":[{"href":"https:\/\/uplatz.com\/blog\/wp-json\/wp\/v2\/media?parent=4490"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/uplatz.com\/blog\/wp-json\/wp\/v2\/categories?post=4490"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/uplatz.com\/blog\/wp-json\/wp\/v2\/tags?post=4490"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}