[spark][doc] Add Spark batch union read#3142

Open
Yohahaha wants to merge 2 commits into apache:main from Yohahaha:spark-union-read-doc

@Yohahaha
Contributor

Purpose

Linked issue: close #xxx

Brief change log

Tests

API and Format

Documentation

@Yohahaha Yohahaha marked this pull request as ready for review April 20, 2026 13:58
@Yohahaha
Contributor Author

@wuchong @YannByron @luoyuxia please take a look, thank you!

Comment thread website/docs/engine-spark/reads.md Outdated
The union read works for both **log tables** and **primary key tables**:

- **Log tables**: Combines Fluss log data with lake historical data
- **Primary key tables**: Merges the latest Fluss snapshot with log changes and lake history to provide the most up-to-date view

@beryllw beryllw Apr 22, 2026

Combines lake snapshot data with recent KV log changes using sort-merge to provide the most up-to-date view

The phrase "latest Fluss snapshot" may cause confusion, as Fluss has its own internal snapshot concept (used for KV compaction).
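
For readers unfamiliar with the approach, the sort-merge wording suggested above can be illustrated with a small Python model. This is only a sketch under stated assumptions, not Fluss's actual implementation: the function names `latest_per_key` and `sort_merge` are hypothetical, rows are modeled as `(key, value)` tuples, and a `None` value is assumed to stand for a delete.

```python
from typing import List, Optional, Tuple

# Hypothetical row model: (primary key, value). A None value marks a delete.
Row = Tuple[int, Optional[str]]


def latest_per_key(changes: List[Row]) -> List[Row]:
    """Collapse log changes (given in offset order) to the last change per key,
    returned sorted by key."""
    latest = {}
    for key, value in changes:
        latest[key] = value  # a later change overrides an earlier one
    return sorted(latest.items())


def sort_merge(lake_rows: List[Row], log_rows: List[Row]) -> List[Row]:
    """Merge key-sorted lake rows with key-sorted log rows.
    When both sides hold the same key, the log row wins."""
    merged: List[Row] = []
    i = j = 0
    while i < len(lake_rows) and j < len(log_rows):
        lake_key, lake_value = lake_rows[i]
        log_key, log_value = log_rows[j]
        if lake_key < log_key:
            # Key exists only in the lake snapshot.
            merged.append((lake_key, lake_value))
            i += 1
        elif lake_key > log_key:
            # Key exists only in the log; skip it if it is a delete.
            if log_value is not None:
                merged.append((log_key, log_value))
            j += 1
        else:
            # Same key on both sides: the newer log change replaces the lake row.
            if log_value is not None:
                merged.append((log_key, log_value))
            i += 1
            j += 1
    merged.extend(lake_rows[i:])
    merged.extend((k, v) for k, v in log_rows[j:] if v is not None)
    return merged


lake = [(1, "a"), (2, "b"), (3, "c")]
changes = [(2, "b2"), (4, "d"), (3, None), (4, "d2")]
result = sort_merge(lake, latest_per_key(changes))
```

In this toy input, key 2 is updated, key 3 is deleted, and key 4 is inserted and then updated, so the merged view contains keys 1, 2, and 4 with their most recent values.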

```sql
-- Returns complete view combining Fluss and lake data
SELECT * FROM fluss_order_with_lake ORDER BY order_key;
```

Maybe we could add a note:

Union read requires `scan.startup.mode = full` (default). Non-FULL modes (e.g., `earliest`, `latest`) bypass the lake path and read only from Fluss.

Contributor Author

`scan.startup.mode` is not actually used in batch read; I will fix the related code in another PR.

@beryllw
Contributor

beryllw commented Apr 22, 2026

Thanks for the PR! Overall LGTM, with a few minor comments.
