New crate: vortex-clickhouse (ClickHouse integration via C FFI) #6420
+14,732
−0
Does this PR close an open issue or discussion?
No existing issue — this introduces a new integration crate.
What changes are included in this PR?
Add a new vortex-clickhouse crate that enables ClickHouse to natively read and write Vortex files through FORMAT Vortex. The crate compiles to a C static library (staticlib) and exposes an opaque-handle-based C FFI that ClickHouse links against via its IInputFormat / IOutputFormat framework.
Crate structure:
Supported ClickHouse types
Int8–Int256, UInt8–UInt256, Float32/64, Decimal32/64/128/256, String, FixedString(N), Bool, Date, Date32, DateTime, DateTime64, Array(T), Tuple(...), Map(K,V), Nullable(T), LowCardinality(T), Enum8/16, IPv4, IPv6, UUID, and Geo types.
Types without native Vortex equivalents are modeled as Vortex extension types with custom metadata, enabling lossless round-trip through the file format.
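As an illustration of the extension-type idea, here is a minimal sketch of how a ClickHouse-only type could be carried as an identifier plus metadata bytes and recovered on read. Everything in it is an assumption made for the example: the `ChExtType` enum, the `clickhouse.*` identifiers, the byte layout, and the choice of which types need an extension are hypothetical and are not taken from the crate or from the Vortex API.

```rust
use std::convert::TryFrom;

/// Hypothetical set of ClickHouse types modeled as extension types.
/// Which types actually need an extension is an assumption here.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ChExtType {
    FixedString(u32),
    DateTime64 { precision: u8 },
    Ipv4,
    Uuid,
}

impl ChExtType {
    /// Hypothetical extension identifier stored alongside the column.
    pub fn ext_id(&self) -> &'static str {
        match self {
            ChExtType::FixedString(_) => "clickhouse.fixed_string",
            ChExtType::DateTime64 { .. } => "clickhouse.datetime64",
            ChExtType::Ipv4 => "clickhouse.ipv4",
            ChExtType::Uuid => "clickhouse.uuid",
        }
    }

    /// Serializes the type parameters that the storage type cannot express.
    pub fn metadata(&self) -> Vec<u8> {
        match self {
            ChExtType::FixedString(n) => n.to_le_bytes().to_vec(),
            ChExtType::DateTime64 { precision } => vec![*precision],
            ChExtType::Ipv4 | ChExtType::Uuid => Vec::new(),
        }
    }

    /// Rebuilds the ClickHouse type from identifier and metadata.
    /// Returns `None` for unknown identifiers or malformed metadata.
    pub fn from_parts(id: &str, meta: &[u8]) -> Option<Self> {
        match (id, meta) {
            ("clickhouse.fixed_string", m) => {
                let bytes = <[u8; 4]>::try_from(m).ok()?;
                Some(ChExtType::FixedString(u32::from_le_bytes(bytes)))
            }
            ("clickhouse.datetime64", [p]) => Some(ChExtType::DateTime64 { precision: *p }),
            ("clickhouse.ipv4", []) => Some(ChExtType::Ipv4),
            ("clickhouse.uuid", []) => Some(ChExtType::Uuid),
            _ => None,
        }
    }
}
```

The round-trip property is that `from_parts(t.ext_id(), &t.metadata())` yields `Some(t)` for every variant, so a type parameter such as the `N` in FixedString(N) survives a write followed by a read.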
What is the rationale for this change?
ClickHouse is one of the most widely deployed analytical databases. Adding native Vortex format support allows ClickHouse users to directly query and produce Vortex files, benefiting from Vortex's adaptive encoding and compression without requiring format conversion pipelines.
The C FFI approach was chosen because:
ClickHouse's format system requires implementing C++ interfaces (IInputFormat, IOutputFormat), so a thin C++ shim calling into Rust via FFI is the natural integration point.
This follows the same pattern as other Rust integrations already in ClickHouse (e.g., BLAKE3, skim).
Opaque handles with _new/_free pairs provide a safe, simple ownership model across the language boundary.
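The opaque-handle ownership model can be sketched roughly as follows. The names `DemoScanner`, `demo_scanner_new`, `demo_scanner_path_len`, and `demo_scanner_free` are hypothetical and chosen only for this example; the crate's actual exported symbols are not shown in this description. A real staticlib export would also carry a `no_mangle` attribute so the C++ shim can link by symbol name; it is omitted here to keep the sketch independent of the Rust edition.

```rust
use std::ffi::{c_char, CStr};
use std::ptr;

/// Hypothetical Rust-side state. C++ only ever sees a pointer to it
/// and never inspects its layout.
pub struct DemoScanner {
    path: String,
}

/// Allocates a scanner and hands ownership to the caller as a raw pointer.
/// Returns null if `path` is null or not valid UTF-8.
pub unsafe extern "C" fn demo_scanner_new(path: *const c_char) -> *mut DemoScanner {
    if path.is_null() {
        return ptr::null_mut();
    }
    // SAFETY: the caller promises `path` points to a NUL-terminated string.
    let path = match unsafe { CStr::from_ptr(path) }.to_str() {
        Ok(s) => s.to_owned(),
        Err(_) => return ptr::null_mut(),
    };
    Box::into_raw(Box::new(DemoScanner { path }))
}

/// Borrows the handle without taking ownership. Returns 0 for a null handle.
pub unsafe extern "C" fn demo_scanner_path_len(scanner: *const DemoScanner) -> usize {
    // SAFETY: the caller promises the handle is null or came from `demo_scanner_new`.
    unsafe { scanner.as_ref() }.map_or(0, |s| s.path.len())
}

/// Takes ownership back and drops the scanner. Null is accepted as a no-op.
/// Each handle must be freed exactly once.
pub unsafe extern "C" fn demo_scanner_free(scanner: *mut DemoScanner) {
    if !scanner.is_null() {
        // SAFETY: the pointer was produced by `Box::into_raw` in `demo_scanner_new`.
        drop(unsafe { Box::from_raw(scanner) });
    }
}
```

The design point is that allocation and deallocation both happen on the Rust side: `Box::into_raw` transfers ownership out through `_new`, and `Box::from_raw` reclaims it in `_free`, so the C++ caller never needs to know the struct layout or which allocator was used.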
How is this change tested?
225 unit tests covering:
Bidirectional type conversion for all supported ClickHouse types (primitives, strings, decimals, nested, nullable, extension types)
Column data construction and round-trip via VortexColumnBuilder
Extension type registration and metadata serialization
Scanner and writer FFI interface contracts
End-to-end file read/write cycle (e2e_test.rs)
All tests pass: cargo test -p vortex-clickhouse → 225 passed, 0 failed.
Are there any user-facing changes?
No breaking changes to existing APIs. This is a new, additive crate (vortex-clickhouse) with publish = false. It adds the crate to the workspace members and [workspace.dependencies] in the root Cargo.toml.