Add write support for Delta Lake in the Velox backend.
Work items:
Add native write support for Spark 3.5 + Delta 3.3
[VL] Delta: Add Delta Lake write unit test for Spark 3.5 + Delta 3.3 #10802
[GLUTEN-10215][VL] Delta: Native write support for Delta 3.3.1 / Spark 3.5 #10801
(TODO) Add native write support for Spark 4.0 + Delta 4.0
(PR pending) Native statistics tracker to avoid C2R overhead
[GLUTEN-10215][VL] Delta write: Native statistics tracker to eliminate C2R overhead #11419
(TODO, optional) Add native write support for lower versions (Spark 3.4 + Delta 2.4)
Offload DeltaOptimizedWriterExec for optimize options
[GLUTEN-10215][VL] Delta Write: Offload DeltaOptimizedWriterExec #11461
PoC at #10216.
Gaps so far:
Constraint expressions need to be offloaded; otherwise C2R / R2C transitions are added.
Overriding Delta classes in Gluten should be avoided (DeltaParquetFileFormat, for example).
Test code is currently copied from Delta; this practice should be avoided if there is a way to do so.