Write lineage metadata to centralized file on hive #856

zacaydcloudera · 2025-01-28T09:44:40Z

Hi @wajda,
I used this config with HDFS on Cloudera:
spark.jars=hdfs:///tmp/spark-2.4-spline-agent-bundle_2.11-2.2.1.jar
spark.sql.queryExecutionListeners=za.co.absa.spline.harvester.listener.SplineQueryExecutionListener
spark.spline.mode=ENABLED
spark.spline.lineageDispatcher=hdfs
spark.spline.lineageDispatcher.hdfs.outputDir=hdfs:///tmp/spline/lineage/
spark.spline.lineageDispatcher.hdfs.fileNamePrefix=lineage_
spark.spline.lineageDispatcher.hdfs.fileBufferSize=4096
spark.spline.lineageDispatcher.hdfs.filePermissions=777
spark.driver.memory=4g

But it wrote the lineage to the script's target file location and seems to ignore spark.spline.lineageDispatcher.hdfs.outputDir=hdfs:///tmp/spline/lineage/
Is there a way to write the lineage to a centralized location and also to name each JSON file after its execution_plan_id?
Thanks
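
If the dispatcher keeps writing next to each target, one possible workaround is to collect the lineage files afterwards. The following is only a minimal sketch under stated assumptions, not something taken from the Spline documentation: it assumes the lineage file is named `_LINEAGE`, that it contains JSON with an `executionPlan.id` field, and that the output tree is reachable as an ordinary filesystem path (for example through a mounted HDFS gateway). The function name `centralize_lineage` and its parameters are hypothetical.

```python
import json
from pathlib import Path


def centralize_lineage(source_root, central_dir, prefix="lineage_"):
    """Copy every lineage file found under source_root into central_dir,
    naming each copy after the execution plan id stored inside it."""
    central = Path(central_dir)
    central.mkdir(parents=True, exist_ok=True)
    written = []
    # Assumption: the dispatcher names its output file "_LINEAGE".
    for lineage_file in sorted(Path(source_root).rglob("_LINEAGE")):
        payload = json.loads(lineage_file.read_text(encoding="utf-8"))
        # Assumption: the payload looks like {"executionPlan": {"id": "..."}, ...}.
        plan_id = payload.get("executionPlan", {}).get("id")
        if plan_id is None:
            continue  # skip files that do not match the assumed layout
        target = central / f"{prefix}{plan_id}.json"
        target.write_text(json.dumps(payload), encoding="utf-8")
        written.append(target)
    return written
```

Calling `centralize_lineage("/data/warehouse", "/data/spline/lineage")` (both paths are placeholders) would leave one `lineage_<execution plan id>.json` file per discovered lineage file in the central directory.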

cerveada added this to Spline Jan 28, 2025

github-project-automation bot moved this to New in Spline Jan 28, 2025
