Spark Lineage Generation Fails with Task Serialization Error #24

Open
tranhan02 opened this issue Feb 10, 2025 · 0 comments

I am testing the Spark lineage feature in OpenMetadata, with MySQL as the database. Simple Spark lineage pipelines work as expected. However, when a job contains more complex transformations such as GROUP BY or window functions, the lineage pipeline fails to generate and I get an error similar to the one below:

Py4JJavaError: An error occurred while calling o156.save.  
: org.apache.spark.SparkException: Job aborted due to stage failure: Task serialization failed: java.lang.StackOverflowError  
java.lang.StackOverflowError  
    at java.base/java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1428)  
    at java.base/java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1179)  
    at java.base/java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1553)  
    at java.base/java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1510)  
    at java.base/java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1433)  
    at java.base/java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1179)  
    ...  

I’m wondering whether there are any specific constraints or best practices that Spark jobs must follow when using Spark lineage in OpenMetadata. I couldn’t find any detailed documentation regarding this.

Has anyone encountered this issue before? Any insights on how to resolve it?
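
For context, the repeating `writeObject0` / `writeOrdinaryObject` / `writeSerialData` frames in the trace are consistent with Java serialization recursing through a deeply nested object graph until the thread stack runs out. The snippet below is only an analogy in plain Python using the standard `pickle` module, not Spark or OpenMetadata code; the helper names `nest` and `can_serialize` are hypothetical and exist just for this illustration. It shows the same failure mode: a shallow structure serializes fine, while a very deep one overflows the serializer's recursion.

```python
import pickle


def nest(depth):
    """Build a list nested `depth` levels deep, iteratively (no recursion)."""
    node = []
    for _ in range(depth):
        node = [node]
    return node


def can_serialize(obj):
    """Return True if pickle can serialize obj, False if its recursion overflows."""
    try:
        pickle.dumps(obj)
        return True
    except RecursionError:
        # Python's counterpart to the java.lang.StackOverflowError in the trace:
        # the serializer descends one level per nested object.
        return False


shallow_ok = can_serialize(nest(10))       # small object graph
deep_ok = can_serialize(nest(100_000))     # very deep object graph
print(shallow_ok, deep_ok)
```

Whether the lineage listener actually makes the serialized task graph deeper in this way is an assumption on my part; I have not confirmed it.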
