This is a public repository for Apache Spark™ 4.0.0, Spark Connect, and other Spark-related examples, including Spark Declarative Pipelines, for testing common Python packages. Partial code snippets for testing were generated using a combination of Cursor, Goose, ChatGPT, CodePilot, Learning Spark 2nd Ed, and PySpark documentation and tutorial examples.
Spark Connect Documentation
Spark Connect Technical Talks
- Python with Spark Connect
- Spark Connect: Apache Spark 3.4 & Beyond
- Use Spark from anywhere - A Spark client in Python powered by Spark Connect
Spark Data Source Talks
- Breaking Barriers: Building Custom Spark 4.0 Data Connectors with Python
- Bridging Big Data and AI: Empowering PySpark With Lance Format for Multi-Modal AI Data Pipelines
- Apache Spark™ 4.0 for Data Engineering
Cheers,
Jules