Provide an API for manually clearing the Region Cache in long-running Spark tasks. #2809

Open
lizhenhuan opened this issue Mar 18, 2025 · 0 comments

Enhancement

Users start a long-running Spark process to speed up task processing; it handles approximately one billion data points every ten minutes. During the first 24 hours, each batch of around one billion data points finishes within ten minutes. Over time, however, a large number of cached Regions become invalid, which triggers retries and pushes the task's runtime past ten minutes. We therefore request an API for manually clearing the Region Cache after each ten-minute cycle completes.
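To make the request concrete, here is a minimal sketch of the behaviour being asked for. It is not the project's actual code: `RegionCache`, `locate`, `clear`, and `run_cycle` are hypothetical names used only for illustration, and the cache is modelled as a plain key-to-store mapping. The idea is that a long-running job calls `clear()` at the end of each cycle, so the next cycle looks up fresh Region locations instead of retrying against stale ones.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable


@dataclass
class RegionCache:
    """Hypothetical stand-in for a client-side Region Cache (illustration only)."""

    # Assumed callback that asks the cluster where a key currently lives.
    lookup: Callable[[str], str]
    entries: Dict[str, str] = field(default_factory=dict)
    misses: int = 0

    def locate(self, key: str) -> str:
        # Serve from the cache when possible; otherwise fetch and remember.
        if key not in self.entries:
            self.misses += 1
            self.entries[key] = self.lookup(key)
        return self.entries[key]

    def clear(self) -> None:
        # The requested operation: drop every cached entry at once.
        self.entries.clear()


def run_cycle(cache: RegionCache, keys: Iterable[str]) -> None:
    # One processing cycle: resolve each key, then clear the cache so the
    # next cycle starts from up-to-date Region locations.
    for key in keys:
        cache.locate(key)
    cache.clear()
```

Without the `clear()` call, an entry cached during the first cycle keeps being returned even after the Region has moved, which is the situation that leads to retries in the scenario described above.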
