Introduce CachedSupplier for BasePersistence objects #1765
base: main
cc @dimas-b (as you are looking at a similar issue at #1758), @eric-maynard, @collado-mike edit: sorry, wrong PR number
@adnanhemani thanks for bringing my attention to this PR.
Hmm, I looked at your code snippets but I don't see the connection between the
private void initializeForRealm(
    RealmContext realmContext, RootCredentialsSet rootCredentialsSet, boolean isBootstrap) {
  String realmId = realmContext.getRealmIdentifier(); // resolve realm ID eagerly
  DatasourceOperations databaseOperations = getDatasourceOperations(isBootstrap);
  sessionSupplierMap.put(
      realmId,
      () ->
          new JdbcBasePersistenceImpl(
              databaseOperations,
              secretsGenerator(() -> realmId, rootCredentialsSet),
              storageIntegrationProvider,
              realmId));
  PolarisMetaStoreManager metaStoreManager = createNewMetaStoreManager();
  metaStoreManagerMap.put(realmId, metaStoreManager);
}
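To make the effect of resolving the realm ID eagerly concrete, here is a minimal, self-contained sketch. It is not Polaris code: `RequestBoundRealmContext`, `lazySupplier`, and `eagerSupplier` are hypothetical stand-ins, and the "request scope" is simulated with a plain boolean flag rather than a real Quarkus scope.

```java
import java.util.function.Supplier;

public class EagerRealmIdSketch {
  /** Hypothetical stand-in for a request-scoped RealmContext bean. */
  static final class RequestBoundRealmContext {
    private boolean requestActive = true;

    void endRequest() {
      requestActive = false;
    }

    String getRealmIdentifier() {
      if (!requestActive) {
        throw new IllegalStateException("request scope is no longer active");
      }
      return "realm-1";
    }
  }

  /** Lazy variant: the lambda captures the context and touches it on every get(). */
  static Supplier<String> lazySupplier(RequestBoundRealmContext ctx) {
    return () -> "persistence-for-" + ctx.getRealmIdentifier();
  }

  /** Eager variant: the realm ID is copied into a String before the lambda is built. */
  static Supplier<String> eagerSupplier(RequestBoundRealmContext ctx) {
    String realmId = ctx.getRealmIdentifier();
    return () -> "persistence-for-" + realmId;
  }

  public static void main(String[] args) {
    RequestBoundRealmContext ctx = new RequestBoundRealmContext();
    Supplier<String> lazy = lazySupplier(ctx);
    Supplier<String> eager = eagerSupplier(ctx);

    ctx.endRequest(); // simulate a task running after the request has completed

    System.out.println(eager.get()); // works: the lambda holds no reference to ctx
    try {
      lazy.get();
    } catch (IllegalStateException e) {
      System.out.println("lazy supplier failed: " + e.getMessage());
    }
  }
}
```

In this sketch the eager variant keeps working after the simulated request ends because the lambda only closes over a `String`, while the lazy variant fails as soon as it dereferences the expired context.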
@adutra thanks for taking a look :)
The connection is that the TokenBroker bean is RequestScoped and it does create a BasePersistence Supplier object as part of the bean initialization using the
Yes, this was my original idea - but it was hard for me to construct a test case for this type of fix. Maybe this is something you've had more experience with - but using a request-scoped
As a result, I'm promoting the CachedSupplier as our preferred way to solve this issue instead. But I'm not heavily tied to this approach if we have a better way to test the fix that you suggested.
I still don't see any
@adnanhemani as it stands, this PR is imo not mergeable: it has no clear error description, no stack trace that we can investigate, no reproducer, and no test case (
@adutra - I've reproduced the issue on a branch in my fork: https://github.com/adnanhemani/polaris/tree/ahemani/show_failure_1765
You can read the full diff here, but I made a really simple case that creates a task when you create a catalog. The task only tries to get the
Steps to reproduce the error using the code linked above:
You can then apply this PR on top of that code, retry these steps, and see that the issue no longer occurs.
More on how the TokenBroker creates the poisoned cache:
And that call is where the
Again, your suggestion above to change this behavior by materializing the
I came across an interesting bug yesterday that we need to fix to ensure that tasks can use the BasePersistence object, as they run outside of user call contexts.
What I was trying to do:
metaStoreManagerFactory.getOrCreateSessionSupplier(CallContext.getCurrentContext().getRealmContext()).get();
.get() call:
When digging deeper into why this is happening, I realized that due to the Supplier's lazy-loading at https://github.com/apache/polaris/blob/main/extension/persistence/relational-jdbc/src/main/java/org/apache/polaris/extension/persistence/relational/jdbc/JdbcMetaStoreManagerFactory.java#L100-L105, the .get() was actually using a RequestScoped realmContext bean given by the previously run TokenBroker initialization (which is a RequestScoped object here: https://github.com/apache/polaris/blob/main/quarkus/service/src/main/java/org/apache/polaris/service/quarkus/config/QuarkusProducers.java#L290-L299). Given this is a relatively new addition, this may be why we haven't seen this bug previously.
As Tasks run asynchronously, likely after the original request has already completed, this error actually makes sense - we should not be able to use a request-scoped bean inside of a Task execution. But on closer inspection, we do not actually need realmContext for anything other than resolving the realmIdentifier once during the BasePersistence object initialization - as a result, we can cache the BasePersistence object using a supplier that caches the original result instead of constantly making new objects. This will also solve our issue, as the original request-scoped RealmContext bean will not be used again during the Task's call to get a BasePersistence object.
I've added a test case that shows the difference between the out-of-the-box supplier and my ideal way to solve this problem using a CachedSupplier. If there is significant concern that we cannot cache the BasePersistence object, we can materialize the RealmContext object prior to the supplier so that at a minimum the RequestScoped RealmContext object is not being used - but I'm not sure if there's an easy way to test this, given that the MetastoreFactories are Quarkus ApplicationScoped objects.
Please note, this is an issue in both EclipseLink and JDBC, as they have almost identical code paths here.
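For readers unfamiliar with the pattern, a caching (memoizing) supplier can be sketched as below. This is an illustrative sketch only, not necessarily the CachedSupplier added in this PR: the class body, the double-checked locking, and the `String` stand-in for a BasePersistence object are all assumptions, and it presumes the first `get()` happens while the request is still active.

```java
import java.util.Objects;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class CachedSupplierSketch {
  /** Memoizing wrapper: invokes the delegate at most once, then returns the stored value. */
  static final class CachedSupplier<T> implements Supplier<T> {
    private final Supplier<T> delegate;
    private volatile T value; // null until the first successful get()

    CachedSupplier(Supplier<T> delegate) {
      this.delegate = Objects.requireNonNull(delegate);
    }

    @Override
    public T get() {
      T result = value;
      if (result == null) {
        synchronized (this) {
          result = value;
          if (result == null) {
            // Assumption: the delegate never legitimately returns null.
            result = Objects.requireNonNull(delegate.get(), "delegate returned null");
            value = result;
          }
        }
      }
      return result;
    }
  }

  public static void main(String[] args) {
    AtomicBoolean requestActive = new AtomicBoolean(true); // simulated request scope
    AtomicInteger creations = new AtomicInteger();

    // Stand-in for the factory lambda: it only works while the request is active.
    Supplier<String> raw =
        () -> {
          if (!requestActive.get()) {
            throw new IllegalStateException("request scope is no longer active");
          }
          return "persistence-" + creations.incrementAndGet();
        };
    Supplier<String> cached = new CachedSupplier<>(raw);

    String first = cached.get(); // computed while the request is active
    requestActive.set(false); // the request ends; a task runs later
    String second = cached.get(); // served from the cache, delegate not invoked again

    System.out.println(first.equals(second) + " creations=" + creations.get());
    // prints: true creations=1
  }
}
```

The trade-off is the one raised above: every caller now shares a single instance, so this only fits if the cached object is safe to reuse across calls.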
Many thanks to @singhpk234 for being my debugging rubber ducky :)