Skip to content

Hibernate batch processing #35

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

NathanQingyangXu
Copy link
Contributor

@NathanQingyangXu NathanQingyangXu commented Feb 4, 2025

https://jira.mongodb.org/browse/HIBERNATE-35

Hibernate uses JDBC batch feature only for PreparedStatement, so the code logic changes are centralized in MongoPreparedStatement, not in MongoStatement. On top of JDBC's batch feature, Hibernate builds some higher abstraction (see BatchImpl class: https://github.com/hibernate/hibernate-orm/blob/main/hibernate-core/src/main/java/org/hibernate/engine/jdbc/batch/internal/BatchImpl.java), so we need to code our logic in the context of Hibernate's usage.

Internally bulk writing API methods are used as implementation. ReplaceOne write model is not added for it is out of scope of milestone1.

It is a pity there is no possibility to showcase the batch processing using high-level Hibernate's CRUD processing context.
I tested it on top of our first CRUD PR and confirmed that it works as expected (by checking logs and explore in debugging mode).

The pattern of batch processing in PreparedStatement is as follows:

for (var i = 0;  i < batch_size; i++) {
    pstmt.setXXX(1, ...);
    pstmt.setXXX(2, ...);
    ... ...
    pstmt.addBatch();
    pstmt.executeBatch();
}

Only when PreparedStatement is to be released (but not closed), should the clearBatch() method be invoked. As the pattern above shows, the internal statement batch should have been emptied after executeBatch() invocation (Hibernate would guarantee there is no dangling last batch without executing), so in theory there is not much to be done in the clearBatch() method (its name might be misleading into thinking that it is in charge of emptying the batch storage internally).

throw new ConfigurationException("Could not determine how to handle configuration value [name=" + name
+ ", value=" + value + "(" + value.getClass().getName() + ")] as int");
}
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the above util class was copied from Hibernate's internal class.

@NathanQingyangXu NathanQingyangXu force-pushed the HIBERNATE-35-batch-processing branch from b0b4bc5 to f643b35 Compare February 4, 2025 14:02
@jyemin jyemin removed request for jyemin and stIncMale February 4, 2025 14:06
@NathanQingyangXu NathanQingyangXu marked this pull request as ready for review February 4, 2025 14:06
Copy link
Collaborator

@jyemin jyemin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initial comments

var jdbcUrl = configValues.get(JAKARTA_JDBC_URL);
public void configure(Map<String, Object> configurationValues) {

batchSize = ConfigurationHelper.getInt(AvailableSettings.STATEMENT_BATCH_SIZE, configurationValues, 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This approach conflicts with the one proposed in #31 (there, we would have a new setting in MongoDialectSettings). Let's deal with the configuration approach first.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we want to centralize all Hibernate related configs in one single class? That is ambitous. This PR would be merged after yours for sure.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the latest commit has removed the usage of batchable and addBatch() method alone will decide. For that reason, no configuration is needed at all in this PR. I removed the copied ConfigurationHelper as well.

this(mongoClient, clientSession, mongoConnection, mql, false);
}

public MongoPreparedStatement(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MongoPreparedStatement is package access, there is no point in introducing a public constructor.

I have just noticed, the same applies to:

  • the existing MongoPreparedStatement constructor
  • the existing MongoStatement constructor

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch! changed.


/**
* MongoDB Dialect's JDBC {@link java.sql.PreparedStatement} implementation class.
*
* <p>It only focuses on API methods MongoDB Dialect will support. All the other methods are implemented by throwing
* exceptions in its parent {@link PreparedStatementAdapter adapter interface}.
*/
final class MongoPreparedStatement extends MongoStatement implements PreparedStatementAdapter {
class MongoPreparedStatement extends MongoStatement implements PreparedStatementAdapter {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the reason behind making this class extensible in this PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My first iteration is to separate batch related logic into a child class called MongoBatchablePreparedStatement, later on I referred to other JDBC driver implementation and gave up the idea; then I forgot to add back final. Changed.

@NathanQingyangXu NathanQingyangXu force-pushed the HIBERNATE-35-batch-processing branch from 1cc8940 to 691777f Compare February 7, 2025 14:51
@NathanQingyangXu
Copy link
Contributor Author

@jyemin given the timeout concern will be dealt with in #40 (which will centralize the updating in one place, regardless of bulk updating or not), could we continue reviewing this PR? I think the two PRs are tightly related (which one is merged first requires updating of the second one), but I think the necessary refactoring should be straightforward.

@jyemin
Copy link
Collaborator

jyemin commented Feb 18, 2025

Yes, we can continue. Did you mean to re-request reviews?

@NathanQingyangXu
Copy link
Contributor Author

Yes, we can continue. Did you mean to re-request reviews?

Yeah, I clicked the re-request button then.

jyemin
jyemin previously approved these changes Feb 18, 2025
# Conflicts:
#	src/main/java/com/mongodb/hibernate/jdbc/MongoConnectionProvider.java
#	src/main/java/com/mongodb/hibernate/jdbc/MongoPreparedStatement.java
#	src/test/java/com/mongodb/hibernate/jdbc/MongoConnectionTests.java
#	src/test/java/com/mongodb/hibernate/jdbc/MongoPreparedStatementTests.java
#	src/test/resources/hibernate.properties
Copy link
Member

@stIncMale stIncMale left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[I have not reviewed this PR]

We still have TODO-HIBERNATE-35 comments in the code, but all of them are supposed to be addressed as part of https://jira.mongodb.org/browse/HIBERNATE-35, which this PR implements. What is the plan for those?

I overlooked. Those references should have been deleted for Hibernate only supports batch processing in PreparedStatement, not Statement.

command4 -> assertThat(command4.getFirstKey()).isEqualTo("commitTransaction"));
}

// TODO-HIBERNATE-19 https://jira.mongodb.org/browse/HIBERNATE-19
Copy link
Member

@stIncMale stIncMale Mar 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR adds a TODO-HIBERNATE-19 note, but #59 for https://jira.mongodb.org/browse/HIBERNATE-19 has already been created. Now #59 (HIBERNATE-19) depends on the current PR (#35 (HIBERNATE-35)), yet they are being worked on concurrently.

If we proceed as is, it is not impossible that we'll merge #59 (HIBERNATE-19) first, then the currernt PR (#35 (HIBERNATE-35)), and the TODO-HIBERNATE-19 added by the current PR will stay in main despite HIBERNATE-19 having been resolved. This scenario is actually very likely, given this comment.

What is our plan on making sure the above is not going to happen?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems we can avoid such deadlock by disabling one PR for the time being until the other one got merged?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disabled this PR for the time being so we could focus on #59.

Copy link
Member

@stIncMale stIncMale Mar 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I worry that when we create another PR for HIBERNATE-35, it will be based on this one (I don't think you'll just throw away these changes and start from scratch in the future), and the TODO-HIBERNATE-19 will slip there, despite HIBERNATE-19 having been closed. Can we replace this TODO-HIBERNATE-19 comment with a TODO-HIBERNATE-35 comment explaining what needs to be done, and that it should be done after implementing HIBERNATE-19.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did it. good point.

@NathanQingyangXu
Copy link
Contributor Author

closed to avoid conflict with #59. Will open it when the dependency PR has been merged.

@NathanQingyangXu
Copy link
Contributor Author

This PR is re-opened for the "basic update translation" is merged so a complete integration testing suite is possible.
@jyemin , the only new thing is the new integration testing case (including 'insertion', 'deletion' and 'update') at https://github.com/mongodb/mongo-hibernate/pull/35/files#diff-208727884b6fd50b4fce3475acee3115f5a5d880432232cf412e3a37ee48a235

@NathanQingyangXu
Copy link
Contributor Author

close this PR for it might be out of scope of M1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants