Limit/Offset/Fetch MQL translation #94

Merged: 60 commits, Jul 2, 2025

@NathanQingyangXu NathanQingyangXu commented May 8, 2025

https://jira.mongodb.org/browse/HIBERNATE-70

See the Hibernate user guide: https://docs.jboss.org/hibernate/orm/6.6/userguide/html_single/Hibernate_User_Guide.html#hql-limit-offset

Relevant HQL BNF grammar is at https://github.com/hibernate/hibernate-orm/blob/main/hibernate-core/src/main/antlr/org/hibernate/grammars/hql/HqlParser.g4#L180:

queryOrder
	: orderByClause limitClause? offsetClause? fetchClause?
	;

The grammar of the three clauses is at https://github.com/hibernate/hibernate-orm/blob/main/hibernate-core/src/main/antlr/org/hibernate/grammars/hql/HqlParser.g4#L591.

This is a natural follow-up to the sorting ticket at https://jira.mongodb.org/browse/HIBERNATE-68: as the grammar above shows, this ticket requires orderByClause as a precondition.

Basically, fetchClause is the verbose version of limitClause, and the two can't be used at the same time. When fetchClause is used, it must come after offsetClause.

What is really complex is that Hibernate supports two ways to specify limit/offset. The above is the HQL syntax we are familiar with, but there is another way to pass the same configuration, through QueryOptions. For instance:

var selectionQuery = session.createSelectionQuery("from Book order by id", Book.class).setFirstRow(10).setMaxRows(50);

The setFirstRow() and setMaxRows() calls above end up as a Limit field in QueryOptions (containing firstRow and maxRows fields that mirror the query setters).

The important technical detail is that the configuration passed through QueryOptions overshadows its HQL counterparts.
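
To make the precedence concrete, here is a minimal sketch in plain Java. It is not the Hibernate or extension code: Limit, Effective and resolve are hypothetical stand-ins, and the sketch assumes that a non-empty QueryOptions Limit on the root query replaces the HQL clauses as a whole, which is what the isRoot() / isEmpty() check in the translator code under review suggests.

```java
// Hypothetical, simplified model of the precedence rule; these are NOT the
// Hibernate or mongo-hibernate classes.
final class LimitPrecedenceSketch {

    /** Stand-in for the Limit held by QueryOptions; a null field means "not set". */
    record Limit(Integer firstRow, Integer maxRows) {
        boolean isEmpty() {
            return firstRow == null && maxRows == null;
        }
    }

    /** The skip/limit pair that would actually be applied to the query. */
    record Effective(Integer skip, Integer limit) {}

    static Effective resolve(boolean isRoot, Limit optionsLimit, Integer hqlOffset, Integer hqlLimit) {
        // Assumption: a non-empty Limit on the root query replaces the HQL clauses as a whole.
        if (isRoot && optionsLimit != null && !optionsLimit.isEmpty()) {
            return new Effective(optionsLimit.firstRow(), optionsLimit.maxRows());
        }
        // Otherwise the values parsed from the HQL limit/offset/fetch clauses apply.
        return new Effective(hqlOffset, hqlLimit);
    }

    public static void main(String[] args) {
        // HQL asks for offset 5 / limit 20, QueryOptions sets firstRow = 10: QueryOptions wins.
        System.out.println(resolve(true, new Limit(10, null), 5, 20));
        // No QueryOptions Limit: the HQL clauses apply.
        System.out.println(resolve(true, null, 5, 20));
    }
}
```

Note that in this model a QueryOptions Limit carrying only firstRow also drops the HQL limit; whether the real translator behaves exactly this way should be confirmed against the integration tests.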

Why does Hibernate provide two ways to configure something as seemingly simple as limit/offset? It boils down to some special SQL dialects:

  1. Some SQL dialects have a native way to specify them in SQL (some even put them at the beginning of the statement rather than at the end);
  2. Some SQL dialects have no native SQL support, so the JDBC API has to be relied upon (e.g. ResultSet#absolute(int), PreparedStatement#setMaxRows(int)).

For that reason, Hibernate has a dialect-customized LimitHandler to take care of the special logic above.

The MongoDB dialect has friendly support through the $skip and $limit aggregation stages; however, we still need to support both ways above (and their precedence). That is the gist of this PR.
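
As a rough illustration of the target shape, the sketch below assembles an aggregation pipeline as a plain JSON string, with $skip placed before $limit after the $sort stage. The class and method names are hypothetical, the stage order is an assumption rather than a statement about the MQL this PR emits, and the real translator works on AST nodes, not strings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only; the real translator builds AST stages.
final class SkipLimitPipelineSketch {

    /** Renders [$sort, optional $skip, optional $limit] as a JSON-like pipeline string. */
    static String pipeline(String sortField, Integer skip, Integer limit) {
        List<String> stages = new ArrayList<>();
        stages.add("{\"$sort\": {\"" + sortField + "\": 1}}");
        if (skip != null) {
            // $skip goes first so that $limit counts rows after the offset.
            stages.add("{\"$skip\": " + skip + "}");
        }
        if (limit != null) {
            stages.add("{\"$limit\": " + limit + "}");
        }
        return "[" + String.join(", ", stages) + "]";
    }

    public static void main(String[] args) {
        // Roughly what "order by id" with an offset of 10 and a limit of 50 could map to.
        System.out.println(pipeline("_id", 10, 50));
    }
}
```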

One important technical complexity is that Hibernate tries its best to tap into the query plan cache to avoid unnecessary duplicate SQL translation. If only the HQL way is used, there is nothing new. But Hibernate also has to ensure that the following two QueryOptions usages end up with a query plan cache hit (so the second query reuses the SQL translated for the first one):

  • session.createSelectionQuery("from Book order by id", Book.class).setFirstRow(5)
  • session.createSelectionQuery("from Book order by id", Book.class).setFirstRow(10)

Hibernate has to include the Limit info from QueryOptions in its cache logic. The trick is to make the firstRow Limit above end up as a JdbcParameter. However, the logic involved is convoluted and nontrivial to figure out completely.

In this PR, I tried to reuse the existing default logic in AbstractSqlAstTranslator to incorporate QueryOptions's Limit into cache management, with good integration test cases to prove it. See further details in the relevant code-specific comments.
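
The caching idea can be sketched independently of Hibernate. In the toy model below the cache key records only whether firstRow and maxRows are present, never their values, so queries that differ only in the offset value share one translation. Key, planFor and the plan text are hypothetical; Hibernate's actual cache key and compatibility check are more involved.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of a plan cache whose key ignores the concrete limit values.
final class LimitPlanCacheSketch {

    /** Hypothetical cache key: the query text plus the *shape* of the Limit. */
    record Key(String hql, boolean hasFirstRow, boolean hasMaxRows) {}

    private final Map<Key, String> cache = new HashMap<>();
    private int translations;

    /** Returns the cached translation, translating only on a cache miss. */
    String planFor(String hql, Integer firstRow, Integer maxRows) {
        Key key = new Key(hql, firstRow != null, maxRows != null);
        String plan = cache.get(key);
        if (plan == null) {
            translations++;
            plan = translate(key);
            cache.put(key, plan);
        }
        return plan;
    }

    int translations() {
        return translations;
    }

    /** The translation contains placeholders ("?") instead of the limit values. */
    private static String translate(Key key) {
        return "plan(" + key.hql() + ")"
                + (key.hasFirstRow() ? " skip=?" : "")
                + (key.hasMaxRows() ? " limit=?" : "");
    }

    public static void main(String[] args) {
        LimitPlanCacheSketch cache = new LimitPlanCacheSketch();
        cache.planFor("from Book order by id", 5, null);
        cache.planFor("from Book order by id", 10, null); // same shape: cache hit
        System.out.println("translations = " + cache.translations());
    }
}
```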

NathanQingyangXu and others added 21 commits May 5, 2025 10:51
# Conflicts:
#	src/integrationTest/java/com/mongodb/hibernate/query/select/AbstractSelectionQueryIntegrationTests.java
#	src/integrationTest/java/com/mongodb/hibernate/query/select/Book.java
#	src/main/java/com/mongodb/hibernate/internal/translate/AbstractMqlTranslator.java
…rtingSelectQueryIntegrationTests.java

Co-authored-by: Viacheslav Babanin <[email protected]>
…tMqlTranslator.java

Co-authored-by: Viacheslav Babanin <[email protected]>
…-68-new

# Conflicts:
#	src/integrationTest/java/com/mongodb/hibernate/query/select/SortingSelectQueryIntegrationTests.java
var stages = new ArrayList<AstStage>(2);
final Expression skipExpression;
final Expression limitExpression;
if (queryPart.isRoot() && limit != null && !limit.isEmpty()) {

@NathanQingyangXu NathanQingyangXu May 8, 2025

QueryOptions's Limit only applies to the top-level query (not to a subquery, whose isRoot() returns false). However, given that subqueries are not supported for now, it is hard to cover this logic in tests.

.bind(
statement,
parameterValueAccess.apply(
executionContext.getQueryOptions().getLimit()),

This is the magic behind the Limit cache. First a JdbcParameter is created; then a special ParameterBinder is created here that uses the executionContext, which shares the same QueryOptions, so we can read the current Limit value at runtime to populate the static JDBC parameter placeholder.
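
A stripped-down sketch of that late-binding idea, outside of Hibernate: the translated plan keeps a "?" placeholder and a binder, and the binder reads the value from whatever execution context is passed in at bind time. All types here (Limit, ExecutionContext, Binder, Plan) are hypothetical stand-ins, and the string substitution only imitates what parameter binding does on a real statement.

```java
import java.util.List;
import java.util.regex.Matcher;

// Hypothetical model: one cached plan, bound against different execution contexts.
final class LateBoundLimitSketch {

    record Limit(Integer firstRow, Integer maxRows) {}

    /** Stand-in for the execution context that exposes the current query options. */
    record ExecutionContext(Limit limit) {}

    /** A binder does not hold a value; it looks the value up at bind time. */
    interface Binder {
        Object valueFor(ExecutionContext context);
    }

    /** Cached translation: a template with "?" placeholders plus one binder per placeholder. */
    record Plan(String template, List<Binder> binders) {
        String bind(ExecutionContext context) {
            String bound = template;
            for (Binder binder : binders) {
                String value = String.valueOf(binder.valueFor(context));
                bound = bound.replaceFirst("\\?", Matcher.quoteReplacement(value));
            }
            return bound;
        }
    }

    static Plan skipPlan() {
        Binder firstRowBinder = context -> context.limit().firstRow();
        return new Plan("[{\"$skip\": ?}]", List.of(firstRowBinder));
    }

    public static void main(String[] args) {
        Plan plan = skipPlan(); // translated once
        System.out.println(plan.bind(new ExecutionContext(new Limit(5, null))));
        System.out.println(plan.bind(new ExecutionContext(new Limit(10, null))));
    }
}
```

The same Plan instance serves both executions; only the context handed to bind differs, which is why the translation itself can stay in the cache.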

getAffectedTableNames(),
0,
Integer.MAX_VALUE,
Collections.emptyMap(),

The first two values are dummy (default) values, meaning rowsToSkip and maxRows are not set. Why?
The two values are only meant for those rare dialects that don't support specifying limits at the (native) SQL level, where pure JDBC API usage (ResultSet#absolute(int) and PreparedStatement#setMaxRows(int)) is the only viable solution.

To investigate further, let us analyze the only consumer of the JdbcSelect object returned from the translator method above, i.e. DeferredResultSetAccess (https://github.com/hibernate/hibernate-orm/blob/main/hibernate-core/src/main/java/org/hibernate/sql/results/jdbc/internal/DeferredResultSetAccess.java).

In particular, the following method is the main logic driving everything:

private void executeQuery() {
		final LogicalConnectionImplementor logicalConnection = getPersistenceContext().getJdbcCoordinator().getLogicalConnection();

		final SharedSessionContractImplementor session = executionContext.getSession();
		try {
			LOG.tracef( "Executing query to retrieve ResultSet : %s", finalSql );
			// prepare the query
			preparedStatement = statementCreator.createStatement( executionContext, finalSql );

			bindParameters( preparedStatement );

			final SessionEventListenerManager eventListenerManager = session
					.getEventListenerManager();

			long executeStartNanos = 0;
			if ( sqlStatementLogger.getLogSlowQuery() > 0 ) {
				executeStartNanos = System.nanoTime();
			}
			final EventManager eventManager = session.getEventManager();
			final HibernateMonitoringEvent jdbcPreparedStatementExecutionEvent = eventManager.beginJdbcPreparedStatementExecutionEvent();
			try {
				eventListenerManager.jdbcExecuteStatementStart();
				resultSet = wrapResultSet( preparedStatement.executeQuery() );
			}
			finally {
				eventManager.completeJdbcPreparedStatementExecutionEvent( jdbcPreparedStatementExecutionEvent, finalSql );
				eventListenerManager.jdbcExecuteStatementEnd();
				sqlStatementLogger.logSlowQuery( finalSql, executeStartNanos, context() );
			}

			skipRows( resultSet );
			logicalConnection.getResourceRegistry().register( resultSet, preparedStatement );
		}
		catch (SQLException e) {
			try {
				release();
			}
			catch (RuntimeException e2) {
				e.addSuppressed( e2 );
			}
			throw session.getJdbcServices().getSqlExceptionHelper().convert(
					e,
					"JDBC exception executing SQL [" + finalSql + "]"
			);
		}
	}

Notice the skipRows call above, right after the ResultSet has been returned. Below is its logic:

protected void skipRows(ResultSet resultSet) throws SQLException {
		// For dialects that don't support an offset clause
		final int rowsToSkip;
		if ( !jdbcSelect.usesLimitParameters() && limit != null && limit.getFirstRow() != null && !limitHandler.supportsLimitOffset() ) {
			rowsToSkip = limit.getFirstRow();
		}
		else {
			rowsToSkip = jdbcSelect.getRowsToSkip();
		}
		if ( rowsToSkip != 0 ) {
			try {
				resultSet.absolute( rowsToSkip );
			}
			catch (SQLException ex) {
				// This could happen with the jTDS driver which throws an exception on non-scrollable result sets
				// To avoid throwing a wrong exception in case this was some other error, check if we can advance to next
				try {
					resultSet.next();
				}
				catch (SQLException ex2) {
					throw ex;
				}
				// Traverse to the actual row
				for (int i = 1; i < rowsToSkip && resultSet.next(); i++) {}
			}
		}
	}

The jdbcSelect corresponds to what we returned from SelectMqlTranslator, where we set its rowsToSkip to 0 above.
Note that !jdbcSelect.usesLimitParameters() && limit != null && limit.getFirstRow() != null is always false: when limit is not empty, we have already created the limit parameters, so the first operand is false; when limit is empty, the remaining operands are false. rowsToSkip is therefore evaluated to zero, the value we returned, and the ResultSet#absolute(int) call is skipped altogether.

How about the second value, maxRows, which we set to Integer.MAX_VALUE above?

Below is the relevant method in the same class:

protected void bindParameters(PreparedStatement preparedStatement) throws SQLException {
		final QueryOptions queryOptions = executionContext.getQueryOptions();

		// set options
		if ( queryOptions != null ) {
			if ( queryOptions.getFetchSize() != null ) {
				preparedStatement.setFetchSize( queryOptions.getFetchSize() );
			}
			if ( queryOptions.getTimeout() != null ) {
				preparedStatement.setQueryTimeout( queryOptions.getTimeout() );
			}
		}

		// bind parameters
		// 		todo : validate that all query parameters were bound?
		int paramBindingPosition = 1;
		paramBindingPosition += limitHandler.bindLimitParametersAtStartOfQuery( limit, preparedStatement, paramBindingPosition );
		for ( JdbcParameterBinder parameterBinder : jdbcSelect.getParameterBinders() ) {
			parameterBinder.bindParameterValue(
					preparedStatement,
					paramBindingPosition++,
					jdbcParameterBindings,
					executionContext
			);
		}

		paramBindingPosition += limitHandler.bindLimitParametersAtEndOfQuery( limit, preparedStatement, paramBindingPosition );

		if ( !jdbcSelect.usesLimitParameters() && limit != null && limit.getMaxRows() != null ) {
			limitHandler.setMaxRows( limit, preparedStatement );
		}
		else {
			final int maxRows = jdbcSelect.getMaxRows();
			if ( maxRows != Integer.MAX_VALUE ) {
				preparedStatement.setMaxRows( maxRows );
			}
		}
	}

Again, !jdbcSelect.usesLimitParameters() && limit != null && limit.getMaxRows() != null is always false, so we end up in the following branch:

final int maxRows = jdbcSelect.getMaxRows();
			if ( maxRows != Integer.MAX_VALUE ) {
				preparedStatement.setMaxRows( maxRows );
			}

Again, given that we provided the default value Integer.MAX_VALUE, we end up skipping the invocation of PreparedStatement#setMaxRows(int).

Overall, given that we don't need to fall back on the JDBC API's special methods as a last resort, we set both fields above to their default values to skip them.
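
The two guards can be replayed in isolation. The sketch below copies only the boolean structure of the skipRows and bindParameters snippets quoted above into two hypothetical static methods (limit != null is folded into the nullable Integer arguments), so it is a model of the decision, not the Hibernate code itself.

```java
// Model of the two JDBC fallback decisions in DeferredResultSetAccess, reduced to plain values.
final class JdbcFallbackSketch {

    /** Mirrors skipRows: how many rows would be skipped through the JDBC API. */
    static int rowsToSkip(boolean usesLimitParameters, Integer limitFirstRow,
                          boolean supportsLimitOffset, int jdbcSelectRowsToSkip) {
        if (!usesLimitParameters && limitFirstRow != null && !supportsLimitOffset) {
            return limitFirstRow;
        }
        return jdbcSelectRowsToSkip;
    }

    /** Mirrors bindParameters: whether max rows would be applied through the JDBC API. */
    static boolean appliesMaxRows(boolean usesLimitParameters, Integer limitMaxRows,
                                  int jdbcSelectMaxRows) {
        if (!usesLimitParameters && limitMaxRows != null) {
            return true; // limitHandler.setMaxRows(...) branch
        }
        return jdbcSelectMaxRows != Integer.MAX_VALUE; // preparedStatement.setMaxRows(...) branch
    }

    public static void main(String[] args) {
        // Limit present, so limit parameters were created; defaults 0 and Integer.MAX_VALUE returned.
        System.out.println(rowsToSkip(true, 10, true, 0));
        System.out.println(appliesMaxRows(true, 50, Integer.MAX_VALUE));
        // No Limit at all: both guards fall through to the defaults as well.
        System.out.println(rowsToSkip(false, null, true, 0));
        System.out.println(appliesMaxRows(false, null, Integer.MAX_VALUE));
    }
}
```

With the defaults 0 and Integer.MAX_VALUE, neither ResultSet#absolute(int) nor PreparedStatement#setMaxRows(int) is reached in either scenario of this model.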

The third value above is appliedParameters, which impacts the query plan cache logic (if some parameter has been applied, i.e. its value has been inlined into the translated SQL, the translation can't be reused for a future query unless the applied value is identical). Given that we are not applying any parameters, we simply return an empty map above.

@NathanQingyangXu NathanQingyangXu changed the base branch from HIBERNATE-68-new to main May 23, 2025 17:16

@stIncMale stIncMale left a comment

The last reviewed commit is c677897.

@stIncMale stIncMale left a comment

The last reviewed commit is acd7e1d.

stIncMale previously approved these changes Jun 23, 2025

@stIncMale stIncMale left a comment

The last reviewed commit is 2aca1d7.

vbabanin previously approved these changes Jul 2, 2025

@vbabanin vbabanin left a comment

LGTM!

…ced in LimitOffsetFetchClauseIntegrationTests
@NathanQingyangXu NathanQingyangXu dismissed stale reviews from vbabanin and stIncMale via 2a7cb9e July 2, 2025 20:09
@NathanQingyangXu NathanQingyangXu requested a review from vbabanin July 2, 2025 20:36
@NathanQingyangXu NathanQingyangXu merged commit 3bc4692 into mongodb:main Jul 2, 2025
6 checks passed
@NathanQingyangXu NathanQingyangXu deleted the HIBERNATE-70-new branch July 2, 2025 21:31