fix(\OC\DB\Adapter): Ensure insertIfNotExist is atomic #54041


Closed

Conversation

provokateurin
Member

Summary

An INSERT ... SELECT ... statement is not guaranteed to be atomic, at least not in MySQL and PostgreSQL. I didn't check the other databases, since one problematic one is already enough.

This should fix the root cause of #54014.
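
For context, the guard insertIfNotExist relies on is a single INSERT ... SELECT statement with a HAVING COUNT(*) = 0 clause, roughly of this shape (a simplified sketch; the table and column names here are illustrative, not the exact generated SQL):

$query = 'INSERT INTO `oc_mounts` (`user_id`, `mount_point`)'
	. ' SELECT ?, ? FROM `oc_mounts`'
	. ' WHERE `user_id` = ? AND `mount_point` = ?'
	. ' HAVING COUNT(*) = 0';
// The SELECT yields the values to insert only when no matching row exists
// yet; the race is that two sessions can both observe zero matching rows
// before either INSERT becomes visible to the other.
$rows = $db->executeUpdate($query, ['dummy', '/mount/point/x', 'dummy', '/mount/point/x']);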

Checklist

@provokateurin provokateurin added this to the Nextcloud 32 milestone Jul 22, 2025
@provokateurin provokateurin requested a review from a team as a code owner July 22, 2025 12:29
@provokateurin provokateurin requested review from ArtificialOwl and removed request for a team July 22, 2025 12:29
@provokateurin provokateurin added the 3. to review Waiting for reviews label Jul 22, 2025
@@ -103,7 +103,10 @@ public function insertIfNotExist($table, $input, ?array $compare = null) {
 	$query .= ' HAVING COUNT(*) = 0';
 
 	try {
-		return $this->conn->executeUpdate($query, $inserts);
+		$this->conn->beginTransaction();
+		$rows = $this->conn->executeUpdate($query, $inserts);
Member

would the transaction stay open if this line throws? I think an explicit rollback is needed, at least for postgres
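
A minimal sketch of what the explicit rollback could look like, assuming the method body from the diff above (the catch/re-throw structure here is illustrative, not the final patch):

try {
	$this->conn->beginTransaction();
	$rows = $this->conn->executeUpdate($query, $inserts);
	$this->conn->commit();
	return $rows;
} catch (\Throwable $e) {
	// Without an explicit rollback the failed transaction can stay open
	// on the connection; PostgreSQL in particular rejects further
	// statements until it is rolled back.
	$this->conn->rollBack();
	throw $e;
}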

Member

could alternatively use the atomic() helper from OCP\AppFramework\Db\TTransactional perhaps?
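
A sketch of how that could look, assuming \OC\DB\Adapter can use the trait (TTransactional::atomic() wraps the callable in beginTransaction/commit and rolls back on any throwable):

use OCP\AppFramework\Db\TTransactional;

class Adapter {
	use TTransactional;

	public function insertIfNotExist($table, $input, ?array $compare = null) {
		// ... build $query and $inserts as before ...
		return $this->atomic(function () use ($query, $inserts) {
			return $this->conn->executeUpdate($query, $inserts);
		}, $this->conn);
	}
}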

@ChristophWurst
Member

I can't prove the atomicity yet. Here's a script I ran four times in parallel so there is a race on the time-based mount point insert:

<?php

require_once './lib/base.php';

$db = \OCP\Server::get(\OCP\IDBConnection::class);
for ($i = 0; $i <= 10000000; $i++) {
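	// time() has one-second granularity, so parallel processes racing
	// within the same second attempt to insert the same mount point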
	$id = time();
	$inserted = $db->insertIfNotExist('*PREFIX*mounts', [
		'storage_id' => 1,
		'root_id' => 2,
		'user_id' => 'dummy',
		'mount_point' => "/mount/point/$id",
		'mount_id' => 3,
		'mount_provider_class' => '\\OCA\\Dummy\\MountProvider',
	], ['root_id', 'user_id', 'mount_point']);

	if ($inserted === 0) {
		echo "$i no insert\n";
	} else {
		echo "$i insert\n";
	}

	usleep(2);
}

There will still be duplicates for the dummy user after a few seconds 🤔

@salmart-dev
Contributor

@provokateurin from my understanding, the transaction in this case won't do much, as we always insert one row if the condition evaluates to true. Between two different transactions in two different requests, it can still happen that both queries count the entries, both see 0 and both insert.

Actually, it may even make things worse: the transaction adds a bit more delay, giving other parallel queries more time to see that there are 0 rows of the selected type 🤔

@ChristophWurst
Member

You're probably right. We use the READ COMMITTED transaction isolation level. Two transactions will both read that the row does not exist and continue; the operation is not serialized.
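
An illustrative interleaving under READ COMMITTED (assuming both transactions start before either commits):

Tx A: BEGIN
Tx B: BEGIN
Tx A: SELECT ... HAVING COUNT(*) = 0  → sees no row, so it inserts
Tx B: SELECT ... HAVING COUNT(*) = 0  → A's insert is uncommitted, so B also sees no row and inserts
Tx A: COMMIT
Tx B: COMMIT  → duplicate rows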

@salmart-dev
Contributor

I'm afraid the only solution to this problem is a unique constraint, or the use of a lock if editing the table structure is not an option.

Another possibility would be a background job that periodically checks for this case and cleans up the duplicated entries. The good thing is that the cleanup could use locks in the SELECT ... FOR UPDATE form, so even if the script runs during the day, the entries are locked against writers while they are being fixed. The job could be made as efficient and unobtrusive as possible (e.g. processing one user at a time) to keep the number of duplicated entries to a minimum on a rolling basis.

The one problem with the solution above is that... it's bound to keep happening every time we cannot use a unique constraint on a table.
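
For reference, a minimal sketch of what such a unique constraint could look like as a Nextcloud migration step, assuming the columns from the reproduction script above. The class name and index name are hypothetical, this is not part of this PR, and whether these columns can actually carry a unique index (e.g. index length limits on mount_point) is exactly the open question:

<?php

use OCP\DB\ISchemaWrapper;
use OCP\Migration\IOutput;
use OCP\Migration\SimpleMigrationStep;

// Hypothetical migration sketch, not part of this PR.
class Version32000Date20250722000000 extends SimpleMigrationStep {
	public function changeSchema(IOutput $output, \Closure $schemaClosure, array $options): ?ISchemaWrapper {
		/** @var ISchemaWrapper $schema */
		$schema = $schemaClosure();
		$table = $schema->getTable('mounts');
		// With the unique index in place, a concurrent duplicate insert
		// fails on the constraint instead of creating a second row.
		if (!$table->hasIndex('mounts_user_root_path_uniq')) {
			$table->addUniqueIndex(['root_id', 'user_id', 'mount_point'], 'mounts_user_root_path_uniq');
		}
		return $schema;
	}
}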
