fix(\OC\DB\Adapter): Ensure insertIfNotExist is atomic #54041
Signed-off-by: provokateurin <[email protected]>
@@ -103,7 +103,10 @@ public function insertIfNotExist($table, $input, ?array $compare = null) {
 		$query .= ' HAVING COUNT(*) = 0';

 		try {
-			return $this->conn->executeUpdate($query, $inserts);
+			$this->conn->beginTransaction();
+			$rows = $this->conn->executeUpdate($query, $inserts);
Would the transaction stay open if this line throws? I think an explicit rollback is needed, at least for PostgreSQL.
Could alternatively use the atomic() helper from OCP\AppFramework\Db\TTransactional, perhaps?
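A minimal sketch of the rollback concern raised in this review thread, written against plain PDO with an in-memory SQLite database (an assumption, chosen so the example runs standalone) rather than Nextcloud's connection class. `runAtomic` is a hypothetical stand-in for a helper in the spirit of `atomic()`, not its actual implementation: it commits on success and rolls back when the callback throws, so the transaction is never left open.

```php
<?php
/**
 * Hypothetical helper: run $fn inside a transaction and roll back on any error.
 */
function runAtomic(PDO $db, callable $fn) {
    $db->beginTransaction();
    try {
        $result = $fn($db);
        $db->commit();
        return $result;
    } catch (Throwable $e) {
        // Without this rollback the transaction would stay open after a failure.
        $db->rollBack();
        throw $e;
    }
}

$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
// Simplified, hypothetical table; not the real *PREFIX*mounts schema.
$db->exec('CREATE TABLE mounts (id INTEGER PRIMARY KEY, mount_point TEXT NOT NULL)');

$failed = false;
try {
    runAtomic($db, function (PDO $db) {
        $db->exec("INSERT INTO mounts (mount_point) VALUES ('/mount/point/1')");
        // Simulated failure after the insert has run.
        throw new RuntimeException('simulated failure');
    });
} catch (RuntimeException $e) {
    $failed = true;
}

// The connection is no longer inside a transaction and the insert was undone.
$openAfterFailure = $db->inTransaction();
$rowCount = (int)$db->query('SELECT COUNT(*) FROM mounts')->fetchColumn();
```

Note that this only addresses the dangling transaction; it does not by itself make the conditional insert safe against concurrent requests.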
I can't prove the atomicity yet. Here's the script I've had executed four times in parallel so there is a race for the time-based mount point insert:

<?php
require_once './lib/base.php';
$db = \OCP\Server::get(\OCP\IDBConnection::class);
for ($i = 0; $i <= 10000000; $i++) {
	$id = time();
	$inserted = $db->insertIfNotExist('*PREFIX*mounts', [
		'storage_id' => 1,
		'root_id' => 2,
		'user_id' => 'dummy',
		'mount_point' => "/mount/point/$id",
		'mount_id' => 3,
		'mount_provider_class' => '\\OCA\\Dummy\\MountProvider',
	], ['root_id', 'user_id', 'mount_point']);
	if ($inserted === 0) {
		echo "$i no insert\n";
	} else {
		echo "$i insert\n";
	}
	usleep(2);
}

there will still be duplicates for
@provokateurin from my understanding, the transaction in this case won't do much as we are always inserting one row if the condition evaluates to true. Between two different transactions in two different requests, there still can be the case where both queries count the entries, both return 0 and both insert. Actually it may even make things worse if the transaction adds a bit more delay so that there is more time for other parallel queries to see that there are 0 rows of the selected type 🤔
You're probably right. We use the READ COMMITTED transaction isolation level. Two transactions will both read that the row does not exist and continue. These operations are not serialized.
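The interleaving described above can be sketched deterministically. This is a simulation under stated assumptions, not a reproduction of the real race: it uses a single in-memory SQLite connection via PDO, a simplified hypothetical table, and splits the conditional insert into an explicit "check" step and "insert" step so that two pretend requests (A and B) can both run their check before either inserts.

```php
<?php
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
// Simplified, hypothetical table with no unique constraint on the compared columns.
$db->exec('CREATE TABLE mounts (id INTEGER PRIMARY KEY, root_id INTEGER, user_id TEXT, mount_point TEXT)');

// Values for root_id, user_id, mount_point.
$row = [2, 'dummy', '/mount/point/1'];

// Step 1 of the conditional insert: does a matching row already exist?
function rowExists(PDO $db, array $row): bool {
    $stmt = $db->prepare(
        'SELECT COUNT(*) FROM mounts WHERE root_id = ? AND user_id = ? AND mount_point = ?'
    );
    $stmt->execute($row);
    return (int)$stmt->fetchColumn() > 0;
}

// Step 2 of the conditional insert: write the row.
function insertRow(PDO $db, array $row): void {
    $stmt = $db->prepare('INSERT INTO mounts (root_id, user_id, mount_point) VALUES (?, ?, ?)');
    $stmt->execute($row);
}

// Interleaving: both requests check before either one inserts.
$existsForA = rowExists($db, $row);
$existsForB = rowExists($db, $row);

if (!$existsForA) {
    insertRow($db, $row);
}
if (!$existsForB) {
    insertRow($db, $row);
}

// Both checks saw zero rows, so both inserts happened.
$duplicates = (int)$db->query('SELECT COUNT(*) FROM mounts')->fetchColumn();
```

Wrapping each request's two steps in its own transaction would not change this outcome under READ COMMITTED, since neither check can see the other's uncommitted insert.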
I'm afraid the only solution to this problem is a unique constraint or the use of a lock, if editing the table structure is not an option. Another possibility would be to have a background job to check for this case and clean up the table periodically to remove the duplicated entries: the good thing in this case is that here it would be possible to use locks in the

The one problem with the solution above is that... it's bound to be a case that will keep happening every time we cannot use a unique constraint on a table.
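A rough sketch of the two options mentioned in this comment, a cleanup of existing duplicates and a unique constraint, again using PDO with in-memory SQLite and a simplified hypothetical table rather than the real schema or Nextcloud's API. The `insertIfAbsent` function and the index name are illustrative assumptions. The cleanup query keeps the lowest `id` per group; this form works in SQLite, but MySQL rejects a `DELETE` whose subquery selects from the same table, so it would need rewriting there.

```php
<?php
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE mounts (id INTEGER PRIMARY KEY, root_id INTEGER, user_id TEXT, mount_point TEXT)');

// Seed the table with a duplicate, as left behind by the race.
$seed = $db->prepare('INSERT INTO mounts (root_id, user_id, mount_point) VALUES (?, ?, ?)');
$seed->execute([2, 'dummy', '/mount/point/1']);
$seed->execute([2, 'dummy', '/mount/point/1']);

// Cleanup: keep the row with the lowest id in each group of duplicates.
$removed = $db->exec(
    'DELETE FROM mounts WHERE id NOT IN (
        SELECT MIN(id) FROM mounts GROUP BY root_id, user_id, mount_point
    )'
);

// With duplicates gone, the database itself can enforce uniqueness.
$db->exec('CREATE UNIQUE INDEX mounts_unique ON mounts (root_id, user_id, mount_point)');

// Hypothetical helper: returns 1 if a row was inserted, 0 if it already existed.
function insertIfAbsent(PDO $db, array $values): int {
    $stmt = $db->prepare('INSERT INTO mounts (root_id, user_id, mount_point) VALUES (?, ?, ?)');
    try {
        $stmt->execute($values);
        return 1;
    } catch (PDOException $e) {
        // SQLSTATE class 23 means an integrity constraint was violated.
        if (strncmp((string)$e->getCode(), '23', 2) === 0) {
            return 0;
        }
        throw $e;
    }
}

$first = insertIfAbsent($db, [2, 'dummy', '/mount/point/2']);
$second = insertIfAbsent($db, [2, 'dummy', '/mount/point/2']);
$total = (int)$db->query('SELECT COUNT(*) FROM mounts')->fetchColumn();
```

Here the decision is made by the database at insert time, so two concurrent requests cannot both succeed regardless of isolation level.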
Summary
INSERT ... SELECT ... queries are not guaranteed to be atomic, at least in MySQL and PostgreSQL. I didn't check the other databases, since one problematic one is already enough.
This should fix the root cause of #54014.