@janlindstrom
Contributor

  • The Jira issue number for this PR is: MDEV-36923

Description

Added warning messages for applier FK failures. Warnings are printed by default and can be disabled using
set global wsrep_mode=APPLIER_DISABLE_FK_WARNINGS;

Ported to MariaDB [email protected]

Release Notes

TODO: What should the release notes say about this change?
Include any changed system variables, status variables or behaviour. Optionally list any https://mariadb.com/kb/ pages that need changing.

How can this PR be tested?

TODO: modify the automated test suite to verify that the PR causes MariaDB to behave as intended.
Consult the documentation on "Writing good test cases".

If the changes are not amenable to automated testing, please explain why not and carefully describe how to test manually.

Basing the PR against the correct MariaDB version

  • [ ] This is a new feature or a refactoring, and the PR is based against the main branch.
  • [x] This is a bug fix, and the PR is based against the earliest maintained branch in which the bug can be reproduced.

PR quality check

  • [x] I checked the CODING_STANDARDS.md file and my PR conforms to this where appropriate.
  • [x] For any trivial modifications to the PR, I am ok with the reviewer making the changes themselves.

@janlindstrom janlindstrom self-assigned this Oct 6, 2025
@janlindstrom janlindstrom requested a review from dr-m October 6, 2025 12:06
@svoj svoj added the Codership Codership Galera label Oct 6, 2025
@janlindstrom janlindstrom force-pushed the 10.11-MDEV-36923 branch 2 times, most recently from e80adde to 04491fc Compare October 9, 2025 07:28
@janlindstrom janlindstrom force-pushed the 10.11-MDEV-36923 branch 2 times, most recently from 3595516 to ada1189 Compare October 15, 2025 14:27
@janlindstrom janlindstrom force-pushed the 10.11-MDEV-36923 branch 5 times, most recently from 22e5686 to a40d8be Compare October 20, 2025 12:36
Comment on lines 23 to 24
SET GLOBAL wsrep_mode=128;
ERROR 42000: Variable 'wsrep_mode' can't be set to the value of '128'
SET GLOBAL wsrep_mode=65536;
ERROR 42000: Variable 'wsrep_mode' can't be set to the value of '65536'
Contributor

Is this a tight limit? As far as I understand, the Sys_var_set Sys_wsrep_mode was extended with only one member. Hence I’d expect 256 to be a disallowed value.
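
The arithmetic behind this expectation can be sketched as follows. This is only an illustration, assuming that a set-type variable accepts exactly the bit combinations of its members; the helper names are hypothetical and are not part of the server code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: for a set-type variable with member_count flags,
// every combination of flags is a value in [0, 2^member_count - 1], so
// 2^member_count is the smallest value that must be rejected.
inline std::uint64_t first_invalid_set_value(unsigned member_count)
{
  assert(member_count < 64);  // keep the shift well defined
  return std::uint64_t{1} << member_count;
}

// Hypothetical helper: true if value only uses bits of existing members.
inline bool is_valid_set_value(std::uint64_t value, unsigned member_count)
{
  return value < first_invalid_set_value(member_count);
}
```

If the set has 8 members after this change (an inference from the expectation above, not a number stated in this thread), then first_invalid_set_value(8) is 256, so a test that only checks 65536 would not be a tight limit.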

Contributor Author

Fixed

Comment on lines 51 to 54
--let $assert_count = 1
--let $assert_file = $MYSQLTEST_VARDIR/log/mysqld.2.err
--let $assert_text = Foreign key warning in table
--let $assert_select = Foreign key warning in table
--source include/assert_grep.inc
--let $assert_text = adding an index entry to a child table failed
--let $assert_select = adding an index entry to a child table failed
--source include/assert_grep.inc
Contributor

The table name and the constraint name are being omitted here, although a comment right before the statement INSERT INTO grandchild claims to know them.

Contributor Author

Added full text.

Comment on lines 750 to 755
#ifndef UNIV_DEBUG
  /* In release builds do not output lock wait warnings to avoid
  flooding error log. */
  if (err == DB_LOCK_WAIT)
    return;
#endif
Contributor

Do we really need this even in debug instrumented builds? Would a Galera specific DBUG_EXECUTE_IF in row_mysql_handle_errors() serve a similar purpose?

Contributor Author

We really do not want to log lock waits in release builds; they are not errors, so this change is intended. The case where (err != DB_SUCCESS && err != DB_LOCK_WAIT) is the only interesting one, but I do not know of a test case that produces it.
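
The policy described in this reply can be sketched as a small predicate. This is a simplified illustration, not the code in the PR: the enum and the function name are hypothetical stand-ins for the InnoDB error codes and for the check at the top of the logging function.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for the relevant InnoDB error codes.
enum class fk_err { success, lock_wait, other_failure };

// Returns true if an applier FK failure should be written to the error log.
// Lock waits are only logged in debug builds; any other failure is always
// logged; success is never logged.
inline bool should_log_fk_failure(fk_err err, bool debug_build)
{
  if (err == fk_err::success)
    return false;
  if (err == fk_err::lock_wait)
    return debug_build;
  return true;
}
```

Under this sketch, a release build only reports the case the reply calls interesting, that is, an error that is neither success nor a lock wait.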

Comment on lines 761 to 765
const char* end;
end= innobase_convert_name(db_table, sizeof db_table,
                           foreign->foreign_table_name,
                           strlen(foreign->foreign_table_name),
                           trx->mysql_thd);
Contributor

TAB should not be used for formatting new functions. Why are the declaration and initialization split?

Comment on lines 766 to 768
db_table[end - db_table] = '\0';

const char *fk_id= strchr(foreign->id, '/');
Contributor

end may point one char past the end of db_table. If that is the case, this assignment would constitute a buffer overflow and fail to ensure that db_table is NUL terminated. In my previous review I suggested using "%.*s" and specifying the length explicitly, which would have avoided the buffer overflow.
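
A minimal sketch of the "%.*s" approach, under stated assumptions: convert_name_sketch is a hypothetical stand-in for the real conversion routine (it fills at most the whole buffer and returns a pointer one past the last byte written, without a terminating NUL), and the buffer is deliberately tiny so that the full-buffer case is easy to reach.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical stand-in for a name-conversion routine: copies at most
// buf_size bytes of name into buf and returns a pointer one past the last
// byte written. It does NOT write a terminating NUL.
inline char *convert_name_sketch(char *buf, std::size_t buf_size,
                                 const char *name, std::size_t name_len)
{
  assert(buf && name);
  const std::size_t n = name_len < buf_size ? name_len : buf_size;
  std::memcpy(buf, name, n);
  return buf + n;
}

// Formats "table <name>" with an explicit length, so that the name buffer
// never needs to be NUL terminated.
inline std::string format_table_name(const char *name)
{
  char db_table[8];  // deliberately tiny, to exercise the full-buffer case
  const char *end = convert_name_sketch(db_table, sizeof db_table,
                                        name, std::strlen(name));
  // Here end may equal db_table + sizeof db_table, in which case
  // db_table[end - db_table] = '\0' would write out of bounds.
  char msg[64];
  std::snprintf(msg, sizeof msg, "table %.*s",
                static_cast<int>(end - db_table), db_table);
  return msg;
}
```

With this shape, a name that fills the buffer completely is printed truncated but safely, and no byte beyond the buffer is ever written.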

The formatting around the assignment operator = is inconsistent.

Contributor Author

I do not think so; see the function ut_get_name in ut0ut.cc. The db_table buffer is bigger than an actual table name can be, even with multibyte names.

Contributor Author

I will make the buffer slightly longer to avoid overflow.

Comment on lines 769 to 772
if (fk_id)
  fk_id++;
else
  fk_id= foreign->id; /* a constraint name from before MySQL 4.0.18 */
Contributor

Not implementing dict_foreign_t::sql_id() will cause a run-time merge conflict with #3960. If the tests are not revised to actually cover the constraint names that the messages are expected to output, then such conflicts will remain undetected.
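
For reference, the logic in the quoted lines could be isolated in one helper along these lines. This is a sketch only: constraint_sql_name is a hypothetical free function, not the actual dict_foreign_t::sql_id() member, whose signature is not shown in this thread. The example id "test/fk_2" is likewise an assumption based on the names used in the test output.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: returns the SQL-visible constraint name, that is,
// the part of the internal id after the first '/'. An id without '/'
// (a constraint name from before MySQL 4.0.18, per the quoted comment)
// is returned unchanged.
inline const char *constraint_sql_name(const char *id)
{
  assert(id != nullptr);
  const char *slash = std::strchr(id, '/');
  return slash ? slash + 1 : id;
}
```

Keeping this in a single place means a later change to the id format only has to be made once, and a test that checks the printed constraint name would catch a mismatch.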

Comment on lines 1265 to 1253
if (err != DB_SUCCESS) {

goto nonstandard_exit_func;
}
Contributor

Heikki Tuuri insisted that there be an extra blank line before any goto statement. I disagree with that, but then again I think that we should avoid any changes to formatting unless some code nearby is being changed. This is not the only occurrence of a blank line before a goto statement in this function.

Comment on lines 1797 to 1788
if (trx->is_wsrep()) {
  wsrep_applier_log_fk(trx, foreign,
                       "Setting shared record lock "
                       "to matching record in the "
                       "child table failed err: ",
                       err);
Contributor

In my previous review, I pointed out that we have some duplicated err: err: strings in the output. This still seems to be the case at least for this message.

Contributor Author

Fixed.

Add warning messages if applier write set execution
fails because of foreign key constraint error.
Warnings are printed by default and can be disabled using
SET GLOBAL wsrep_mode=APPLIER_DISABLE_FK_WARNINGS.
Comment on lines +27 to +35
include/assert_grep.inc [Foreign key warning in table: \`test\`\.\`grandchild\` constraint \`fk_2\` failed err: adding an index entry to a child table failed.]
connection node_1;
drop table grandchild;
drop table child;
drop table parent;
SET GLOBAL wsrep_provider_options = 'pc.ignore_sb=false';
SET GLOBAL wsrep_provider_options = 'pc.weight=1';
connection node_2;
call mtr.add_suppression("WSREP: Foreign key warning in table");
Contributor

There is a duplicated word, failed, in the message, as well as some confusion between warning and err. Why is the mtr.add_suppression() less specific? Do we really want to ignore any errors that could be issued for other table or constraint names?
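
The difference between a broad and a specific suppression pattern can be illustrated with a small sketch. It assumes that a suppression is a regular expression searched for in each error log line, and it uses std::regex (ECMAScript syntax) as a stand-in for the test framework's matcher. The second log line, with table `other` and constraint `fk_9`, is invented for the illustration.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical model of a suppression: true if the pattern is found
// anywhere in the given error log line.
inline bool suppresses(const std::string &pattern, const std::string &line)
{
  assert(!pattern.empty());
  return std::regex_search(line, std::regex(pattern));
}

// Broad pattern, as in the quoted result file.
const std::string broad_pattern = "WSREP: Foreign key warning in table";

// Specific pattern naming the expected table and constraint.
const std::string specific_pattern =
    R"(WSREP: Foreign key warning in table: `test`\.`grandchild` constraint `fk_2`)";

// The expected warning, and an invented warning for an unrelated table.
const std::string expected_line =
    "WSREP: Foreign key warning in table: `test`.`grandchild` "
    "constraint `fk_2` failed err: adding an index entry to a child "
    "table failed.";
const std::string unrelated_line =
    "WSREP: Foreign key warning in table: `test`.`other` "
    "constraint `fk_9` failed";
```

In this model the broad pattern hides both lines, while the specific pattern hides only the expected one, so an unexpected warning for another table would still fail the test.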

Comment on lines +1 to +2
--- /home/jan/work/mariadb/10.11/mysql-test/suite/galera/r/galera_FK_duplicate_client_insert.result
+++ /home/jan/work/mariadb/10.11/mysql-test/suite/galera/r/galera_FK_duplicate_client_insert.reject
Contributor

Absolute path names do not add any useful information. Some other redundant information had been removed in 88d9348.

Comment on lines +365 to +366
--let $assert_text = Foreign key warning in table: \`test\`\.\`cg\` constraint \`fk_1\` failed err: Setting shared record lock to matching record in the child table failed \(Lock wait\).
--let $assert_select = Foreign key warning in table: \`test\`\.\`cg\` constraint \`fk_1\` failed err: Setting shared record lock to matching record in the child table failed \(Lock wait\).
Contributor

There is no table named cg or constraint fk_1 in this test. The error message is misleading or incorrect.

Also, the "child table" appears to be incorrect as well. The SQL terminology would be "referencing table". But, the referencing table is cg-, and there is no need to set any shared lock on it. INSERT would typically acquire an implicit exclusive lock. I believe that a shared lock would be set on a record in the referenced table, or "parent table".

Furthermore, I think that it is a dangerous path to issue "fake" errors in a debug instrumented executable. Why do we need to be so specific? Why can’t we just report "deadlock", "lock wait timeout" or "interrupted" without specifying which lock wait was aborted by that condition? If we need to be specific, then the exact message would have to be output somewhere in row_mysql_handle_errors() and in my opinion it should not be specific to Galera replication.

Contributor Author

Table cg, as well as fk_1, was created on line 322; I will move this a little bit earlier.

I'm sorry but I do not understand your second paragraph.

Comment on lines +1667 to +1670
Starting with MariaDB 12, constraint names are prepended with the
dict_table_t::name and the invalid UTF-8 sequence 0xff. */
const char *s;
return ((s= strchr(id, '\377')) || (s= strchr(id, '/'))) ? ++s : id;
Contributor

This is not what I requested earlier. We must not create an impression that a downgrade would work if it has not been tested. The check for '\377' must not be added.

Labels

Codership Codership Galera

3 participants