-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Blacklist more SMTLIB keywords as variable/uf names and automatically escape all variable names that would be illegal #424
base: master
Are you sure you want to change the base?
Conversation
… in AbstractFormulaManager
Not allowing any variable names starting with "bv" lead to too many issues as the prefix is used frequently throughout the tests.
// + any symbol starting with "str." | ||
"str.concat", | ||
// + any symbol starting with "re." | ||
"re.opt"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
String and Regex operations are matched via "startsWith" and not via String equality. Do we need these entries here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The keywords in RESERVED
are used by VariableNamesTest
and I've included these two entries as "examples" for the tests. Otherwise, they could be removed. Come to think of it, VariableNamesTest
is already kind of slow and it might even need to test all the entries in RESERVED
. Maybe one or two examples for each "kind" of reserved word would be sufficient? Then we could just include those test cases directly in VariableNamesTest
and remove these two dummy keywords from RESERVED
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change disallows quite a few names for symbols.
Are we sure that we want to go in that direction?
What is the motivation for this PR? Parsing and writing SMTLIB is complex, and solvers need to escape all unexpeted names anyway. This PR seems to forbid such cases upfront. This might be valid, but it might be unfriendly for existing callers.
This was motivated by an issue in CPAchecker (Bitwuzla issues when using a distinct solver for interpolation) were we try to use Bitwuzla as the main solver for CPAchecker. Since interpolation is not supported in Bitwuzla, we need to translate formulas from/to MathSAT whenever CPAchecker wants to calculate interpolants. For this we need to first dump the formulas to SMTLIB and then read them back in with the "target" solver. However, this will often fail with a parse error:
The issue is that many of the programs use the C function
Since As @PhilippWendler pointed out there is already an old JavaSMT issues here, but it's never been entirely resolved. The proper fix would be to escape such names transparently, but this would require us to write our own parser and printer for SMTLIB formulas, which still seem out of reach. Blacklisting the reserved symbols for now at least makes sure that the user gets an error immediately. Once the parser/printer is ready we can then allow them again. Note that I added some tests in supported_symbols_test to see what symbol names are actually allowed by the solvers. You can run From the tests it seems that most solvers don't quote reserved words when printing formulas and then fail to read back their own output. It's even worse for Bitwuzla, which fails to parse any reserved word, even when quotes are used. This means that just fixing formula printing won't solve the issue either. |
… Uf names. This is work in progress and will eventually allow us to support arbitrary names for variables.
I've added transparent escaping/unescapig with a43cb6d. This means that all variables are now allowed, and the escaping is done automatically when needed. We still have to figure out what to do about all the tests that expect |
…on is thrown if an invalid symbol name is used in the input script
7573c1f
to
b1fa368
Compare
…es on the Windows test system
cdcb730
to
bc047e0
Compare
Hello,
this will update the list of forbidden variable/uf names to include all SMTLIB commands, such as
check-sat
orassert
, and all predefined function names, likemod
orfp.add
. While such symbol names are fine when using the API, they cause issues when printing the formulas as most solvers will fail to add the necessary quotes to the symbols. This will then causing issues inFormulaManager.translateFrom()
as the generated output is invalid and can't to be read back by the target solver to translate the formula.By forbidding such symbol names entirely we circumvent the issues. User should always use
FormulaManager.isValidName()
to check if their name is on the list, and substitute it withFormulaManager.escape()
if necessary.