Fix #13398 (Clang import: use json output instead of debug output) #7646

danmar · 2025-07-06T07:59:48Z

No description provided.

danmar · 2025-07-07T16:04:43Z

@firewave any opinion?
I've always considered clang import to be experimental; it wasn't robust. But with the JSON input it should be possible to make it work well.
This PR deactivates support for C++ clang import because I think it is far from working. But if we work on it, I hope we can reactivate it later.

firewave · 2025-07-07T20:44:24Z

> @firewave any opinion?
> I've always considered clang import to be experimental; it wasn't robust. But with the JSON input it should be possible to make it work well.

I am all for more parsable inputs. But as I pasted in the ticket the source documentation states that this format is not stable and the data might be incomplete. No such note is present with the old test output. So it would be good to check the official documentation and also get some feedback from upstream (you can just file a question as a ticket).
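To illustrate what consuming such a format defensively could look like, here is a minimal Python sketch. It assumes a dump produced with something like `clang -Xclang -ast-dump=json -fsyntax-only`, but the sample below is hand-written for this example and is not real Clang output. Since the data might be incomplete, every lookup tolerates a missing key; the `walk` helper is hypothetical.

```python
import json

# Hand-written sample shaped like a JSON AST dump; NOT real Clang output.
SAMPLE = """
{
  "kind": "TranslationUnitDecl",
  "inner": [
    {"kind": "FunctionDecl", "name": "f", "loc": {"line": 1, "col": 5},
     "inner": [{"kind": "CompoundStmt"}]},
    {"kind": "VarDecl", "name": "x"}
  ]
}
"""


def walk(node, depth=0):
    """Yield (depth, kind, name, line) for each node, tolerating missing keys."""
    loc = node.get("loc") or {}
    yield depth, node.get("kind", "<unknown>"), node.get("name"), loc.get("line")
    for child in node.get("inner", []):
        yield from walk(child, depth + 1)


nodes = list(walk(json.loads(SAMPLE)))
for depth, kind, name, line in nodes:
    print("  " * depth + kind, name, line)
```

A node without location data (like the `VarDecl` above) is reported with `None` instead of raising, which is the kind of gap an importer would have to decide how to handle.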

> This PR deactivates support for C++ clang import because I think it is far from working. But if we work on it, I hope we can reactivate it later.

I was not aware of that. It seemed to work rather well IIRC.

Running `TEST_CPPCHECK_INJECT_CLANG=clang python3 -m pytest test/cli`, the failures show several issues with language-related stuff. I will have a look at that. But IIRC most failures were about missing system includes and inconsistencies in the output.

firewave · 2025-07-07T20:55:00Z

> the failures show several issues with language-related stuff. I will have a look at that.

Actually it is because the location is set after tokens have already been parsed:

cppcheck: lib/tokenlist.cpp:105: int TokenList::appendFileIfNew(std::string): Assertion `mTokensFrontBack->front == nullptr' failed.

That should be fixed before the code is switched over to the other format.

I still need to understand how the information is not applied to the existing tokens.
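The ordering constraint behind that assertion can be shown with a toy model in Python. This is not Cppcheck code: `ToyTokenList` and its methods are hypothetical stand-ins, and the only thing taken from the message above is the condition that no tokens may exist yet when a file is registered.

```python
class ToyTokenList:
    """Toy stand-in for a token list; NOT Cppcheck's TokenList."""

    def __init__(self):
        self.files = []
        self.tokens = []

    def append_file_if_new(self, path):
        # Mirrors the quoted assertion: the token list must still be empty.
        if self.tokens:
            raise AssertionError("file registered after tokens were added")
        if path not in self.files:
            self.files.append(path)
        return self.files.index(path)

    def add_token(self, text, file_index):
        self.tokens.append((text, file_index))


# Expected order: register the file first, then create tokens.
ok = ToyTokenList()
idx = ok.append_file_if_new("a.c")
ok.add_token("int", idx)

# Problematic order: a token exists before the location is known.
bad = ToyTokenList()
bad.add_token("int", 0)
try:
    bad.append_file_if_new("a.c")
    failed = False
except AssertionError:
    failed = True
```

In the toy model the second sequence trips the check, which is the same shape of failure as the one reported by the test run.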

I also think we should defer merging this until after the release.

firewave · 2025-07-07T21:08:08Z

A problem with the existing unit tests is that they are (were?) lacking location data which would have existed in the actual data.

They should rather be generated from actual source code we dumped to AST. That would also solve possible issues with detecting changes in the Clang AST with newer versions. But I reckon getting such code for the existing tests is impossible...

danmar · 2025-07-08T07:29:16Z

> A problem with the existing unit tests is that they are (were?) lacking location data which would have existed in the actual data.

I intentionally skipped location data in testclangimport, to make the tests more manageable.

But if that causes this problem ("Actually it is because the location is set after tokens have already been parsed"), then maybe we should tweak that test so it does not fail for testclangimport.

> They should rather be generated from actual source code we dumped to AST. That would also solve possible issues with detecting changes in the Clang AST with newer versions. But I reckon getting such code for the existing tests is impossible...

We use actual code in clang_import-test.py. Those tests can detect such issues; however, they are pretty basic.

It should also be pretty easy to re-generate the AST dumps in testclangimport.cpp in the future. The original C/C++ code for each unit test is either shown in a comment in the test or is pretty clear from the ASSERT_EQUALS. So I think I can re-generate the AST dumps manually in an hour or so.
And I think it would be good to reuse these code examples somehow to detect changes in the Clang AST with newer versions, but I don't suggest we look into that in this PR.

danmar · 2025-07-08T07:40:34Z

> I was not aware of that. It seemed to work rather well IIRC.

This PR breaks the handling. In the short term I think it's better to deactivate it. And then we can look into a fix later. It's not unfixable.

danmar · 2025-07-08T07:45:52Z

> I also think we should defer merging this until after the release.

sure 👍

firewave · 2025-07-08T09:09:22Z

> I intentionally skipped location data in testclangimport, to make the tests more manageable.

The problem is that it tests data which we will never encounter and requires us to have logic which is test-only (slightly similar to having different preprocessor logic in the unit tests than in actual application code). Hence the tickets about handling "invalid" locations as well as the current assertions.

> It should also be pretty easy to re-generate the AST dumps in testclangimport.cpp in the future. The original C/C++ code for each unit test is either shown in a comment in the test or is pretty clear from the ASSERT_EQUALS. So I think I can re-generate the AST dumps manually in an hour or so.

That would be awesome.

> And I think it would be good to reuse these code examples somehow to detect changes in the Clang AST with newer versions, but I don't suggest we look into that in this PR.

Yeah, but I think this should be done before this PR is applied - I assume I want to do it the other way around.

danmar · 2025-07-08T10:40:47Z

> Yeah, but I think this should be done before this PR is applied - I assume I want to do it the other way around.

What do you suggest such a test would do?

1. Extract C/C++ code from each test case in testclangimport.
2. Run cppcheck --clang.
3. .. what exact verification do you expect here ..

firewave · 2025-07-09T12:33:12Z

> > Yeah, but I think this should be done before this PR is applied - I assume I want to do it the other way around.
>
> What do you suggest such a test would do?
>
> 1. Extract C/C++ code from each test case in testclangimport.
> 2. Run cppcheck --clang.
> 3. .. what exact verification do you expect here ..

Some (most? all?) of the tests could probably just be moved to a Python test which simply takes a source input and an expected Cppcheck debug output.

In cases where we need to check more internal stuff we should go with unit tests.
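As a rough sketch of such a Python test: the helper below takes a source input and an expected debug output and compares them after light normalization. The helper names (`run_cppcheck_debug`, `normalize`, `check_case`), the temporary file name and the normalization rules are assumptions made for this example, not existing test infrastructure. The runner is injectable so the comparison logic can be exercised without Cppcheck or Clang installed.

```python
import subprocess
import tempfile
from pathlib import Path


def run_cppcheck_debug(source, use_clang, exe="cppcheck"):
    """Write `source` to a temp file, run cppcheck --debug on it, return stdout."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "test.c"
        path.write_text(source)
        args = [exe, "--debug", "--quiet", str(path)]
        if use_clang:
            args.insert(1, "--clang")
        result = subprocess.run(args, capture_output=True, text=True, check=False)
        return result.stdout


def normalize(text):
    """Drop blank lines and trailing whitespace so cosmetic noise is ignored."""
    return [line.rstrip() for line in text.splitlines() if line.strip()]


def check_case(source, expected, runner=run_cppcheck_debug):
    """Return True if the debug output for `source` matches `expected`."""
    actual = runner(source, True)
    return normalize(actual) == normalize(expected)
```

A real test would likely also need to strip the temporary path from the output before comparing; that is left out here because the exact debug output layout is not shown in this thread.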

firewave · 2025-07-09T12:41:00Z

> .. what exact verification do you expect here ..

That the behavior is the same with and without --clang is implicitly verified by running the tests with TEST_CPPCHECK_INJECT_CLANG=clang.

There is also the idea to make sure that the AST/processed output stays the same over a bigger input (see https://trac.cppcheck.net/ticket/12358).

danmar · 2025-07-10T14:45:22Z

> Some (most? all?) of the tests could probably just be moved to a Python test which simply takes a source input and an expected Cppcheck debug output.

When you say "move": let's keep the unit tests that do not execute clang. I think the advantage of the unit tests is that we test that the "original" Clang AST generates the expected debug output, so they detect regressions.

It does also make sense, as an additional test, to generate the AST with a newer clang and check that the debug output matches. Failures there would not be regressions but just missing forward compatibility.
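The distinction between the two kinds of failure could be captured like this. It is only a sketch: `classify` and its arguments are hypothetical names, and the three strings stand for the expected debug output, the output produced from the stored ("original") AST, and the output produced from an AST freshly generated with a newer clang.

```python
def classify(expected, out_from_stored_ast, out_from_fresh_ast):
    """Tell a regression apart from missing forward compatibility."""
    if out_from_stored_ast != expected:
        # The importer no longer handles an AST it used to handle.
        return "regression"
    if out_from_fresh_ast != expected:
        # The importer is unchanged, but the newer clang emits a different AST.
        return "missing forward compatibility"
    return "ok"


print(classify("1: int x ;", "1: int x ;", "1: int x ;"))
print(classify("1: int x ;", "1: int y ;", "1: int x ;"))
print(classify("1: int x ;", "1: int x ;", ""))
```

Checking the stored AST first means a genuine regression is never misreported as a compatibility gap.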

danmar force-pushed the fix-13398-2 branch 6 times, most recently from bad7991 to a434fdc · July 7, 2025 05:47

Fix #13398 (Clang import: use json output instead of debug output)

a863251

danmar force-pushed the fix-13398-2 branch from 2d28d8d to a863251 · July 7, 2025 15:35

releasenote

6c5303d

danmar added the merge-after-next-release label (Wait with merging this PR until after the next Release) · Jul 8, 2025
